WO2021238455A1 - Data processing method and device, and computer-readable storage medium - Google Patents

Data processing method and device, and computer-readable storage medium

Info

Publication number
WO2021238455A1
WO2021238455A1 PCT/CN2021/086644 CN2021086644W WO2021238455A1 WO 2021238455 A1 WO2021238455 A1 WO 2021238455A1 CN 2021086644 W CN2021086644 W CN 2021086644W WO 2021238455 A1 WO2021238455 A1 WO 2021238455A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
abnormal
tested
sequence
segment
Prior art date
Application number
PCT/CN2021/086644
Other languages
English (en)
Chinese (zh)
Inventor
蒋勇
彭鑫
叶德忠
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2021238455A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24573 Query processing with adaptation to user needs using data annotations, e.g. user-defined metadata
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques

Definitions

  • the embodiments of the present invention relate to, but are not limited to, the field of information processing technology, and in particular, to a data processing method, device, and computer-readable storage medium.
  • embodiments of the present invention provide a data processing method, device, and computer-readable storage medium, which at least solve the above technical problems to a certain extent.
  • an embodiment of the present invention provides a data processing method, including: obtaining a target data sequence; obtaining a first abnormal data segment in the target data sequence; obtaining a first data search space in the target data sequence; acquiring, according to the first abnormal data segment, a second abnormal data segment corresponding to the first abnormal data segment in the first data search space; and marking the second abnormal data segment.
  • an embodiment of the present invention also provides a device, including: a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the computer program, the data processing method described above is implemented.
  • an embodiment of the present invention also provides a computer-readable storage medium that stores computer-executable instructions, and the computer-executable instructions are used to execute the above-mentioned data processing method.
  • FIG. 1 is a schematic diagram of a system architecture platform for executing a data processing method according to an embodiment of the present invention
  • FIG. 2 is a flowchart of a data processing method provided by an embodiment of the present invention.
  • FIG. 3 is a flowchart of a data processing method provided by another embodiment of the present invention.
  • FIG. 4 is a flowchart of a data processing method provided by another embodiment of the present invention.
  • FIG. 5 is a flowchart of a data processing method provided by another embodiment of the present invention.
  • FIG. 6 is a flowchart of a data processing method provided by another embodiment of the present invention.
  • FIG. 7 is a flowchart of a data processing method provided by another embodiment of the present invention.
  • FIG. 8 is a flowchart of a data processing method provided by another embodiment of the present invention.
  • FIG. 9 is a flowchart of a data processing method provided by another embodiment of the present invention.
  • FIG. 10 is a flowchart of a data processing method provided by another embodiment of the present invention.
  • FIG. 11 is a flowchart of a data processing method provided by another embodiment of the present invention.
  • FIG. 12 is a flowchart of a data processing method provided by another embodiment of the present invention.
  • FIG. 13 is a flowchart of a data processing method provided by another embodiment of the present invention.
  • FIG. 14 is a flowchart of a data processing method provided by another embodiment of the present invention.
  • FIG. 15 is a flowchart of a heuristic algorithm provided by an embodiment of the present invention.
  • FIG. 16 is a flowchart of a heuristic algorithm provided by another embodiment of the present invention.
  • FIG. 17 is a main flow diagram of a data processing method provided by another embodiment of the present invention.
  • most time series indicator data have certain periodic characteristics; many recurring abnormal data tend to appear at the same position in different cycles, and the shapes of these abnormal data show a certain similarity.
  • most of the abnormal data can be attributed to several types of abnormalities with similar characteristics, and the number of truly unique abnormal data is relatively small.
  • relatively similar abnormal data also appear across different time series indicator data of a similar kind. For example, if a network element has abnormally high central processing unit (CPU) utilization during a certain period of time, the same situation may also appear in the CPU utilization time series data of another network element that undertakes similar services.
  • the embodiments of the present invention provide a data processing method, device, and computer-readable storage medium.
  • the first abnormal data segment and the first data search space are obtained in the target data sequence, so that the first abnormal data segment can be used as an abnormal data segment template. The corresponding second abnormal data segment can then be obtained and marked in the first data search space according to this template, which achieves the purpose of labeling the other abnormal data segments in the target data sequence.
  • FIG. 1 is a schematic diagram of a system architecture platform for executing a data processing method provided by an embodiment of the present invention.
  • the system architecture platform includes a memory 110 and a processor 120, where the memory 110 and the processor 120 may be connected by a bus or in other ways.
  • the connection by a bus is taken as an example.
  • the memory 110 can be used to store non-transitory software programs and non-transitory computer-executable programs.
  • the memory 110 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or other non-transitory solid-state storage devices.
  • the memory 110 includes memories remotely arranged with respect to the processor 120, and these remote memories may be connected to the system architecture platform through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • system architecture platform can be applied to various network controllers or network managers, which is not specifically limited in this embodiment.
  • a network controller or network manager with the system architecture platform can be applied to various network systems, for example 3G communication network systems, LTE communication network systems, 5G communication network systems, and subsequent evolved mobile communication network systems, which is not specifically limited in this embodiment.
  • the system architecture platform shown in FIG. 1 does not constitute a limitation on the embodiment of the present invention, which may include more or fewer components than those shown in the figure, a combination of certain components, or a different arrangement of components.
  • the processor 120 can call the data processing program stored in the memory 110 to execute the data processing method.
  • FIG. 2 is a flowchart of a data processing method provided by an embodiment of the present invention.
  • the data processing method includes but is not limited to step S100, step S200, step S300, step S400, and step S500.
  • Step S100 Obtain the target data sequence.
  • the target data sequence may be time-series indicator data or other sequence data.
  • the other sequence data may be non-time-series indicator data such as business type sequence data or business quantity sequence data.
  • the target data sequence can be automatically obtained by the device with the above-mentioned system architecture platform in the network, or it can be obtained by entering into the device with the above-mentioned system architecture platform through manual operation, which is not specifically limited in this embodiment.
  • Step S200 Acquire the first abnormal data segment in the target data sequence.
  • the first abnormal data segment is a data segment with abnormal data in the target data sequence.
  • the first abnormal data segment in the target data sequence can be manually determined and selected and then entered into the device with the above-mentioned system architecture platform, so that the device can obtain the first abnormal data segment; alternatively, it can be stored in the memory of the device so that the device can obtain the first abnormal data segment from the memory.
  • the necessary basic conditions can be provided for labeling the remaining abnormal data segments in the target data sequence in the subsequent steps.
  • abnormal data often occurs at one or more consecutive time points. A time point at which abnormal data occurs is called an abnormal time point, and the data segment corresponding to a set of abnormal time points is called an abnormal data segment.
  • an abnormal data segment may last a long time (that is, it may contain many abnormal time points). Therefore, an abnormal data segment is required to have at least the following characteristics: a starting time point; an ending time point; at least 3 time points; and no overlapping time points between abnormal data segments.
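  • the segment characteristics listed above can be captured in a small data structure. The following is a minimal sketch, not the patent's implementation; the names `AbnormalSegment` and `has_overlap` and the use of integer indices for time points are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbnormalSegment:
    """Hypothetical container for one abnormal data segment."""
    start: int  # index of the starting time point (inclusive)
    end: int    # index of the ending time point (inclusive)

    def __post_init__(self):
        # The text requires at least 3 time points per segment.
        if self.end - self.start + 1 < 3:
            raise ValueError("an abnormal data segment needs at least 3 time points")

    def __len__(self):
        return self.end - self.start + 1


def has_overlap(segments):
    """Return True if any two segments share a time point."""
    ordered = sorted(segments, key=lambda s: s.start)
    return any(a.end >= b.start for a, b in zip(ordered, ordered[1:]))
```

A labeling tool could call `has_overlap` before accepting a new set of segments, rejecting the set when it returns True.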
  • Step S300 Acquire a first data search space in the target data sequence.
  • the data search space is a part of candidate abnormal data extracted from the target data sequence by a machine learning method.
  • by obtaining the data search space, most of the normal data can be filtered out, which prevents similar data segments that appear in the normal data from being misjudged as similar abnormal data and returned by the search, thereby improving the search accuracy for similar abnormal data; the search range for similar abnormal data is also narrowed, thereby improving the search efficiency.
  • the first abnormal data segment may be a data segment outside the first data search space, or may be a data segment in the first data search space, which is not specifically limited in this embodiment.
  • when the first abnormal data segment is obtained in the first data search space, its acquisition can be more accurate and effective, because a preliminary extraction of abnormal data has already been carried out when the first data search space was obtained.
  • Step S400 Acquire a second abnormal data segment corresponding to the first abnormal data segment in the first data search space according to the first abnormal data segment.
  • the first abnormal data segment can be used as an abnormal data segment template, and the data segments in the first data search space can be compared with it to find the data segments that are the same as or similar to the first abnormal data segment; such a data segment is the second abnormal data segment. By taking the first abnormal data segment as the template and obtaining the corresponding second abnormal data segment in the first data search space, the abnormal data in the target data sequence can be found. Therefore, for time series indicator data with a large amount of abnormal data but few abnormality types, this embodiment can improve the efficiency of finding abnormal data compared with the traditional method of searching manually, thereby saving human resources and time resources.
  • all the data in the first data search space can be regarded as one data segment, and the dynamic time warping (Dynamic Time Warping, DTW) algorithm can be used to calculate its similarity to the first abnormal data segment, so that the second abnormal data segment corresponding to the first abnormal data segment in the first data search space can be determined.
  • alternatively, the data in the first data search space can be divided into multiple data segments with the same length as the first abnormal data segment, and methods such as Euclidean distance or the Pearson correlation coefficient can be used to calculate the similarity between the first abnormal data segment and each of these data segments, so as to determine the second abnormal data segment corresponding to the first abnormal data segment in the first data search space.
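  • the equal-length comparison can be sketched as follows. This is an illustrative sketch only: the function names, the sliding `step` parameter, and the choice to return a single best match are assumptions, and a `step` equal to the template length gives the non-overlapping division described in the text.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length segments (0 means identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length segments."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # undefined for a constant segment; treated here as no correlation
    return cov / math.sqrt(va * vb)


def windows(search_space, length, step=1):
    """Yield (start_index, window) for windows of the template's length."""
    for start in range(0, len(search_space) - length + 1, step):
        yield start, search_space[start:start + length]


def best_match(template, search_space):
    """Return (start_index, distance) of the window closest to the template."""
    return min(
        ((s, euclidean(template, w)) for s, w in windows(search_space, len(template))),
        key=lambda item: item[1],
    )
```

Euclidean distance is sensitive to amplitude, while the Pearson coefficient compares shape only, so which one suits a given indicator depends on whether the height of the anomaly matters.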
  • Step S500 Mark the second abnormal data segment.
  • when the second abnormal data segment corresponding to the first abnormal data segment is found in the first data search space, the second abnormal data segment can be marked, which facilitates the formation of training data sets with abnormal labels that can be used in machine learning technologies such as deep learning.
  • with the above-mentioned steps S100, S200, S300, S400, and S500, the data processing method acquires the first abnormal data segment and the first data search space in the target data sequence, so that the first abnormal data segment can be used as an abnormal data segment template; the corresponding second abnormal data segment can then be obtained and marked in the first data search space according to this template, which achieves the purpose of labeling the other abnormal data segments in the target data sequence. Therefore, for time series indicator data with a large amount of abnormal data but few abnormality types, the data processing method of this embodiment can improve the efficiency of labeling abnormal data compared with traditional manual labeling of abnormal data segments, thereby saving human resources and time resources.
  • step S300 includes but is not limited to the following steps:
  • Step S310 Obtain the first abnormal characteristic value of the target data sequence
  • Step S320 Determine the first data position corresponding to the first abnormal characteristic value in the target data sequence according to the first abnormal characteristic value
  • Step S330 Acquire the first data search space according to the first data location.
  • abnormal data is data that deviates from most of the data in the data set.
  • the first abnormal characteristic value in this embodiment refers to the deviation value between the abnormal data and the normal data.
  • the first data position corresponding to the first abnormal characteristic value in the target data sequence may be determined according to the first abnormal characteristic value; that is, the first abnormal characteristic value determines where the abnormal data are located in the target data sequence. For example, when a data point deviates from most of the data by an amount greater than or equal to the first abnormal characteristic value, the position of that data point can be determined to be the position of abnormal data.
  • the first data position may include a start abnormal position, a middle abnormal position, and an end abnormal position. After these positions are determined, the first data search space can be obtained.
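  • steps S320 and S330 can be sketched as a threshold test followed by merging adjacent positions into index ranges. This is a sketch under assumptions: the function names, the `margin` padding parameter, and the representation of the search space as inclusive `(start, end)` ranges are hypothetical, not taken from the patent.

```python
def abnormal_positions(deviations, threshold):
    """Indices whose deviation is greater than or equal to the threshold."""
    return [i for i, d in enumerate(deviations) if d >= threshold]


def search_space(positions, n, margin=0):
    """Merge sorted abnormal positions into inclusive (start, end) index ranges.

    `n` is the length of the target data sequence; `margin` optionally pads
    each position with neighbouring points on both sides.
    """
    ranges = []
    for p in positions:
        lo, hi = max(0, p - margin), min(n - 1, p + margin)
        if ranges and lo <= ranges[-1][1] + 1:
            # Touching or overlapping the previous range: extend it.
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], hi))
        else:
            ranges.append((lo, hi))
    return ranges
```

Each resulting range carries a start position and an end position, with the middle positions implied by the interval.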
  • machine learning methods that can be used here include the Local Outlier Factor (LOF) algorithm, the DBSCAN algorithm, and the isolation forest (Isolation Forest, iForest) algorithm.
  • take the isolation forest algorithm as an example. First, an iTree (isolation tree) is built based on the target data sequence, with the data in the target data sequence used as the tree's sample data. The first abnormal feature value is then used to perform a binary division of the sample data, separating the sample data that meet the first abnormal feature value from the sample data that do not, which forms two data sets. The same process is repeated on these two data sets until the data can no longer be divided or the maximum height of the tree is reached. The first data position corresponding to the first abnormal characteristic value in the target data sequence is thereby obtained, so that the first data search space can be determined according to the first data position.
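  • for intuition, the following sketch scores one-dimensional data by isolation depth. It follows the commonly described isolation forest idea, in which split values are drawn at random, rather than the exact division rule described in this embodiment; it also re-randomizes the splits for each point instead of storing trees. All names and parameter defaults are assumptions.

```python
import random


def path_length(value, sample, rng, depth=0, max_depth=8):
    """Depth at which `value` is isolated by random binary splits of `sample`."""
    if depth >= max_depth or len(sample) <= 1:
        return depth
    lo, hi = min(sample), max(sample)
    if lo == hi:
        return depth  # remaining data can no longer be divided
    split = rng.uniform(lo, hi)
    # Keep only the side of the split that contains `value`.
    side = [x for x in sample if (x < split) == (value < split)]
    return path_length(value, side, rng, depth + 1, max_depth)


def isolation_scores(data, n_trees=100, sample_size=64, seed=0):
    """Average isolation depth per point; a smaller depth suggests an anomaly."""
    rng = random.Random(seed)
    totals = [0.0] * len(data)
    for _ in range(n_trees):
        sample = rng.sample(data, min(sample_size, len(data)))
        for i, v in enumerate(data):
            totals[i] += path_length(v, sample, rng)
    return [t / n_trees for t in totals]
```

Points whose average depth is well below that of the bulk of the data are candidates for the first data position.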
  • step S310 includes but is not limited to the following steps:
  • Step S311 Obtain the first baseline prediction data of the target data sequence
  • Step S312 Obtain a first abnormal characteristic value according to the deviation value between the first baseline prediction data and the data in the target data sequence.
  • the baseline prediction data of the normal data in the target data sequence may be obtained first, and the first abnormal characteristic value is then obtained from the deviation value (that is, the absolute difference) between the data in the target data sequence and the baseline prediction data.
  • different baseline prediction methods can be used to obtain baseline prediction data of normal data in the target data sequence, which is not specifically limited in this embodiment.
  • the baseline prediction method can be a time series baseline forecasting method such as the difference method, the moving average method, the weighted moving average method, the exponentially weighted moving average method, the autoregressive integrated moving average (ARIMA) method, or the triple exponential smoothing method; regression methods such as random forest and XGBoost (eXtreme Gradient Boosting) can also be used.
  • multiple first abnormal feature values can be obtained by using multiple baseline prediction methods, and the first data search space can then be obtained by combining the different first abnormal feature values in the corresponding steps, which can improve the accuracy and generalization ability of acquiring the first data search space.
  • the first abnormal characteristic value can be obtained by the following formula:
  • $R = |X_i - P_i|$
  • where $R$ is the first abnormal characteristic value, $X_i$ is the i-th data point in the target data sequence, and $P_i$ is the first baseline prediction data for that point of the target data sequence.
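  • as an illustration of steps S311 and S312, the sketch below uses a simple trailing moving average as the baseline and takes the absolute difference between each data point and its baseline prediction as the deviation. The function names and the `window` parameter are assumptions; any of the other baseline methods could be substituted.

```python
def moving_average_baseline(series, window=3):
    """Predict each point as the mean of up to `window` preceding points.

    The first point has no history, so it is predicted as itself.
    """
    baseline = []
    for i in range(len(series)):
        history = series[max(0, i - window):i] or [series[i]]
        baseline.append(sum(history) / len(history))
    return baseline


def abnormal_feature_values(series, baseline):
    """Absolute difference |X_i - P_i| for every point in the sequence."""
    return [abs(x - p) for x, p in zip(series, baseline)]
```

Note that a trailing average is itself pulled upward by a spike, so the points just after an anomaly also show a nonzero deviation; a robust baseline such as a rolling median would reduce that effect.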
  • step S400 includes but is not limited to the following steps:
  • Step S410 Determine a third data segment in the first data search space
  • Step S420 Perform similarity calculation on the first abnormal data segment and the third data segment to obtain a first similarity metric value corresponding to the third data segment;
  • Step S430 Determine the corresponding third data segment as the second abnormal data segment according to the first similarity metric value.
  • a similarity measurement algorithm is used to calculate the similarity between the first abnormal data segment and the third data segment, which yields the first similarity metric value corresponding to the third data segment.
  • when the first similarity metric value indicates that the first abnormal data segment is similar to the third data segment, the corresponding third data segment can be determined to be a second abnormal data segment (that is, one of the remaining abnormal data segments in the first data search space).
  • this embodiment can therefore improve the efficiency of labeling abnormal data, thereby saving human resources and time resources.
  • the number of the third data segment may be one or multiple, which is not specifically limited in this embodiment.
  • when the number of the third data segment is one, all the data in the first data search space can be determined as the third data segment, or a continuous part of the data in the first data search space can be determined as the third data segment, which is not specifically limited in this embodiment; when the number of the third data segment is multiple, the data in the first data search space can be divided into multiple data segments of equal length or into multiple data segments of unequal length, which is not specifically limited in this embodiment.
  • the similarity between the first abnormal data segment and the third data segment can be calculated with different similarity measurement algorithms. For example, for multiple third data segments of equal length, Euclidean distance, the Pearson correlation coefficient, or the Spearman rank correlation coefficient can be used to calculate the similarity between the first abnormal data segment and each third data segment. As another example, for multiple third data segments of unequal length, the DTW algorithm or an improved fast DTW algorithm can be used to calculate the similarity between the first abnormal data segment and each third data segment.
  • the specific implementation manner for calculating the similarity between the first abnormal data segment and the third data segment can be appropriately selected according to actual use needs, and this embodiment does not specifically limit it.
  • the improved fast DTW algorithms can include the FastDTW algorithm, the SparseDTW algorithm, the LB_Keogh algorithm, the LB_Improved algorithm, and so on. The FastDTW algorithm, for example, reduces computational complexity by restricting the search space and abstracting the data, while limiting the loss of accuracy.
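  • for reference, the basic (unaccelerated) DTW distance can be written as a short dynamic program. This sketch is the classic quadratic-time form with an absolute-difference point cost, not FastDTW or any of the other accelerated variants; the function name is an assumption.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Works for segments of unequal length; a smaller value means more similar.
    """
    inf = float("inf")
    # prev[j] holds the best cumulative cost of aligning the processed
    # prefix of `a` with the first j points of `b`.
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]
```

Because one point may align with several points of the other segment, a template and a stretched copy of it have distance 0, which is what makes DTW suitable for third data segments of unequal length.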
  • step S430 includes but is not limited to the following steps:
  • Step S431 When the first similarity metric value is less than the preset threshold, it is determined that the third data segment corresponding to the first similarity metric value is the second abnormal data segment.
  • the first similarity metric value indicates the degree of similarity between the first abnormal data segment and the third data segment: the smaller the value, the more similar the two segments are. Therefore, when the first similarity metric value is less than the preset threshold, the first abnormal data segment and the third data segment can be determined to have a high degree of similarity, so that the third data segment corresponding to that first similarity metric value can be determined to be the second abnormal data segment.
  • the preset threshold can be appropriately selected according to the similarity measurement algorithm used. For example, different preset thresholds can be used for the Euclidean distance and the DTW algorithm, which is not specifically limited in this embodiment.
  • step S430 may include but is not limited to the following steps:
  • Step S432 Obtain a first similarity metric value whose value is less than a preset threshold
  • Step S433 Sort the first similarity measure values whose values are less than the preset threshold value from small to large to adjust the sorting of the corresponding third data segment;
  • Step S434 Determine that the first N third data segments are second abnormal data segments, where N is greater than or equal to 1.
  • when there are two or more third data segments, there are likewise two or more corresponding first similarity metric values.
  • the FastDTW algorithm can be used to compare each third data segment in the first data search space with the first abnormal data segment to calculate the similarity metric value of each third data segment. All third data segments are then sorted according to their similarity metric values to obtain the several third data segments with the highest degree of similarity to the first abnormal data segment, so that the abnormal data segments that need to be labeled can be determined from these third data segments.
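  • steps S432 to S434 amount to a filter, a sort, and a cut. A minimal sketch, assuming each third data segment is represented by an identifier paired with its first similarity metric value (the function name and this representation are hypothetical):

```python
def select_top_matches(scored_segments, threshold, n):
    """Return the ids of the n most similar segments below the threshold.

    `scored_segments` is an iterable of (segment_id, metric) pairs, where a
    smaller metric means the segment is more similar to the template.
    """
    below = [(seg, m) for seg, m in scored_segments if m < threshold]
    below.sort(key=lambda item: item[1])  # small to large
    return [seg for seg, _ in below[:n]]
```

If fewer than `n` segments fall below the threshold, all of them are returned.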
  • the optimal value of N may differ from case to case. If the value of N is too small, some abnormal data segments will be missed and not marked; if the value of N is too large, some data segments with relatively low similarity may be identified as abnormal, which reduces accuracy. Therefore, the value of N needs to be appropriately selected according to the actual application. To select the value of N more precisely, the AUC value can be calculated from the curve of accuracy rate against recall rate to obtain the best value of N.
  • the accuracy rate refers to the proportion of the labeled abnormal data segments that are correctly labeled.
  • the recall rate refers to the proportion of manually labeled abnormal data segments that are correctly labeled.
  • AUC (Area Under Curve) refers to the area under the curve.
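  • the two rates and the area under their curve can be computed as follows. This is a sketch under assumptions: segments are compared by identifier, the accuracy rate is treated as precision, and the area is taken with the trapezoidal rule over (recall, precision) points; how the patent derives the best N from the AUC is not specified in the text above.

```python
def precision_recall(labeled, truth):
    """Precision and recall of labeled segment ids against manually labeled ids."""
    labeled, truth = set(labeled), set(truth)
    hits = len(labeled & truth)
    precision = hits / len(labeled) if labeled else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall


def pr_auc(points):
    """Trapezoidal area under a list of (recall, precision) points."""
    pts = sorted(points)
    return sum((r2 - r1) * (p1 + p2) / 2 for (r1, p1), (r2, p2) in zip(pts, pts[1:]))
```

One way to use these is to compute a (recall, precision) point for each candidate N on a small manually labeled sample and compare the resulting curves.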
  • step S100 may include but is not limited to the following steps:
  • Step S110 Obtain multiple data sequences to be tested
  • Step S120 Perform clustering processing on the multiple data sequences to be tested to obtain target data categories
  • Step S130 Determine a target data sequence from each target data category.
  • the number of data sequences to be tested collected from the network is very large. If the first abnormal data segment in each data sequence to be tested is determined manually, the workload is relatively large. In addition, a data sequence to be tested may not contain a large amount of similar abnormal data because its acquisition time is short, in which case obtaining the first abnormal data segment in that data sequence is more difficult. However, for one data indicator, many data sequences to be tested can be collected according to the different resource objects bound to it in the network. For example, a medium-scale network has tens of thousands of port resources, so tens of thousands of data sequences to be tested can be collected for a single data indicator, and these data sequences often have certain similarities.
  • for example, consider the traffic time series data of a base station access port deployed in school A and the traffic time series data of a base station access port deployed in school B. Since the daily life patterns of school A and school B are similar, these two data sequences to be tested will be similar to a large extent, and in similar data sequences the characteristics of the abnormal data also have certain commonalities. Based on this, the multiple acquired data sequences to be tested can be clustered to obtain target data categories, and a target data sequence can then be determined from each target data category, which provides the necessary basis for the subsequent steps.
  • the number of target data categories obtained by clustering the multiple data sequences to be tested may be one or more, depending on how similar the data sequences are. For example, if all of the data sequences to be tested are relatively similar, they can all be classified into one target data category; if only some of the data sequences to be tested are similar to one another, they can be divided into multiple target data categories, each of which includes a part of the data sequences to be tested.
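  • the text does not name a clustering algorithm, so the following sketch uses a deliberately simple stand-in: greedy leader clustering on the Pearson correlation between equal-length sequences, with the first member of each cluster serving as the target data sequence. The function names, the `min_corr` threshold, and the choice of representative are all assumptions.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # undefined for a constant sequence; treated as unrelated
    return cov / math.sqrt(va * vb)


def cluster_sequences(sequences, min_corr=0.9):
    """Greedy leader clustering over a list of sequences.

    Returns a list of clusters, each a list of indices into `sequences`;
    element 0 of each cluster is its leader (the candidate target sequence).
    """
    clusters = []
    for i, seq in enumerate(sequences):
        for members in clusters:
            if pearson(sequences[members[0]], seq) >= min_corr:
                members.append(i)
                break
        else:
            clusters.append([i])  # no similar leader found: start a new category
    return clusters
```

The result depends on input order, which is acceptable for a sketch; a production system would more likely use an order-independent method such as k-means or hierarchical clustering.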
  • the data processing method may further include the following steps:
  • Step S600 Acquire a second data search space in each of the remaining data sequences to be tested in each target data category;
  • Step S700 Use the first abnormal data segment in the target data sequence to obtain a second abnormal data segment from the second data search space of each of the remaining data sequences to be tested.
  • the second abnormal data segment in the target data sequence can be obtained and labeled through the steps in the above-mentioned embodiment.
  • The first abnormal data segment obtained in the target data sequence can also be applied to the remaining data sequences to be tested in the same target data category in order to obtain and label their second abnormal data segments. Therefore, the second data search space can first be acquired from each of the remaining data sequences to be tested in each target data category, and the first abnormal data segment in the target data sequence can then be used to obtain the second abnormal data segment from the second data search space of each remaining data sequence to be tested. Because the first abnormal data segment in the target data sequence alone is sufficient for this, a first abnormal data segment does not have to be obtained separately in each of the remaining data sequences to be tested.
  • The second abnormal data segment in each data sequence to be tested can thus be obtained more concisely and efficiently, which improves the efficiency of labeling abnormal data in multiple time series indicator data.
  • The second data search space in this embodiment and the first data search space in the above embodiment are technical features of the same type. They differ only in the object to which they belong: the first data search space belongs to the target data sequence, whereas the second data search space belongs to the remaining data sequences to be tested in the same target data category. The second data search space is therefore not described in detail here.
  • Step S700 in this embodiment is similar to step S400 in the foregoing embodiment. The execution object of step S400 is the first data search space of the target data sequence, whereas the execution object of step S700 in this embodiment is the second data search space of the remaining data sequences to be tested in the same target data category. Step S700 is therefore not described in detail here; for an explanation of step S700, reference may be made to the related explanations of step S400 in the foregoing embodiment.
  • step S600 includes but is not limited to the following steps:
  • Step S610 Obtain the second abnormal characteristic value of the remaining data sequence to be tested in each target data category
  • Step S620 Determine the second data position corresponding to the second abnormal characteristic value in the remaining data sequence to be tested according to the second abnormal characteristic value;
  • Step S630 Obtain the second data search space of the remaining data sequence to be tested according to the second data position.
  • The second abnormal feature value and the second data location in this embodiment belong to the same types of technical features as the first abnormal feature value and the first data location in the foregoing embodiment, respectively. They differ only in the object to which they are attributed: the first abnormal feature value and the first data location are attributed to the target data sequence, whereas the second abnormal feature value and the second data location are attributed to the remaining data sequences to be tested in the same target data category. The second abnormal feature value and the second data location are therefore not described in detail here; for their explanation, refer to the explanation of the first abnormal feature value and the first data location in the above embodiment.
  • Step S610, step S620, and step S630 in this embodiment are similar to step S310, step S320, and step S330 in the embodiment shown in FIG. 3, with similar technical principles and technical effects. They differ only in the execution object: the execution object of step S310, step S320, and step S330 is the target data sequence, whereas the execution object of step S610, step S620, and step S630 in this embodiment is the remaining data sequences to be tested in the same target data category. Step S610, step S620, and step S630 are therefore not described in detail here; for their explanation, refer to the related explanations of step S310, step S320, and step S330 in the above embodiment.
  • step S610 includes but is not limited to the following steps:
  • Step S611 Obtain the second baseline prediction data of the remaining data sequence to be tested in each target data category
  • Step S612 Obtain the second abnormal characteristic value of the remaining data sequence to be tested according to the deviation value of the second baseline prediction data and the data in the remaining data sequence to be tested.
  • The second baseline prediction data in this embodiment and the first baseline prediction data in the foregoing embodiment belong to the same type of technical feature. They differ only in the object to which they are attributed: the first baseline prediction data belongs to the target data sequence, whereas the second baseline prediction data belongs to the remaining data sequences to be tested in the same target data category. The second baseline prediction data is therefore not described in detail here.
  • Step S611 and step S612 in this embodiment are similar to step S311 and step S312 in the embodiment shown in FIG. 4, with similar technical principles and technical effects. They differ only in the execution object: the execution object of step S311 and step S312 is the target data sequence, whereas the execution object of step S611 and step S612 in this embodiment is the remaining data sequences to be tested in the same target data category. Step S611 and step S612 are therefore not described in detail here.
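  • As a rough illustration of steps S611 and S612, and of how the resulting feature values could feed steps S620 and S630, the sketch below uses a centered moving median as a stand-in baseline predictor. The moving median, the `threshold` used to pick abnormal positions, and the `radius` used to widen each position into a search range are all assumptions made for illustration; this section does not fix any of them.

```python
from statistics import median


def moving_median_baseline(series, window=5):
    """Stand-in baseline predictor (an assumption): centered moving median."""
    half = window // 2
    return [median(series[max(0, i - half): i + half + 1])
            for i in range(len(series))]


def abnormal_feature_values(series, baseline):
    """S611/S612: absolute deviation of each point from its baseline value."""
    return [abs(x - b) for x, b in zip(series, baseline)]


def search_space(features, threshold, radius=2):
    """S620/S630 (hypothetical rule): take every index whose feature value
    exceeds `threshold`, widen it by `radius` points on both sides, and merge
    overlapping or touching ranges. Ranges are (start, end), end exclusive."""
    ranges = []
    for i, value in enumerate(features):
        if value <= threshold:
            continue
        start = max(0, i - radius)
        end = min(len(features), i + radius + 1)
        if ranges and start <= ranges[-1][1]:
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], end))
        else:
            ranges.append((start, end))
    return ranges


series = [1, 1, 1, 1, 9, 1, 1, 1, 1, 1]
baseline = moving_median_baseline(series)
features = abnormal_feature_values(series, baseline)
space = search_space(features, threshold=3)
```

  • With the flat series and single spike above, only the neighborhood of the spike ends up in the search space, which is the part of a remaining data sequence that would then be scanned with the abnormal data segment template.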
  • step S700 includes but is not limited to the following steps:
  • Step S710 Determine the fourth data segments in the second data search space of each of the remaining data sequences to be tested;
  • Step S720 Perform similarity calculation on the first abnormal data segment in the target data sequence and the fourth data segment in the remaining data sequences to be tested to obtain a second similarity metric value corresponding to the fourth data segment;
  • Step S730 Determine the corresponding fourth data segment in the remaining data sequence to be tested as the second abnormal data segment in the remaining data sequence to be tested according to the second similarity metric value.
  • The fourth data segment and the second similarity metric value in this embodiment belong to the same types of technical features as the third data segment and the first similarity metric value in the foregoing embodiment, respectively. They differ only in the object to which they are attributed: the third data segment belongs to the first data search space of the target data sequence, and the first similarity metric value corresponds to the third data segment; the fourth data segment belongs to the second data search space of the remaining data sequences to be tested in the same target data category, and the second similarity metric value corresponds to the fourth data segment. The fourth data segment and the second similarity metric value are therefore not described in detail here; for their explanation, refer to the explanation of the third data segment and the first similarity metric value in the above embodiment.
  • Step S710, step S720, and step S730 in this embodiment are similar to step S410, step S420, and step S430 in the embodiment shown in FIG. 5, with similar technical principles and technical effects. They differ only in the execution object: the execution object of step S410, step S420, and step S430 is the first data search space of the target data sequence, whereas the execution object of step S710, step S720, and step S730 in this embodiment is the second data search space of the remaining data sequences to be tested in the same target data category. Step S710, step S720, and step S730 are therefore not described in detail here; for their explanation, refer to the related explanations of step S410, step S420, and step S430 in the above embodiment.
  • step S730 includes but is not limited to the following steps:
  • Step S731 When the second similarity metric value is less than the preset threshold, it is determined that the corresponding fourth data segment in the remaining data sequence to be tested is the second abnormal data segment in the remaining data sequence to be tested.
  • The second similarity metric value represents the degree of similarity between the first abnormal data segment and the fourth data segment: the smaller the second similarity metric value, the higher the degree of similarity between the two. Therefore, when the second similarity metric value is less than the preset threshold, the first abnormal data segment and the fourth data segment can be considered highly similar, and the fourth data segment corresponding to that second similarity metric value can be determined as the second abnormal data segment.
  • The preset threshold can be selected according to the similarity measurement algorithm used. For example, different preset thresholds can be used for the Euclidean distance and the DTW algorithm; this is not specifically limited in this embodiment.
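  • The comparison in steps S720 and S731 can be sketched as follows, with the first abnormal data segment acting as a template. Both metrics are distances, so smaller values mean more similar segments. The sliding-window enumeration of candidate segments, the function names, and the example threshold are hypothetical; the text only names the Euclidean distance and DTW as possible metrics.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length segments."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs equal-length segments")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Dynamic time warping distance with absolute-difference point cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)       # row for "no points of a consumed"
    for x in a:
        cur = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cur[j] = abs(x - y) + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return prev[len(b)]


def match_segments(template, series, space, threshold, metric=euclidean):
    """Slide a template-sized window over series[start:end] and keep the
    windows whose metric value is below `threshold` (S731)."""
    start, end = space
    m = len(template)
    hits = []
    for s in range(start, end - m + 1):
        value = metric(template, series[s:s + m])
        if value < threshold:
            hits.append((s, value))
    return hits


template = [0, 5, 0]
series = [0, 0, 0, 5, 0, 0, 0, 4, 0, 0]
hits = match_segments(template, series, space=(0, len(series)), threshold=1.5)
```

  • Because the two metrics produce values on different scales, a threshold tuned for one of them generally cannot be reused for the other, which is why the threshold is chosen per algorithm.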
  • step S730 may include but is not limited to the following steps:
  • Step S732 For each of the remaining data sequences to be tested, acquire the second similarity metric values that are less than a preset threshold;
  • Step S733 Sort the second similarity metric values that are less than the preset threshold in ascending order, and order the corresponding fourth data segments in each remaining data sequence to be tested accordingly;
  • Step S734 Determine the first N fourth data segments as the second abnormal data segments in the remaining data sequences to be tested, where N is greater than or equal to 1.
  • Step S732, step S733, and step S734 in this embodiment are similar to step S432, step S433, and step S434 in the embodiment shown in FIG. 6, with similar technical principles and technical effects. They differ only in the execution object: the execution object of step S432, step S433, and step S434 is the target data sequence, whereas the execution object of step S732, step S733, and step S734 in this embodiment is the remaining data sequences to be tested in the same target data category. Step S732, step S733, and step S734 are therefore not described in detail here; for their explanation, refer to step S432, step S433, and step S434 in the above embodiment.
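  • Steps S732 to S734 amount to a filter, an ascending sort, and a top-N cut. A minimal sketch, assuming each candidate is represented as a hypothetical `(segment_start, metric_value)` pair:

```python
def top_n_segments(candidates, threshold, n=1):
    """Pick the N most similar candidate segments.

    candidates: list of (segment_start, metric_value) pairs, where a smaller
    metric value means a more similar segment.
    """
    if n < 1:
        raise ValueError("N must be greater than or equal to 1")
    kept = [c for c in candidates if c[1] < threshold]   # S732: below threshold
    kept.sort(key=lambda c: c[1])                        # S733: ascending order
    return kept[:n]                                      # S734: first N segments


candidates = [(10, 0.9), (3, 0.2), (7, 1.4), (20, 0.5)]
selected = top_n_segments(candidates, threshold=1.0, n=2)
```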
  • step S120 may include but is not limited to the following steps:
  • Step S121 Perform data preprocessing on multiple data sequences to be tested, respectively, to obtain multiple first preprocessed data sequences;
  • Step S122 Perform baseline extraction processing on the multiple first pre-processed data sequences, respectively, to obtain multiple second pre-processed data sequences;
  • Step S123 clustering a plurality of second pre-processed data sequences according to similarity, to obtain a target data category.
  • After the multiple data sequences to be tested are acquired, data preprocessing may be performed on each of them to obtain multiple first preprocessed data sequences; baseline extraction processing is then performed on each first preprocessed data sequence to obtain multiple second preprocessed data sequences; and the multiple second preprocessed data sequences are then clustered according to similarity to obtain the target data categories.
  • In this way, abnormal data segment labeling for each data sequence to be tested is transformed into abnormal data segment labeling for each target data category, which reduces processing complexity and processing time and thereby improves the efficiency of labeling abnormal data.
  • performing baseline extraction processing on the first pre-processed data sequence can smooth out abnormal parts and noise parts in the data sequence to be tested, thereby improving the accuracy of the similarity measurement between the data sequences to be tested.
  • The baseline extraction processing in step S122 in this embodiment has a technical principle similar to that of obtaining baseline prediction data with the baseline prediction method in the foregoing embodiment. For the relevant explanations of the baseline extraction processing in step S122, reference may be made to the relevant explanations of using the baseline prediction method to obtain baseline prediction data in the foregoing embodiment.
  • step S121 may include but is not limited to the following steps:
  • Step S1211 Perform missing value filling processing on each of the multiple data sequences to be tested to obtain multiple filled data sequences;
  • Step S1212 Perform data standardization processing on each of the multiple filled data sequences to obtain multiple first preprocessed data sequences.
  • The data sequences to be tested collected from the network may have missing values to varying degrees for various reasons. These missing values not only make the lengths of the data sequences to be tested differ, which makes some similarity measurement algorithms difficult to use, but also affect the accuracy of the baseline extraction processing.
  • To address this, this embodiment first fills these missing values to obtain a filled data sequence, and then performs data standardization processing on the filled data sequence to obtain the first preprocessed data sequence.
  • A linear interpolation filling method may be used to perform the missing value filling processing.
  • The linear interpolation filling method keeps the waveform of the data sequence to be tested smooth, which facilitates the baseline extraction processing. For example, for time series indicator data, the specific location of a missing value can be determined from the continuity in time. Once the location is determined, the specific value to be filled can be obtained from the data before and after that location; for example, the average of the preceding and following data can be used as the fill value.
  • The linear interpolation filling method is an algorithm commonly used in the art, so its specific principle is not repeated here.
  • Performing data standardization processing on the filled data sequence maps the data sequence to be tested onto a common scale, which helps eliminate differences in dimension between different data sequences to be tested so that their similarity can be compared together.
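  • A minimal sketch of the missing value filling in step S1211, assuming the sequence has already been aligned to a regular time grid and that missing samples are encoded as `None` (both are assumptions; the text only says that missing positions are found from the continuity in time). Gaps at the very start or end cannot be interpolated, so this sketch copies the nearest known value there.

```python
def fill_missing_linear(series):
    """Fill None gaps by linear interpolation between the nearest known points."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no known values to interpolate from")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:            # leading gap: copy the first known value
            filled[i] = series[right]
        elif right is None:         # trailing gap: copy the last known value
            filled[i] = series[left]
        else:                       # interior gap: interpolate linearly
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


single_gap = fill_missing_linear([1.0, None, 3.0])
long_gap = fill_missing_linear([None, 2.0, None, None, 8.0, None])
```

  • For a single missing point between two known neighbors, the interpolated value is exactly the average of the preceding and following data, which matches the example given in the text.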
  • The Z-Score method may be used for data standardization processing, and the calculation formula is as follows:
  • $x'_i = \dfrac{x_i - \bar{x}}{\sigma}$
  • where $x'_i$ is the first preprocessed data sequence, $x_i$ is the data sequence to be tested, $\bar{x}$ is the mean value of the data sequence to be tested, and $\sigma$ is the standard deviation of the data sequence to be tested.
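  • The Z-Score standardization in step S1212 can be sketched as below. Whether the population or the sample standard deviation is intended is not stated; the population form (`pstdev`) is assumed here, and a constant sequence is mapped to all zeros to avoid dividing by zero.

```python
from statistics import fmean, pstdev


def z_score(series):
    """Standardize a sequence to zero mean and unit (population) variance."""
    mean = fmean(series)
    std = pstdev(series)
    if std == 0:
        # Constant sequence: every point equals the mean.
        return [0.0 for _ in series]
    return [(x - mean) / std for x in series]


standardized = z_score([2, 4, 4, 4, 5, 5, 7, 9])
```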
  • the clustering of multiple second pre-processed data sequences according to similarity in step S123 may specifically include, but is not limited to, the following steps:
  • Step S1231 Cluster the multiple second preprocessed data sequences according to similarity by using the DBSCAN algorithm, where the parameters of the DBSCAN algorithm include a distance function, a neighborhood number threshold, and a neighborhood distance threshold, and the results of the DBSCAN algorithm include the number of categories and the abnormal proportion.
  • The DBSCAN algorithm is a commonly used clustering algorithm that does not require the number of cluster centers to be determined in advance.
  • Its key parameters are the distance function, the neighborhood number threshold, and the neighborhood distance threshold, and its results include the number of categories and the abnormal proportion.
  • For the distance function, the Euclidean distance function can be used in this embodiment; the neighborhood number threshold can be set to 4 in this embodiment; the neighborhood distance threshold needs to be estimated dynamically from the data set, and this parameter has a significant impact on the clustering results.
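  • To make the three parameters and the two results concrete, here is a small self-contained DBSCAN sketch. `eps` plays the role of the neighborhood distance threshold and `min_pts` that of the neighborhood number threshold. Two points are assumptions of this sketch rather than statements of the text: a point counts itself among its neighbors, and the abnormal proportion is taken to be the share of sequences left unclustered (label -1).

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dbscan(items, eps, min_pts=4, dist=euclidean):
    """Return one label per item: 0, 1, ... for clusters, -1 for outliers."""
    n = len(items)
    neighbors = [[j for j in range(n) if dist(items[i], items[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1          # provisional outlier, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])   # j is a core point: keep expanding
    return labels


def summarize(labels):
    """Number of categories and abnormal proportion of a clustering result."""
    categories = len({label for label in labels if label != -1})
    return categories, labels.count(-1) / len(labels)


items = [[0.0], [0.1], [0.2], [0.3], [5.0], [5.1], [5.2], [5.3], [20.0]]
labels = dbscan(items, eps=0.35, min_pts=4)
categories, abnormal_proportion = summarize(labels)
```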
  • the neighborhood distance threshold may be obtained by a heuristic algorithm, where the heuristic algorithm includes but is not limited to the following steps:
  • Step S810 Calculate the similarity between a plurality of second pre-processed data sequences by using a distance function to obtain similarity matrix data;
  • Step S820 Calculate the k-dist distance based on the similarity matrix data to obtain the k-dist sequence
  • Step S830 obtaining an initial distance threshold parameter based on the k-dist sequence
  • Step S840 Adjust the initial distance threshold parameter to obtain the neighborhood distance threshold.
  • the k-dist distance refers to the distance between a data object and its k-th closest object.
  • To obtain the neighborhood distance threshold, the similarity between the multiple second preprocessed data sequences can first be calculated with a distance function such as the Euclidean distance function to form similarity matrix data; the k-dist distances are then calculated from the similarity matrix data to obtain a k-dist sequence; an initial distance threshold parameter is obtained from the k-dist sequence; and the initial distance threshold parameter is then adjusted to obtain a suitable neighborhood distance threshold.
  • Once obtained, the neighborhood distance threshold can be used in the above embodiment, in which the DBSCAN algorithm clusters the multiple second preprocessed data sequences according to similarity, so that the target data categories can be obtained.
  • Initial thresholds such as the neighborhood maximum distance threshold, the minimum length threshold, the slope threshold, and the slope difference threshold may be set first; step S810, step S820, step S830, and step S840 are performed after these initial thresholds have been set.
  • When step S820 is performed, after the k-dist distance of each k-dist point is calculated from the similarity matrix data, the k-dist distances can be sorted in ascending order, and the k-dist points whose k-dist distance is 0 and the k-dist points whose k-dist distance exceeds the neighborhood maximum distance threshold are excluded. The remaining k-dist points constitute the k-dist sequence.
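  • Steps S810 and S820 can be sketched as follows. The similarity matrix is taken to be the matrix of pairwise distances, and the k-th closest object is counted without the object itself; both readings, as well as the helper names, are assumptions of this sketch.

```python
def distance_matrix(items, dist):
    """S810: pairwise distances between all preprocessed sequences."""
    return [[dist(a, b) for b in items] for a in items]


def k_dist_sequence(matrix, k, max_dist=float("inf")):
    """S820: distance from each object to its k-th closest other object,
    sorted ascending, without zero distances and without distances above the
    neighborhood maximum distance threshold."""
    k_dists = []
    for i, row in enumerate(matrix):
        others = sorted(d for j, d in enumerate(row) if j != i)
        if len(others) >= k:
            k_dists.append(others[k - 1])
    return sorted(d for d in k_dists if 0 < d <= max_dist)


points = [0, 1, 3, 6, 10]
matrix = distance_matrix(points, dist=lambda a, b: abs(a - b))
sequence = k_dist_sequence(matrix, k=2, max_dist=5)
```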
  • step S830 may include but is not limited to the following steps:
  • Step S831 Calculate the slope between each k-dist point in the k-dist sequence and each of its two adjacent points (the preceding one and the following one); when both slopes are less than the preset slope threshold and the difference between the two slopes is less than the preset slope difference threshold, determine the current k-dist point as a candidate distance threshold;
  • Step S832 Determine the largest of the candidate distance thresholds as the initial distance threshold parameter.
  • The relatively flat k-dist points in the k-dist sequence may first be determined as candidate distance thresholds. Specifically, the slope between each k-dist point and each of its two adjacent points is calculated first. If the slope between the current k-dist point and the preceding adjacent point (which can be defined as the left slope) and the slope between the current k-dist point and the following adjacent point (which can be defined as the right slope) are both less than the preset slope threshold, and the difference between the left slope and the right slope is less than the preset slope difference threshold, the current k-dist point is determined as a candidate distance threshold.
  • After all candidate distance thresholds are obtained, they can be sorted in descending order, and the largest candidate distance threshold is taken as the initial distance threshold parameter.
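  • The flat-point search of steps S831 and S832 can be sketched as below. Consecutive k-dist points are assumed to be one unit apart on the horizontal axis, so each slope reduces to a simple difference; that spacing and the function name are assumptions.

```python
def initial_distance_threshold(k_dist, slope_threshold, slope_diff_threshold):
    """Return (initial threshold, candidates) from an ascending k-dist sequence.

    A point is a candidate when its left and right slopes are both below
    `slope_threshold` and differ by less than `slope_diff_threshold`.
    """
    candidates = []
    for i in range(1, len(k_dist) - 1):
        left = k_dist[i] - k_dist[i - 1]        # slope to the preceding point
        right = k_dist[i + 1] - k_dist[i]       # slope to the following point
        if (left < slope_threshold and right < slope_threshold
                and abs(left - right) < slope_diff_threshold):
            candidates.append(k_dist[i])
    if not candidates:
        return None, []
    return max(candidates), candidates


k_dist = [1.0, 1.5, 2.0, 2.5, 5.0, 9.0]
initial, candidates = initial_distance_threshold(
    k_dist, slope_threshold=1.0, slope_diff_threshold=0.25)
```

  • In this toy sequence the curve bends sharply after 2.5, so only the points on the flat stretch before the bend qualify, and the largest of them becomes the initial distance threshold parameter.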
  • step S840 may include but is not limited to the following steps:
  • Step S841 Obtain a step length;
  • Step S842 Adjust the initial distance threshold parameter according to the step length to obtain a distance adjustment threshold; when an adjustment causes the number of categories to decrease, the distance adjustment threshold obtained in the previous adjustment is taken as the neighborhood distance threshold.
  • After the initial distance threshold parameter is obtained, it can be optimized further. When it is optimized, the abnormal proportion should be reduced as much as possible on the premise that the number of categories remains unchanged.
  • As the distance threshold increases, the abnormal proportion keeps decreasing and the number of categories may also decrease. The value of the initial distance threshold parameter can therefore be increased gradually, one step length at a time, to determine the best neighborhood distance threshold: starting from the initial distance threshold parameter, the number of categories and the abnormal proportion are recalculated after each increase by one step length, and the increase stops once the number of categories drops. At that point, the distance adjustment threshold obtained in the previous adjustment is determined to be the optimal neighborhood distance threshold.
  • The step length can be set according to empirical values or according to the candidate distance thresholds. For example, in the latter case the step length can be set to one-tenth of the difference between the largest and the smallest candidate distance thresholds; this is not specifically limited in this embodiment.
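  • The stepwise adjustment can be sketched as a simple loop. `cluster_fn` is a hypothetical callable that runs the clustering for a given distance threshold and returns `(number_of_categories, abnormal_proportion)`; the `max_eps` guard is an added assumption so that the loop terminates even if the number of categories never drops.

```python
def step_from_candidates(candidates):
    """One-tenth of the spread between the largest and smallest candidates."""
    return (max(candidates) - min(candidates)) / 10


def tune_neighborhood_distance(initial_eps, step, cluster_fn, max_eps):
    """Raise the threshold step by step until the number of categories drops,
    then return the last threshold before the drop."""
    if step <= 0:
        raise ValueError("step must be positive")
    base_categories, _ = cluster_fn(initial_eps)
    best = initial_eps
    i = 1
    while initial_eps + i * step <= max_eps:
        eps = initial_eps + i * step
        categories, _ = cluster_fn(eps)
        if categories < base_categories:
            break                   # categories dropped: keep the previous value
        best = eps
        i += 1
    return best


# Toy stand-in for the clustering: three categories until eps reaches 2.0.
fake_cluster = lambda eps: (3 if eps < 2.0 else 2, 0.1)
tuned = tune_neighborhood_distance(1.0, 0.25, fake_cluster, max_eps=10.0)
```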
  • the heuristic algorithm specifically includes the following steps:
  • Step S901 Threshold setting.
  • Initial thresholds such as the neighborhood maximum distance threshold, the minimum length threshold, the slope threshold, and the slope difference threshold are set.
  • Step S902 Sequence similarity matrix calculation.
  • The similarity between each pair of data sequences is calculated with the distance function to form the similarity matrix data.
  • Step S903 k-dist distance calculation and sorting.
  • The k-dist distance of each k-dist point is calculated from the similarity matrix data, and the k-dist distances are sorted in ascending order.
  • Step S904 Filtering according to the maximum distance threshold.
  • The k-dist points whose k-dist distance is 0 and the k-dist points whose k-dist distance exceeds the neighborhood maximum distance threshold are excluded.
  • Step S905 Take the values of the k-dist sequence in order.
  • Step S906 Determine whether the left slope and the right slope are both less than the slope threshold.
  • The slope between each k-dist point and each of its two adjacent points is calculated. If the slope between the current k-dist point and the preceding adjacent point (which can be defined as the left slope) and the slope between the current k-dist point and the following adjacent point (which can be defined as the right slope) are both less than the preset slope threshold, step S907 is executed; otherwise, step S905 is executed.
  • Step S907 Determine whether the difference between the left slope and the right slope is less than the slope difference threshold.
  • When the difference between the left slope and the right slope is less than the slope difference threshold, step S908 is executed; otherwise, step S905 is executed.
  • Step S908 Determine the current k-dist point as a candidate distance threshold and repeat steps S905 to S907; when all candidate distance thresholds have been obtained, execute step S909.
  • Step S909 Sort the candidate thresholds in descending order and take the largest candidate threshold as the initial distance threshold parameter.
  • Step S910 Execute the clustering algorithm to obtain the number of categories and the abnormal proportion.
  • Step S911 Add one step length to the current distance threshold, starting from the initial distance threshold parameter, and execute the clustering algorithm again.
  • Step S912 Determine whether the number of categories has decreased; if so, execute step S913; otherwise, execute step S911.
  • Step S913 Determine the previous distance threshold as the best distance threshold.
  • step S130 may include but is not limited to the following steps:
  • Step S131 In each target data category, calculate the average of the distances between each data sequence to be tested and the remaining data sequences to be tested in that category, and determine the data sequence to be tested with the smallest average distance as the target data sequence.
  • After the target data categories are obtained by clustering, a core data sequence that represents the category may be determined for each target data category; that is, a target data sequence is determined from each target data category.
  • the target data sequence can be determined by the following formula:
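  • Read literally, step S131 selects the member of each category with the smallest mean distance to the other members. A minimal sketch under that reading, assuming the Euclidean distance and equal-length sequences (the function name and the exact normalization are assumptions):

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pick_target_sequence(members, dist=euclidean):
    """Return the index of the member with the smallest average distance
    to all other members of the same target data category."""
    if not members:
        raise ValueError("a target data category must contain a sequence")
    if len(members) == 1:
        return 0

    def mean_distance(i):
        total = sum(dist(members[i], other)
                    for j, other in enumerate(members) if j != i)
        return total / (len(members) - 1)

    return min(range(len(members)), key=mean_distance)


category = [[0, 0], [1, 1], [2, 2], [3, 3], [10, 10]]
target_index = pick_target_sequence(category)
```

  • The sequence chosen this way is always an actual member of the category, so its first abnormal data segment is real observed data that can serve as the template for the rest of the category.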
  • FIG. 17 is a main flow diagram of the data processing method provided in this example. Based on the main flow diagram shown in FIG. 17, the data processing method specifically includes the following steps:
  • The corresponding abnormal data segment template is automatically obtained from the core sequence of the category to which each sequence belongs, and a search for similar abnormal segments is performed in the data search space of the sequence to obtain N similar abnormal segments with a relatively high degree of similarity; these N similar abnormal segments are then labeled as abnormal;
  • another embodiment of the present invention also provides a device, which includes a memory, a processor, and a computer program stored on the memory and running on the processor.
  • the processor and the memory can be connected by a bus or in other ways.
  • the memory can be used to store non-transitory software programs and non-transitory computer-executable programs.
  • the memory may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or other non-transitory solid-state storage devices.
  • the memory includes a memory remotely arranged with respect to the processor, and these remote memories may be connected to the processor through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • The device in this embodiment may include the system architecture platform in the embodiment shown in FIG. 1. The device in this embodiment and the system architecture platform in the embodiment shown in FIG. 1 belong to the same inventive concept, so the two have the same implementation principle and technical effect, which are not detailed here.
  • When the computer program is executed by the processor, the data processing method in the foregoing embodiment is performed, for example, method steps S100 to S500 in FIG. 2, method steps S310 to S330 in FIG. 3, method steps S311 to S312 in FIG. 4, method steps S410 to S430 in FIG. 5, method steps S432 to S434 in FIG. 6, method steps S110 to S130 in FIG. 7, method steps S600 to S700 in FIG. 8, method steps S610 to S630 in FIG. 9, method steps S611 to S612 in FIG. 10, method steps S710 to S730 in FIG. 11, method steps S732 to S734 in FIG. 12, method steps S121 to S123 in FIG. 13, method steps S1211 to S1212 in FIG. 14, method steps S810 to S840 in FIG. 15, and method steps S901 to S913.
  • the device embodiments described above are merely illustrative, and the units described as separate components may or may not be physically separated, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • Another embodiment of the present invention also provides a computer-readable storage medium that stores computer-executable instructions. When the computer-executable instructions are executed by a processor or a controller, for example, by a processor in the foregoing device embodiment, they may cause the processor to execute the data processing method in the foregoing embodiment, for example, the above-described method steps S100 to S500 in FIG. 2, method steps S310 to S330 in FIG. 3, method steps S311 to S312 in FIG. 4, method steps S410 to S430 in FIG. 5, method steps S432 to S434 in FIG. 6, method steps S110 to S130 in FIG. 7, and method steps S600 to S700 in FIG. 8.
  • The embodiment of the present invention includes: acquiring a target data sequence; acquiring a first abnormal data segment in the target data sequence; acquiring a first data search space in the target data sequence; acquiring, according to the first abnormal data segment, a second abnormal data segment in the first data search space that corresponds to the first abnormal data segment; and labeling the second abnormal data segment.
  • The first abnormal data segment can be used as an abnormal data segment template, so that the corresponding second abnormal data segment can be acquired from the first data search space according to the template and labeled, which achieves the purpose of labeling the other abnormal data segments in the target data sequence. Therefore, compared with traditional manual labeling of abnormal data segments, the solution provided by the embodiment of the present invention can improve the efficiency of labeling abnormal data, thereby saving human resources and time.
  • Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data).
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • A communication medium usually contains computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transmission mechanism, and may include any information delivery medium.

Abstract

The present invention relates to a data processing method and device, and a computer-readable storage medium. The data processing method comprises: acquiring a target data sequence (S100); acquiring a first abnormal data segment from the target data sequence (S200); acquiring a first data search space from the target data sequence (S300); acquiring, according to the first abnormal data segment and the first data search space, a second abnormal data segment corresponding to the first abnormal data segment (S400); and labeling the second abnormal data segment (S500).
PCT/CN2021/086644 2020-05-29 2021-04-12 Data processing method and device, and computer-readable storage medium WO2021238455A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010473617.0 2020-05-29
CN202010473617.0A CN113742387A (zh) 2020-05-29 2020-05-29 数据处理方法、设备及计算机可读存储介质

Publications (1)

Publication Number Publication Date
WO2021238455A1 (fr) 2021-12-02

Family

ID=78724518

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/086644 WO2021238455A1 (fr) 2020-05-29 2021-04-12 Data processing method and device, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN113742387A (fr)
WO (1) WO2021238455A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013218725A (ja) * 2013-06-19 2013-10-24 Hitachi Ltd 異常検知方法及び異常検知システム
CN104636999A (zh) * 2015-01-04 2015-05-20 江苏联宏自动化系统工程有限公司 一种建筑异常用能数据检测方法
CN109882834A (zh) * 2019-03-27 2019-06-14 新奥数能科技有限公司 锅炉设备的运行数据监测方法及装置
CN110558971A (zh) * 2019-08-02 2019-12-13 苏州星空大海医疗科技有限公司 基于单目标及多目标的生成对抗网络心电图异常检测方法
CN111061711A (zh) * 2019-11-28 2020-04-24 同济大学 一种基于数据处理行为的大数据流卸载方法和装置

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114872290B (zh) * 2022-05-20 2024-02-06 深圳市信润富联数字科技有限公司 一种注塑件的自适应生产异常监测方法
CN114872290A (zh) * 2022-05-20 2022-08-09 深圳市信润富联数字科技有限公司 一种注塑件的自适应生产异常监测方法
CN115792479A (zh) * 2023-02-08 2023-03-14 东营市建筑设计研究院 一种智能插座的用电智能监测方法及系统
CN115792479B (zh) * 2023-02-08 2023-05-09 东营市建筑设计研究院 一种智能插座的用电智能监测方法及系统
CN115858894A (zh) * 2023-02-14 2023-03-28 温州众成科技有限公司 一种可视化的大数据分析方法
CN116029842A (zh) * 2023-03-28 2023-04-28 北京环球医疗救援有限责任公司 一种医疗保险大数据的清洗去噪方法及系统
CN116029842B (zh) * 2023-03-28 2023-06-20 北京环球医疗救援有限责任公司 一种医疗保险大数据的清洗去噪方法及系统
CN116383190A (zh) * 2023-05-15 2023-07-04 青岛场外市场清算中心有限公司 一种海量大数据智能清洗方法及系统
CN116383190B (zh) * 2023-05-15 2023-08-25 青岛场外市场清算中心有限公司 一种海量金融交易大数据智能清洗方法及系统
CN116331044A (zh) * 2023-05-31 2023-06-27 山东芯演欣电子科技发展有限公司 一种用于直流充电桩的充电数据存储系统
CN116331044B (zh) * 2023-05-31 2023-08-04 山东芯演欣电子科技发展有限公司 一种用于直流充电桩的充电数据存储系统
CN116994675A (zh) * 2023-09-28 2023-11-03 佳木斯大学 基于近红外数据的锦灯笼宿萼表皮检测方法
CN116994675B (zh) * 2023-09-28 2023-12-01 佳木斯大学 基于近红外数据的锦灯笼宿萼表皮检测方法
CN117150233A (zh) * 2023-10-30 2023-12-01 广东电网有限责任公司湛江供电局 一种电网异常数据治理方法、系统、设备及介质
CN117150233B (zh) * 2023-10-30 2024-02-13 广东电网有限责任公司湛江供电局 一种电网异常数据治理方法、系统、设备及介质
CN117196446A (zh) * 2023-11-06 2023-12-08 北京中海通科技有限公司 一种基于大数据的产品风险实时监测平台
CN117196446B (zh) * 2023-11-06 2024-01-19 北京中海通科技有限公司 一种基于大数据的产品风险实时监测平台
CN117455127B (zh) * 2023-12-26 2024-03-15 临沂市园林环卫保障服务中心 一种基于智慧园林的植物碳汇动态数据监测系统
CN117476136A (zh) * 2023-12-28 2024-01-30 山东松盛新材料有限公司 一种高纯羧酸酯合成工艺参数优化方法及系统
CN117476136B (zh) * 2023-12-28 2024-03-15 山东松盛新材料有限公司 一种高纯羧酸酯合成工艺参数优化方法及系统

Also Published As

Publication number Publication date
CN113742387A (zh) 2021-12-03

Similar Documents

Publication Publication Date Title
WO2021238455A1 (fr) Data processing method and device, and computer-readable storage medium
WO2019232853A1 (fr) Procédé d'apprentissage de modèle chinois, procédé de reconnaissance de modèle chinois, dispositif, appareil et support
CN112418117B (zh) 一种基于无人机图像的小目标检测方法
WO2019232843A1 (fr) Procédé et appareil d'apprentissage de modèle manuscrit, procédé et appareil de reconnaissance d'image manuscrite, et dispositif et support
JP6897749B2 (ja) 学習方法、学習システム、および学習プログラム
WO2019232852A1 (fr) Procédé et appareil pour obtenir un échantillon d'entraînement d'écriture manuscrite, et dispositif et support
CN112579823B (zh) 基于特征融合和增量滑动窗口的视频摘要生成方法及系统
CN110826618A (zh) 一种基于随机森林的个人信用风险评估方法
CN109934077B (zh) 一种图像识别方法和电子设备
CN110717554A (zh) 图像识别方法、电子设备及存储介质
CN113723157B (zh) 一种农作物病害识别方法、装置、电子设备及存储介质
CN112766218A (zh) 基于非对称联合教学网络的跨域行人重识别方法和装置
CN110751191A (zh) 一种图像的分类方法及系统
CN110766075A (zh) 轮胎区域图像比对方法、装置、计算机设备和存储介质
CN113526282A (zh) 一种电梯中长期老化故障诊断方法、装置、介质和设备
CN112185108A (zh) 基于时空特征的城市路网拥堵模式识别方法、设备及介质
CN113870254B (zh) 目标对象的检测方法、装置、电子设备及存储介质
CN113780145A (zh) 精子形态检测方法、装置、计算机设备和存储介质
CN117392484A (zh) 一种模型训练方法、装置、设备及存储介质
CN111428064A (zh) 小面积指纹图像快速索引方法、装置、设备及存储介质
CN109670417A (zh) 指纹识别方法及装置
CN110942089B (zh) 一种基于多级决策的击键识别方法
CN112926670A (zh) 一种基于迁移学习的垃圾分类系统及方法
CN111597934A (zh) 用于为统计应用处理训练数据的系统和方法
CN112738724A (zh) 一种区域目标人群的精准识别方法、装置、设备和介质

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 21811916; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
32PN Ep: public notification in the EP bulletin as address of the addressee cannot be established. Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 13.04.2023)
122 Ep: PCT application non-entry in European phase. Ref document number: 21811916; Country of ref document: EP; Kind code of ref document: A1