CN117235265A - Processing system and method for power data file

Info

Publication number: CN117235265A
Application number: CN202311234493.0A
Authority: CN
Inventors: 邹剑; 王珂; 李南; 王坤
Original assignee: Hubei Zhongheng Electric Measurement Technology Co ltd
Current assignee: HUBEI ELECTRIC POWER Co JINGZHOU POWER SUPPLY Co
Priority date: 2023-09-21
Filing date: 2023-09-21
Publication date: 2023-12-15
Anticipated expiration: 2043-09-21
Also published as: CN117235265B

Abstract

The invention provides a system and a method for processing electric power data files, and relates to the technical field of data processing. In the invention, each power data file to be stored is marked as a power data file to be processed; target power abnormality characterization data corresponding to the power data file to be processed is derived using a plurality of power data analysis networks; first classification processing is performed on the plurality of power data files to be stored, based on the corresponding target power abnormality characterization data, to form at least one first classification set; based on the similarity between the power data files to be stored, second classification processing is performed within each first classification set to form at least one second classification set corresponding to each first classification set; and each second classification set is stored by classification. In this way, the reliability of classified storage can be improved.

Description

A system and method for processing electric power data files

Technical field

The present invention relates to the field of data processing technology, and specifically to a system and method for processing electric power data files.

Background

As data processing technology becomes more accurate, its application scenarios continue to expand; for example, it can be used in the electric power field. Specifically, data processing technology can be applied to collected or generated power data files to perform data feature analysis and determine the correlations or differences between the data, so that the files can be classified and stored. In the prior art, however, the reliability of classified storage is poor.

Summary of the invention

In view of this, the object of the present invention is to provide a system and method for processing power data files, so as to improve the reliability of classified storage.

To achieve the above object, the embodiments of the present invention adopt the following technical solutions:

A method for processing power data files, including:

for each of a plurality of power data files to be stored, marking the power data file to be stored as a power data file to be processed;

using a plurality of power data analysis networks to derive target power abnormality characterization data corresponding to the power data file to be processed, the target power abnormality characterization data reflecting the abnormal state of the power system corresponding to the power data file to be processed;

performing, based on the corresponding target power abnormality characterization data, first classification processing on the plurality of power data files to be stored to form at least one first classification set, each first classification set including at least one power data file to be stored;

performing, based on the similarity between the power data files to be stored, second classification processing within each first classification set to form at least one second classification set corresponding to each first classification set, each second classification set including at least one power data file to be stored;

storing each resulting second classification set separately by classification.

In some preferred embodiments of the above method, the step of performing first classification processing on the plurality of power data files to be stored, based on the corresponding target power abnormality characterization data, to form at least one first classification set includes:

performing a consistency or similarity analysis on the target power abnormality characterization data corresponding to every two of the plurality of power data files to be stored;

assigning to the same first classification set the power data files to be stored whose corresponding target power abnormality characterization data are found to be consistent, or whose corresponding target power abnormality characterization data fall within the same parameter interval, so as to form at least one first classification set.
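The parameter-interval variant of the first classification can be illustrated with a minimal sketch. It assumes, purely for illustration, that the target power abnormality characterization data is a single numeric score per file and that the intervals are given by a sorted list of boundaries; the names `first_classification`, `scores` and `interval_edges` are hypothetical and do not come from the patent.

```python
import bisect
from collections import defaultdict

def first_classification(scores, interval_edges):
    """Group file ids by the parameter interval their anomaly score falls into.

    scores: dict mapping file id -> numeric anomaly score (assumed form).
    interval_edges: sorted boundaries; k edges define k + 1 intervals.
    """
    groups = defaultdict(list)
    for file_id, score in scores.items():
        # Index of the interval that contains this score.
        interval_index = bisect.bisect_right(interval_edges, score)
        groups[interval_index].append(file_id)
    return dict(groups)

# Hypothetical scores; edges 0.2 and 0.6 define three intervals.
scores = {"a.txt": 0.05, "b.txt": 0.40, "c.txt": 0.45, "d.txt": 0.90}
first_sets = first_classification(scores, [0.2, 0.6])
print(first_sets)
```

Grouping by interval index avoids the pairwise comparison of every two files; files whose scores land in the same interval end up in the same first classification set.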

In some preferred embodiments of the above method, the step of performing second classification processing within each first classification set, based on the similarity between the power data files to be stored, to form at least one second classification set corresponding to each first classification set includes:

for each first classification set, counting the power data files to be stored that the set includes to obtain a corresponding file count; when the file count is less than or equal to a predetermined first reference value, determining the first classification set to be a corresponding second classification set; and when the file count is greater than the first reference value, determining the first classification set to be a corresponding third classification set;

performing, based on the similarity between the power data files to be stored, second classification processing within each third classification set to form at least one second classification set corresponding to each third classification set.
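The file-count test described above can be sketched as follows. This is only an illustration: the function name `split_by_file_count` and the representation of a classification set as a list of file ids are assumptions.

```python
def split_by_file_count(first_sets, first_reference_value):
    """Separate first classification sets into final second sets and third sets.

    A set with at most `first_reference_value` files is kept as a second
    classification set; a larger set becomes a third classification set
    that still needs similarity-based subdivision.
    """
    second_sets, third_sets = [], []
    for members in first_sets:
        if len(members) <= first_reference_value:
            second_sets.append(members)
        else:
            third_sets.append(members)
    return second_sets, third_sets

second_sets, third_sets = split_by_file_count([["a"], ["b", "c", "d"]], 2)
print(second_sets, third_sets)
```

Small sets skip the more expensive similarity computation entirely, which appears to be the purpose of the first reference value.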

In some preferred embodiments of the above method, the step of performing second classification processing within each third classification set, based on the similarity between the power data files to be stored, to form at least one second classification set corresponding to each third classification set includes:

performing a keyword extraction operation on each power data file to be stored in the third classification set to form a keyword sequence corresponding to that file, each keyword in the keyword sequence belonging to a reference keyword set configured for the power system domain;

performing feature mining on each corresponding keyword sequence to form a keyword feature representation corresponding to the power data file to be stored;

calculating, based on the keyword feature representations, the similarity between the corresponding power data files to be stored, and, based on that similarity, performing second classification processing within each third classification set to form at least one second classification set corresponding to each third classification set.
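The keyword extraction step above can be sketched as a filter against a reference keyword set. The sketch assumes English text split on letters for simplicity (real power system logs, for example Chinese text, would need a proper tokenizer), and the contents of `REFERENCE_KEYWORDS` are invented for illustration.

```python
import re

# Hypothetical reference keyword set for the power system domain.
REFERENCE_KEYWORDS = {"transformer", "overload", "voltage", "breaker", "trip"}

def extract_keyword_sequence(text, reference=REFERENCE_KEYWORDS):
    """Return the in-order sequence of tokens that belong to the reference set."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token in reference]

sequence = extract_keyword_sequence("Transformer T1 overload; breaker trip at 10:02.")
print(sequence)
```

Keeping the original order matters because the later steps treat the result as a sequence and concatenate per-keyword features in that order.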

In some preferred embodiments of the above method, the step of performing feature mining on each corresponding keyword sequence to form a keyword feature representation corresponding to the power data file to be stored includes:

for each keyword in the keyword sequence, embedding the keyword to form a word embedding feature representation corresponding to the keyword; and determining, based on a target power data corpus, whether each keyword in the keyword sequence has a related keyword in the keyword sequence, a related keyword being one whose co-occurrence probability with the keyword in the target power data corpus is greater than a preset probability;

marking each keyword in the keyword sequence that has no related keyword as a first keyword, marking each keyword in the keyword sequence that has a related keyword as a second keyword, and marking the word embedding feature representation of each first keyword as the target word embedding feature representation of that first keyword;

for each second keyword, marking the word embedding feature representation of its related keyword as the related word embedding feature representation corresponding to the second keyword, and transposing the related word embedding feature representation to form a transposed word embedding feature representation corresponding to the second keyword;

for each second keyword, fusing its word embedding feature representation, the corresponding transposed word embedding feature representation and the related word embedding feature representation to form the target word embedding feature representation corresponding to that second keyword;

concatenating the target word embedding feature representations corresponding to the keywords in the keyword sequence to form the keyword feature representation corresponding to the power data file to be stored.
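The steps above do not spell out the fusion operation, so the following sketch picks one plausible form and should be read as an assumption rather than the claimed method: for a second keyword with row-vector embedding e and related embedding r, the target is e + (e · rᵀ) r, where e · rᵀ is the scalar obtained by multiplying e with the transposed related embedding. The embedding table, the co-occurrence table and all function names are hypothetical.

```python
def dot(u, v):
    """Product of row vector u with the transpose of row vector v (a scalar)."""
    return sum(a * b for a, b in zip(u, v))

def keyword_feature(sequence, embeddings, cooccurrence, preset_probability):
    """Concatenate one target embedding per keyword in the sequence.

    embeddings: dict keyword -> list of floats (assumed precomputed).
    cooccurrence: dict frozenset({kw1, kw2}) -> co-occurrence probability.
    """
    feature = []
    for keyword in sequence:
        e = embeddings[keyword]
        related = [
            other for other in sequence
            if other != keyword
            and cooccurrence.get(frozenset((keyword, other)), 0.0) > preset_probability
        ]
        target = list(e)  # first keyword: embedding is used unchanged
        for other in related:  # second keyword: fuse with each related keyword
            r = embeddings[other]
            weight = dot(e, r)  # e times transposed related embedding
            target = [t + weight * x for t, x in zip(target, r)]
        feature.extend(target)  # concatenation over the keyword sequence
    return feature

embeddings = {"breaker": [1.0, 0.0], "trip": [1.0, 1.0], "voltage": [0.0, 2.0]}
cooccurrence = {frozenset(("breaker", "trip")): 0.8}
feature = keyword_feature(["breaker", "trip", "voltage"], embeddings, cooccurrence, 0.5)
print(feature)
```

Here "breaker" and "trip" are second keywords because their co-occurrence probability exceeds 0.5, while "voltage" is a first keyword and contributes its plain embedding.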

In some preferred embodiments of the above method, the step of calculating, based on the keyword feature representations, the similarity between the corresponding power data files to be stored, and, based on that similarity, performing second classification processing within each third classification set to form at least one second classification set corresponding to each third classification set includes:

averaging the keyword feature representations corresponding to the power data files to be stored in the third classification set to output a corresponding mean keyword feature representation;

for each power data file to be stored in the third classification set, calculating the cosine similarity between the keyword feature representation corresponding to that file and the mean keyword feature representation to obtain the cosine similarity corresponding to that file;

performing, based on a plurality of consecutive similarity intervals configured for the cosine similarity, second classification processing on each power data file to be stored in the third classification set to form at least one second classification set corresponding to the third classification set, where the cosine similarities corresponding to the power data files to be stored in any one second classification set belong to the same similarity interval.
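The mean, cosine similarity and interval assignment described above can be sketched as follows. The sketch assumes that all keyword feature representations in a third classification set have the same length (for example after padding, which the text does not discuss) and that the similarity intervals are given as sorted boundaries; the names are hypothetical.

```python
import bisect
import math
from collections import defaultdict

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)

def second_classification(features, interval_edges):
    """Group files of one third classification set by cosine similarity to the mean."""
    ids = list(features)
    dim = len(features[ids[0]])
    mean = [sum(features[i][k] for i in ids) / len(ids) for k in range(dim)]
    groups = defaultdict(list)
    for file_id in ids:
        similarity = cosine(features[file_id], mean)
        groups[bisect.bisect_right(interval_edges, similarity)].append(file_id)
    return dict(groups)

features = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
second_sets = second_classification(features, [0.5])
print(second_sets)
```

Comparing each file with the set mean needs one similarity per file rather than one per pair of files, at the cost of grouping files by their distance from the centre rather than by their mutual similarity.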

In some preferred embodiments of the above method, the step of using a plurality of power data analysis networks to derive the target power abnormality characterization data corresponding to the power data file to be processed includes:

using the plurality of power data analysis networks to perform a feature mining operation on the power data file to be processed to output a corresponding plurality of initial data feature representations, where each of the plurality of power data analysis networks is used to output corresponding power abnormality characterization data based on the data loaded into it, and the power data file to be processed is operating text data of the power system;

fusing the plurality of initial data feature representations to form a corresponding aggregated data feature representation;

deriving, based on the aggregated data feature representation, the target power abnormality characterization data corresponding to the power data file to be processed, the target power abnormality characterization data reflecting the abnormal state of the power system corresponding to the power data file to be processed.

An embodiment of the present invention also provides a system for processing power data files, including:

a data file marking module, configured to mark, for each of a plurality of power data files to be stored, the power data file to be stored as a power data file to be processed;

a power abnormality analysis module, configured to use a plurality of power data analysis networks to derive target power abnormality characterization data corresponding to the power data file to be processed, the target power abnormality characterization data reflecting the abnormal state of the power system corresponding to the power data file to be processed;

a first classification processing module, configured to perform, based on the corresponding target power abnormality characterization data, first classification processing on the plurality of power data files to be stored to form at least one first classification set, each first classification set including at least one power data file to be stored;

a second classification processing module, configured to perform, based on the similarity between the power data files to be stored, second classification processing within each first classification set to form at least one second classification set corresponding to each first classification set, each second classification set including at least one power data file to be stored;

a classification storage module, configured to store each resulting second classification set separately by classification.

In some preferred embodiments of the above power data file processing system, the first classification processing module is specifically configured to:

In some preferred embodiments of the above power data file processing system, the second classification processing module is specifically configured to:

In the system and method for processing power data files provided by the embodiments of the present invention, each power data file to be stored is marked as a power data file to be processed; a plurality of power data analysis networks are used to derive the target power abnormality characterization data corresponding to the power data file to be processed; first classification processing is performed on the plurality of power data files to be stored, based on the corresponding target power abnormality characterization data, to form at least one first classification set; based on the similarity between the power data files to be stored, second classification processing is performed within each first classification set to form at least one second classification set corresponding to each first classification set; and each second classification set is stored separately by classification. Two levels of classification can thus be performed, one using the target power abnormality characterization data and one using the similarity between data files, which makes the classification more precise and thereby improves the reliability of classified storage.

To make the above objects, features and advantages of the present invention easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.

Brief description of the drawings

Figure 1 is a structural block diagram of the power data file processing platform provided by an embodiment of the present invention.

Figure 2 is a schematic flowchart of the steps included in the power data file processing method provided by an embodiment of the present invention.

Figure 3 is a schematic diagram of the modules included in the power data file processing system provided by an embodiment of the present invention.

Detailed description

To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. The components of the embodiments generally described and illustrated in the drawings herein may be arranged and designed in a variety of different configurations.

Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

As shown in Figure 1, an embodiment of the present invention provides a platform for processing power data files. The platform may include a memory and a processor.

In detail, the memory and the processor are electrically connected, directly or indirectly, to enable data transmission or interaction. For example, they may be electrically connected to each other through one or more communication buses or signal lines. The memory may store at least one software function module (computer program) in the form of software or firmware. The processor may be configured to execute the executable computer program stored in the memory, thereby implementing the power data file processing method provided by the embodiments of the present invention (as described later).

Optionally, in some implementations, the memory may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and the like.

Optionally, in some implementations, the processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a system on chip (SoC), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

Optionally, in some implementations, the power data file processing platform may be a server with data processing capabilities.

With reference to Figure 2, an embodiment of the present invention also provides a method for processing power data files, which can be applied to the above power data file processing platform. The method steps defined by the flow of the power data file processing method can be implemented by the power data file processing platform.

The specific flow shown in Figure 2 is described in detail below.

Step S100: for each of a plurality of power data files to be stored, mark the power data file to be stored as a power data file to be processed.

In an embodiment of the present invention, the power data file processing platform may, for each of the plurality of power data files to be stored, mark the power data file to be stored as a power data file to be processed (the files may be handled sequentially or in parallel).

Step S200: use a plurality of power data analysis networks to derive the target power abnormality characterization data corresponding to the power data file to be processed.

In an embodiment of the present invention, the power data file processing platform may use a plurality of power data analysis networks to derive the target power abnormality characterization data corresponding to the power data file to be processed. The target power abnormality characterization data reflects the abnormal state of the power system corresponding to the power data file to be processed.

Step S300: based on the corresponding target power abnormality characterization data, perform first classification processing on the plurality of power data files to be stored to form at least one first classification set.

In an embodiment of the present invention, the power data file processing platform may perform first classification processing on the plurality of power data files to be stored, based on the corresponding target power abnormality characterization data, to form at least one first classification set. Each first classification set includes at least one power data file to be stored.

Step S400: based on the similarity between the power data files to be stored, perform second classification processing within each first classification set to form at least one second classification set corresponding to each first classification set.

In an embodiment of the present invention, the power data file processing platform may perform second classification processing within each first classification set, based on the similarity between the power data files to be stored, to form at least one second classification set corresponding to each first classification set. Each second classification set includes at least one power data file to be stored.

Step S500: store each resulting second classification set separately by classification.

In an embodiment of the present invention, the power data file processing platform may store each resulting second classification set separately by classification. In this way, the power data files to be stored that belong to one second classification set are stored on the same storage device, which facilitates subsequent retrieval.

Based on the foregoing, that is, steps S100 to S500, two levels of classification can be performed, one using the target power abnormality characterization data and one using the similarity between data files, which makes the classification more precise and thereby improves the reliability of classified storage.

Optionally, in some implementations, the step of using a plurality of power data analysis networks to derive the target power abnormality characterization data corresponding to the power data file to be processed may further include the following steps S110, S120 and S130.

Step S110: use the plurality of power data analysis networks to perform a feature mining operation on the power data file to be processed to output a corresponding plurality of initial data feature representations.

In an embodiment of the present invention, the power data file processing platform may use the plurality of power data analysis networks to perform a feature mining operation on the power data file to be processed to output a corresponding plurality of initial data feature representations. Each of the plurality of power data analysis networks is used to output corresponding power abnormality characterization data based on the data loaded into it. The power data file to be processed is operating text data of the power system; that is, it describes the operating process of the power system.

Step S120: fuse the plurality of initial data feature representations to form a corresponding aggregated data feature representation.

In an embodiment of the present invention, the power data file processing platform may fuse the plurality of initial data feature representations to form a corresponding aggregated data feature representation.

Step S130: based on the aggregated data feature representation, derive the target power abnormality characterization data corresponding to the power data file to be processed.

In an embodiment of the present invention, the power data file processing platform may derive, based on the aggregated data feature representation, the target power abnormality characterization data corresponding to the power data file to be processed. The target power abnormality characterization data reflects the abnormal state of the power system corresponding to the power data file to be processed, such as whether an abnormality exists and how severe it is.

Based on the above, because the plurality of power data analysis networks are first used to perform feature mining, a plurality of initial data feature representations are obtained. These can then be fused into the aggregated data feature representation that is used to derive the target power abnormality characterization data. The power abnormality analysis therefore rests on a fuller basis, which can improve the reliability of the power data analysis.
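Steps S110 to S130 can be illustrated with a toy sketch. The two "networks" below are hand-written feature functions standing in for trained power data analysis networks, fusion is taken to be simple concatenation, and the final analysis is a logistic score; all of these choices, together with the names `net_a`, `net_b` and `analyse`, are assumptions made for illustration and are not specified by the text.

```python
import math

def net_a(text):
    """Stand-in for one analysis network: counts of two alarm-related terms."""
    return [float(text.count("alarm")), float(text.count("trip"))]

def net_b(text):
    """Stand-in for a second network with a different 'architecture': scaled length."""
    return [len(text.split()) / 10.0]

def analyse(text, networks, weights, bias):
    """S110: one initial representation per network; S120: fuse; S130: score."""
    initial = [net(text) for net in networks]               # S110
    aggregated = [x for feat in initial for x in feat]      # S120 (concatenation)
    logit = sum(w * x for w, x in zip(weights, aggregated)) + bias
    return aggregated, 1.0 / (1.0 + math.exp(-logit))       # S130 (abnormality score)

aggregated, score = analyse(
    "alarm raised breaker trip trip", [net_a, net_b], [1.0, 1.0, 0.0], -2.0
)
print(aggregated, round(score, 3))
```

The score plays the role of the target power abnormality characterization data in this sketch; a real system would replace the stand-in functions with trained networks and a learned scoring head.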

可以选择的是，在一些实施方式中，所述利用多个电力数据分析网络对待处理电力数据文件进行特征挖掘操作，以输出对应的多个初始数据特征表示的步骤，可以进一步包括以下的内容：Optionally, in some embodiments, the step of using multiple power data analysis networks to perform feature mining operations on the power data files to be processed to output corresponding multiple initial data feature representations may further include the following content:

Determine the power data file to be processed, and determine the power data file segments to be processed that it includes. For example, the power data file to be processed may be split to form the corresponding power data file segments to be processed. When multiple segments are formed, the segments may have a temporal order among them, that is, they respectively reflect the operating data of the power system at different times; alternatively, the segments may correspond to different equipment, that is, they respectively reflect the operating data of different power equipment;

Mark the power data file segments to be processed as the data to be loaded, and load them into each of the multiple power data analysis networks. For example, the network parameters of the multiple power data analysis networks may differ: different power data analysis networks may have filter matrices of different sizes, different network architectures, and different numbers of filter matrices;

Use each of the multiple power data analysis networks to mine one of the multiple initial data feature representations, where each initial data feature representation includes the initial feature representations corresponding to the power data file segments to be processed.
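
As an illustrative, non-limiting sketch of the steps above, the following Python code splits one file into segments and loads the same segments into several networks whose filter sizes differ. The numeric `readings`, the fixed `segment_length`, and the moving-average `make_network` stand-in are assumptions made only for illustration; the embodiment does not prescribe the data format or the internal structure of the power data analysis networks.

```python
from typing import Callable, List

Segment = List[float]
Feature = List[float]


def split_into_segments(readings: List[float], segment_length: int) -> List[Segment]:
    """Split one file's readings into consecutive, non-overlapping segments."""
    return [readings[i:i + segment_length]
            for i in range(0, len(readings), segment_length)]


def make_network(filter_size: int) -> Callable[[Segment], Feature]:
    """Hypothetical stand-in for one analysis network: a moving-average filter."""
    def network(segment: Segment) -> Feature:
        windows = [segment[i:i + filter_size]
                   for i in range(len(segment) - filter_size + 1)]
        return [sum(window) / filter_size for window in windows]
    return network


def mine_initial_representations(readings: List[float],
                                 segment_length: int,
                                 filter_sizes: List[int]) -> List[List[Feature]]:
    """Return one initial data feature representation per network.

    result[n][i] is the initial feature representation that network n
    produced for segment i.
    """
    segments = split_into_segments(readings, segment_length)
    networks = [make_network(size) for size in filter_sizes]
    return [[network(segment) for segment in segments] for network in networks]
```

Calling `mine_initial_representations(readings, 4, [2, 3])` yields two initial data feature representations, one per filter size, each holding one feature list per segment.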

Optionally, in some embodiments, the power data file to be processed includes multiple power data file segments to be processed. On this basis, the step of using each of the multiple power data analysis networks to mine one of the multiple initial data feature representations may further include the following:

Use the data mining sub-network included in each power data analysis network to perform a data mining operation on the multiple power data file segments to be processed. The data mining sub-network is used to mine multiple initial feature representations from the multiple power data file segments to be processed, and the data mining operation may refer to feature space mapping, filtering processing, and the like;

Determine the correlation description data of the multiple power data file segments to be processed. The correlation description data is used to reflect the distribution correlation of the multiple segments within the power data file to be processed, such as the order of their formation times;

Based on the correlation description data, perform an association mining operation on the multiple initial feature representations to output the corresponding initial data feature representation.

Optionally, in some embodiments, the step of using the data mining sub-network included in each power data analysis network to perform a data mining operation on the multiple power data file segments to be processed may further include the following:

Load the multiple power data file segments to be processed into the multiple data mining sub-networks included in each power data analysis network. The multiple data mining sub-networks are used to mine multiple groups of intermediate feature representations from the segments, with a one-to-one correspondence between the data mining sub-networks and the groups of intermediate feature representations. Each group includes multiple intermediate feature representations, which correspond one-to-one to the multiple power data file segments to be processed;

Merge the intermediate feature representations, across the multiple groups, that correspond to the same power data file segment to be processed, so as to form multiple initial feature representations. For example, the intermediate feature representations corresponding to the same segment may be concatenated.
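
The merging step can be sketched as follows, assuming each intermediate feature representation is a plain list of numbers and that merging means concatenation, as in the example above. The function name and the `groups[s][i]` layout are hypothetical.

```python
from typing import List

Feature = List[float]


def merge_intermediate_features(groups: List[List[Feature]]) -> List[Feature]:
    """Concatenate, per segment, the features produced by every sub-network.

    groups[s][i] is the intermediate feature representation that
    sub-network s produced for segment i (hypothetical layout).
    """
    segment_count = len(groups[0])
    if any(len(group) != segment_count for group in groups):
        raise ValueError("every sub-network must cover the same segments")

    merged: List[Feature] = []
    for i in range(segment_count):
        combined: Feature = []
        for group in groups:
            combined.extend(group[i])  # append this sub-network's view of segment i
        merged.append(combined)
    return merged
```

The result contains one initial feature representation per segment, whose length is the sum of the lengths of the corresponding intermediate feature representations.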

Optionally, in some embodiments, the step of performing an association mining operation on the multiple initial feature representations based on the correlation description data, so as to output the corresponding initial data feature representation, may further include the following:

Based on the correlation description data, load the initial feature representations in order into a data association mining unit;

Mine an associated data feature representation using the data association mining unit. For example, the data association mining unit may concatenate the multiple initial feature representations according to the correlation description data to form the corresponding associated data feature representation;

Use a focused feature analysis unit to perform a focused feature analysis operation on the associated data feature representation to output multiple to-be-processed data feature representations. The focused feature analysis unit is used to determine, based on the content characterization importance parameter of each power data file segment to be processed, the to-be-processed data feature representation corresponding to that segment. For example, a cross-modal focused feature analysis operation may be performed on each initial feature representation in the associated data feature representation based on its adjacent initial feature representations, so as to obtain the corresponding to-be-processed data feature representation. The focused feature weight parameters obtained by the focused feature analysis operation may serve as the content characterization importance parameters, so that weighting can be performed based on them to obtain the corresponding to-be-processed data feature representation;

Use the feature integration unit included in each power data analysis network to perform a feature integration operation on the multiple to-be-processed data feature representations to output the corresponding initial data feature representation. The processing of the feature integration unit may be the reverse of the feature mining process, for example inverse filtering (such as upsampling), so as to obtain the corresponding initial data feature representation.
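
One possible reading of the ordered loading and the focused feature analysis is sketched below. It assumes that the correlation description data is a list of formation times and that the focused feature analysis behaves like scaled dot-product attention with a softmax over all segments; neither assumption is stated in the embodiment, and the feature integration unit is not modelled here.

```python
import math
from typing import List, Tuple

Feature = List[float]


def order_by_formation_time(features: List[Feature], times: List[float]) -> List[Feature]:
    """Order the initial feature representations by formation time."""
    return [f for _, f in sorted(zip(times, features), key=lambda pair: pair[0])]


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def focus_features(ordered: List[Feature]) -> Tuple[List[Feature], List[List[float]]]:
    """Weight every segment feature by its relevance to the other segments.

    Returns the weighted features and the weight rows; each weight row plays
    the role of the content characterization importance parameters.
    """
    dim = len(ordered[0])
    outputs, weight_rows = [], []
    for query in ordered:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in ordered]
        weights = softmax(scores)
        weight_rows.append(weights)
        outputs.append([sum(w * f[d] for w, f in zip(weights, ordered))
                        for d in range(dim)])
    return outputs, weight_rows
```

Each row of weights sums to one, so every output is a weighted average of the ordered segment features.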

Optionally, in some embodiments, the power data file to be processed includes multiple power data file segments to be processed, and each of the multiple initial data feature representations includes multiple preliminary feature representations that correspond one-to-one to the multiple segments. On this basis, the step of performing a feature representation fusion operation on the multiple initial data feature representations to form the corresponding aggregated data feature representation may further include the following:

From the multiple initial data feature representations, select multiple preliminary feature representation clusters that correspond one-to-one to the multiple power data file segments to be processed. Each preliminary feature representation cluster includes the preliminary feature representations that one of the segments has in the multiple initial data feature representations;

Determine the mean preliminary feature representation of each of the multiple preliminary feature representation clusters (that is, average the preliminary feature representations in the cluster to obtain the mean preliminary feature representation), so as to output multiple mean preliminary feature representations that correspond one-to-one to the multiple power data file segments to be processed;

Mark the feature representation that includes the multiple mean preliminary feature representations as the corresponding aggregated data feature representation. In other words, the aggregated data feature representation may include the multiple mean preliminary feature representations.
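
The mean-based fusion can be sketched as follows, assuming every network outputs preliminary feature representations of the same length for a given segment (an assumption; otherwise an element-wise mean is not defined). The `initial_reps[n][i]` layout is hypothetical.

```python
from typing import List

Feature = List[float]


def fuse_by_mean(initial_reps: List[List[Feature]]) -> List[Feature]:
    """Average, per segment, the preliminary features from all networks.

    initial_reps[n][i] is the preliminary feature representation of
    segment i inside the initial data feature representation of network n.
    """
    segment_count = len(initial_reps[0])
    fused: List[Feature] = []
    for i in range(segment_count):
        # Preliminary feature representation cluster of segment i.
        cluster = [rep[i] for rep in initial_reps]
        # Element-wise mean over the cluster.
        fused.append([sum(values) / len(cluster) for values in zip(*cluster)])
    return fused
```

The returned list holds one mean preliminary feature representation per segment and can be treated as the aggregated data feature representation.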

Optionally, in some embodiments, the power data file to be processed includes multiple power data file segments to be processed, and each of the multiple initial data feature representations includes multiple preliminary feature representations that correspond one-to-one to the multiple segments. On this basis, the step of performing a feature representation fusion operation on the multiple initial data feature representations to form the corresponding aggregated data feature representation may further include the following:

Determine the most relevant preliminary feature representation of each of the multiple preliminary feature representation clusters (that is, cluster all the preliminary feature representations in the cluster to determine a cluster center, which serves as the most relevant preliminary feature representation), so as to output multiple most relevant preliminary feature representations that correspond one-to-one to the multiple power data file segments to be processed;

Mark the feature representation that includes the multiple most relevant preliminary feature representations as the corresponding aggregated data feature representation. In other words, the aggregated data feature representation may include the multiple most relevant preliminary feature representations. For example, the multiple most relevant preliminary feature representations may be concatenated to form the corresponding aggregated data feature representation.
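
A minimal sketch of this alternative fusion is given below. The embodiment does not name a clustering algorithm, so the sketch substitutes a medoid: the member of the cluster with the smallest total Euclidean distance to the other members is taken as the cluster center. This substitution and the function names are assumptions.

```python
import math
from typing import List

Feature = List[float]


def most_relevant(cluster: List[Feature]) -> Feature:
    """Return the medoid of the cluster as a stand-in for its cluster center."""
    def total_distance(candidate: Feature) -> float:
        return sum(math.dist(candidate, other) for other in cluster)
    return min(cluster, key=total_distance)


def fuse_by_most_relevant(clusters: List[List[Feature]]) -> Feature:
    """Concatenate the most relevant preliminary feature of every segment.

    clusters[i] is the preliminary feature representation cluster of segment i.
    """
    aggregated: Feature = []
    for cluster in clusters:
        aggregated.extend(most_relevant(cluster))
    return aggregated
```

Unlike the mean, the medoid is always one of the actual preliminary feature representations, so a single outlying network output has less influence on the result.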

Optionally, in some embodiments, the power data file to be processed includes N power data file segments to be processed, and the aggregated data feature representation includes N feature representations corresponding to the N segments. On this basis, the step of analyzing, based on the aggregated data feature representation, the target power abnormality characterization data corresponding to the power data file to be processed may further include the following:

Perform a fully connected operation on the N feature representations to obtain a fully connected feature representation;

Calculate the similarity between the fully connected feature representation and each of multiple center feature representations to output the corresponding multiple feature representation similarities;

Determine one feature representation similarity among the multiple feature representation similarities (for example, the largest one) and mark it as the target feature representation similarity;

Mark the reference power abnormality characterization data that corresponds to the center feature representation associated with the target feature representation similarity as the target power abnormality characterization data corresponding to the power data file to be processed. Each center feature representation is determined (for example, by clustering) from the feature representations of at least one typical power data file that has corresponding reference power abnormality characterization data.
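
The steps above amount to a nearest-center lookup, which can be sketched as follows. The sketch assumes that the fully connected operation is a single linear layer applied to the concatenated N feature representations, that the similarity is cosine similarity, and that the reference power abnormality characterization data is a simple label; the weights, bias, and labels are hypothetical.

```python
import math
from typing import List, Tuple

Feature = List[float]


def fully_connect(features: List[Feature],
                  weights: List[List[float]],
                  bias: List[float]) -> Feature:
    """Flatten the N feature representations and apply one linear layer."""
    flat = [value for feature in features for value in feature]
    return [sum(w * x for w, x in zip(row, flat)) + b
            for row, b in zip(weights, bias)]


def cosine(a: Feature, b: Feature) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def analyze_abnormality(fc_feature: Feature,
                        centers: List[Tuple[Feature, str]]) -> str:
    """Return the reference data of the most similar center feature representation.

    centers is a list of (center feature representation, reference data) pairs.
    """
    _, reference = max(centers, key=lambda pair: cosine(fc_feature, pair[0]))
    return reference
```

For example, with centers labelled `"normal"` and `"severe"`, the function returns the label whose center has the largest cosine similarity to the fully connected feature representation.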

Optionally, in some embodiments, before the step of using multiple power data analysis networks to perform feature mining operations on the power data file to be processed to output the corresponding multiple initial data feature representations, the method for processing power data files may further include the following:

Based on typical power data files (and the corresponding actual power abnormality characterization data), perform a network update operation on each of multiple to-be-updated power data analysis networks to form the corresponding multiple updated power data analysis networks;

Based on the typical power data files, perform a network update operation on each of multiple associated networks to form the corresponding multiple updated associated networks. Each associated network includes one updated power data analysis network and one feature representation restoration network. The feature representation restoration network is used to restore the feature representation corresponding to the typical power data file from the power abnormality characterization data output by the updated power data analysis network (in this way, a corresponding error parameter can be determined from the difference between the feature representation mined by the updated power data analysis network and the feature representation restored by the feature representation restoration network, and network update processing can then be performed based on that error parameter);

Determine the multiple power data analysis networks based on the multiple updated associated networks; for example, construct the power data analysis networks from the network parameters of the multiple updated associated networks.
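
The error parameter mentioned above can be illustrated with a short sketch. It assumes that the difference between the mined and the restored feature representation is measured as a mean squared error; the embodiment does not specify the measure, and the function name is hypothetical.

```python
from typing import List


def reconstruction_error(mined: List[float], restored: List[float]) -> float:
    """Mean squared difference between the mined and the restored features."""
    if len(mined) != len(restored):
        raise ValueError("feature representations must have the same length")
    return sum((m - r) ** 2 for m, r in zip(mined, restored)) / len(mined)
```

A smaller value indicates that the restoration network recovers the mined feature representation more closely; a training loop would adjust the network parameters to reduce this value.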

Optionally, in some embodiments, the step of performing the first classification process on the multiple power data files to be stored, based on the corresponding target power abnormality characterization data, to form at least one first classification set may further include the following:

Optionally, in some embodiments, the step of performing, based on the similarity between the power data files to be stored, the second classification process within each first classification set to form at least one second classification set corresponding to each first classification set may further include the following:

For each first classification set, count the power data files to be stored that the set includes to obtain a corresponding file count. If the file count is less than or equal to a predetermined first reference value (such as 5), determine the first classification set as a corresponding second classification set; if the file count is greater than the first reference value, determine the first classification set as a corresponding third classification set;
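
The count-based routing can be sketched as follows. The representation of a classification set as a list of file names is an assumption; the default reference value of 5 follows the example given above.

```python
from typing import List, Tuple

FileSet = List[str]


def route_first_sets(first_sets: List[FileSet],
                     first_reference_value: int = 5
                     ) -> Tuple[List[FileSet], List[FileSet]]:
    """Split first classification sets into second and third classification sets.

    Sets with at most first_reference_value files are kept as second
    classification sets; larger sets become third classification sets
    that still need the similarity-based second classification.
    """
    second_sets: List[FileSet] = []
    third_sets: List[FileSet] = []
    for file_set in first_sets:
        if len(file_set) <= first_reference_value:
            second_sets.append(file_set)
        else:
            third_sets.append(file_set)
    return second_sets, third_sets
```

Small sets are therefore stored as they are, and only the larger sets incur the cost of keyword extraction and similarity calculation.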

Optionally, in some embodiments, the step of performing, based on the similarity between the power data files to be stored, the second classification process within each third classification set to form at least one second classification set corresponding to each third classification set may further include the following:

Optionally, in some embodiments, the step of performing feature mining processing on each corresponding keyword sequence to form the keyword feature representation corresponding to the power data file to be stored may further include the following:

For each second keyword, perform a fusion operation on the corresponding word embedding feature representation, the corresponding transposed word embedding feature representation, and the related word embedding feature representation to form the target word embedding feature representation corresponding to that second keyword. For example, the transposed word embedding feature representation may be multiplied by the word embedding feature representation, the product divided by the number of dimensions of the word embedding feature representation, the result normalized, and the normalized result multiplied by the related word embedding feature representation, thereby achieving the fusion and obtaining the corresponding target word embedding feature representation;
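
The fusion described above resembles an attention-style weighting and can be sketched as follows. The sketch assumes that a second keyword may have several related keywords, that the related word embedding feature representation stacks their embeddings as rows, and that the normalization is a softmax; these points are assumptions. The scores are divided by the number of dimensions, as stated above, rather than by its square root.

```python
import math
from typing import List

Vector = List[float]


def fuse_with_related(word_vec: Vector, related_vecs: List[Vector]) -> Vector:
    """Fuse a second keyword's embedding with its related keyword embeddings.

    Computes normalize(word_vec . related^T / dim) . related, where the
    normalization is assumed to be a softmax.
    """
    dim = len(word_vec)
    # Product of the word embedding with each (transposed) related embedding.
    scores = [sum(a * b for a, b in zip(word_vec, related)) / dim
              for related in related_vecs]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted combination of the related embeddings.
    return [sum(w * related[d] for w, related in zip(weights, related_vecs))
            for d in range(dim)]
```

With a single related keyword the weight is 1 and the result equals that keyword's embedding; with several related keywords the result leans toward those most aligned with the second keyword.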

Optionally, in some embodiments, the step of calculating, based on the keyword feature representations, the similarity between the corresponding power data files to be stored, and performing, based on that similarity, the second classification process within each third classification set to form at least one second classification set corresponding to each third classification set may further include the following:

Based on multiple consecutive similarity intervals configured for the cosine similarity, perform the second classification process on each power data file to be stored in the third classification set to form at least one second classification set corresponding to the third classification set. Within each second classification set, the cosine similarities corresponding to the included power data files to be stored all belong to the same similarity interval. The similarity intervals may be configured in advance according to actual needs and are not specifically limited here.
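
The interval-based grouping can be sketched as follows, taking the cosine similarity of each file to be the similarity between its keyword feature representation and the mean keyword feature representation of the third classification set, as recited in the claims. The sketch assumes that all keyword feature representations have the same length and that the intervals are given by a sorted list of interior cut points; the names and the cut points are hypothetical.

```python
import bisect
import math
from typing import Dict, List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def second_classification(features: Dict[str, Vector],
                          cut_points: List[float]) -> List[List[str]]:
    """Group the files of one third classification set by similarity interval.

    features maps a file name to its keyword feature representation;
    cut_points are the sorted boundaries between consecutive intervals.
    """
    dim = len(next(iter(features.values())))
    mean = [sum(vec[d] for vec in features.values()) / len(features)
            for d in range(dim)]

    groups: Dict[int, List[str]] = {}
    for name, vec in features.items():
        similarity = cosine(vec, mean)
        interval = bisect.bisect_right(cut_points, similarity)
        groups.setdefault(interval, []).append(name)
    # Return the non-empty intervals from lowest to highest similarity.
    return [groups[index] for index in sorted(groups)]
```

Each returned group is one second classification set; files whose similarity to the mean falls in the same interval are stored together.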

With reference to FIG. 3, an embodiment of the present invention further provides a system for processing power data files, which can be applied to the power data file processing platform described above. The system for processing power data files may include the following software function modules:

Optionally, in some embodiments, the first classification processing module is specifically configured to:

Optionally, in some embodiments, the second classification processing module is specifically configured to:

In summary, in the system and method for processing power data files provided by the present invention, each power data file to be stored is marked as a power data file to be processed; multiple power data analysis networks are used to analyze the target power abnormality characterization data corresponding to the power data file to be processed; based on the corresponding target power abnormality characterization data, a first classification process is performed on the multiple power data files to be stored to form at least one first classification set; based on the similarity between the power data files to be stored, a second classification process is performed within each first classification set to form at least one second classification set corresponding to each first classification set; and each second classification set is stored separately by classification. On this basis, a two-level classification can be performed using both the corresponding target power abnormality characterization data and the similarity between data files, which makes the classification more precise and thereby improves the reliability of classified storage.

The above descriptions are only preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and changes. Any modifications, equivalent substitutions, improvements, etc. made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims

1. A method for processing power data files, characterized by comprising:

For each of a plurality of power data files to be stored, marking the power data file to be stored as a power data file to be processed;

Using multiple power data analysis networks to analyze the target power abnormality characterization data corresponding to the power data file to be processed, where the target power abnormality characterization data is used to reflect the abnormal state of the power system corresponding to the power data file to be processed;

Based on the corresponding target power abnormality characterization data, performing a first classification process on the plurality of power data files to be stored to form at least one first classification set, each first classification set including at least one power data file to be stored;

Based on the similarity between the power data files to be stored, performing a second classification process within each first classification set to form at least one second classification set corresponding to each first classification set, each second classification set including at least one power data file to be stored;

Storing each obtained second classification set separately by classification.

2. The method for processing power data files according to claim 1, characterized in that the step of performing a first classification process on the plurality of power data files to be stored, based on the corresponding target power abnormality characterization data, to form at least one first classification set includes:

Performing a consistency or similarity analysis on the target power abnormality characterization data corresponding to every two power data files to be stored among the plurality of power data files to be stored;

Assigning to the same first classification set the power data files to be stored whose corresponding target power abnormality characterization data are found to be consistent or to belong to the same parameter interval, so as to form at least one first classification set.

3. The method for processing power data files according to claim 1, characterized in that the step of performing, based on the similarity between the power data files to be stored, a second classification process within each first classification set to form at least one second classification set corresponding to each first classification set includes:

For each first classification set, counting the power data files to be stored that the first classification set includes to obtain a corresponding file count; if the file count is less than or equal to a predetermined first reference value, determining the first classification set as a corresponding second classification set; and if the file count is greater than the first reference value, determining the first classification set as a corresponding third classification set;

Based on the similarity between the power data files to be stored, performing the second classification process within each third classification set to form at least one second classification set corresponding to each third classification set.

4. The method for processing power data files according to claim 3, characterized in that the step of performing, based on the similarity between the power data files to be stored, the second classification process within each third classification set to form at least one second classification set corresponding to each third classification set includes:

Performing a keyword extraction operation on each power data file to be stored in the third classification set to form a keyword sequence corresponding to each power data file to be stored, where each keyword in the keyword sequence belongs to a reference keyword set configured for the power system domain;

Performing feature mining processing on each corresponding keyword sequence to form a keyword feature representation corresponding to the power data file to be stored;

Based on the keyword feature representations, calculating the similarity between the corresponding power data files to be stored, and, based on the similarity between the power data files to be stored, performing the second classification process within each third classification set to form at least one second classification set corresponding to each third classification set.

5. The method for processing power data files according to claim 4, characterized in that the step of performing feature mining processing on each corresponding keyword sequence to form a keyword feature representation corresponding to the power data file to be stored includes:

For each keyword in the keyword sequence, performing embedding processing on the keyword to form a word embedding feature representation corresponding to the keyword; and, based on a target power data corpus, determining for each keyword in the keyword sequence whether a related keyword exists in the keyword sequence, where the co-occurrence probability of the related keyword and the corresponding keyword in the target power data corpus is greater than a preset probability;

Marking each keyword in the keyword sequence that has no related keyword as a first keyword, marking each keyword in the keyword sequence that has a related keyword as a second keyword, and marking the word embedding feature representation of each first keyword as the target word embedding feature representation of that first keyword;

For each second keyword, marking the word embedding feature representation of the related keyword corresponding to the second keyword as the related word embedding feature representation corresponding to the second keyword, and performing a transposition operation on the related word embedding feature representation to form the transposed word embedding feature representation corresponding to the second keyword;

For each second keyword, performing a fusion operation on the corresponding word embedding feature representation, the corresponding transposed word embedding feature representation, and the related word embedding feature representation to form the target word embedding feature representation corresponding to that second keyword;

Concatenating the target word embedding feature representations corresponding to the keywords in the keyword sequence to form the keyword feature representation corresponding to the power data file to be stored.

6. The method for processing power data files according to claim 4, wherein the similarity between the corresponding power data files to be stored is calculated based on the keyword feature representation, and the similarity between the corresponding power data files to be stored is calculated based on the keyword feature representation. The steps of storing similarities between power data files, performing second classification processing within each of the third classification sets, and forming at least one second classification set corresponding to each of the third classification sets include:

Perform mean calculation on the keyword feature representation corresponding to each power data file to be stored in the third classification set to output the corresponding mean keyword feature representation;

For each power data file to be stored in the third classification set, calculate the cosine similarity between the keyword feature representation corresponding to the power data file to be stored and the mean keyword feature representation to obtain the cosine similarity corresponding to the power data file to be stored;

Based on a plurality of consecutive similarity intervals configured for the cosine similarity, perform second classification processing on each power data file to be stored in the third classification set to form at least one second classification set corresponding to the third classification set, wherein, in each second classification set, the cosine similarities corresponding to the included power data files to be stored belong to the same similarity interval.
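The three steps of this claim (mean calculation, cosine similarity to the mean, grouping by similarity interval) can be sketched as below. The interval edges and the function name `second_classification` are assumptions chosen for illustration; the claim only requires that the intervals be consecutive. The sketch also assumes that no feature vector, nor their mean, is the zero vector.

```python
import numpy as np

def second_classification(features, interval_edges=(-1.0, 0.0, 0.5, 0.8, 1.0)):
    """Group files of one third classification set by cosine similarity
    to the mean keyword feature representation.

    features: dict mapping a file name to its keyword feature vector.
    interval_edges: hypothetical consecutive interval boundaries on [-1, 1].
    """
    names = list(features)
    matrix = np.stack([features[n] for n in names])

    # Step 1: mean keyword feature representation of the set.
    mean_vec = matrix.mean(axis=0)

    # Step 2: cosine similarity of each file's representation to the mean.
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(mean_vec)
    cos = (matrix @ mean_vec) / norms

    # Step 3: assign each similarity to an interval; files that share an
    # interval form one second classification set.
    bins = np.digitize(cos, interval_edges[1:-1])
    groups = {}
    for name, b in zip(names, bins):
        groups.setdefault(int(b), []).append(name)
    return groups

example = {
    "file_a": np.array([1.0, 0.0]),
    "file_b": np.array([1.0, 0.0]),
    "file_c": np.array([0.0, 1.0]),
}
groups = second_classification(example)
```

With these toy vectors, `file_a` and `file_b` have a similarity of about 0.89 to the mean and share the top interval, while `file_c` (about 0.45) falls into a lower interval and forms its own set.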

7. The method for processing power data files according to any one of claims 1 to 6, characterized in that the step of using multiple power data analysis networks to analyze the target power abnormality characterization data corresponding to the power data file to be processed includes:

Utilizing multiple power data analysis networks to perform feature mining operations on the power data file to be processed to output corresponding multiple initial data feature representations, wherein each of the multiple power data analysis networks is used to perform feature mining based on the loaded data and to output corresponding power abnormality characterization data, and the power data file to be processed is operating text data of the power system;

Perform a feature representation fusion operation on the plurality of initial data feature representations to form a corresponding aggregated data feature representation;

Based on the aggregated data feature representation, the target power abnormality characterization data corresponding to the power data file to be processed is analyzed, and the target power abnormality characterization data is used to reflect the abnormal state of the power system corresponding to the power data file to be processed.
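A minimal sketch of this three-stage flow (per-network feature mining, fusion, analysis) is given below. Everything concrete in it is an assumption: the file is taken to be already encoded as a numeric vector, each analysis network is stood in for by a plain callable, fusion is an element-wise mean, and the final analysis is a linear scoring followed by an arg-max. The claim itself does not fix any of these choices.

```python
import numpy as np

def analyze_abnormality(file_vector, networks, classifier_weights, labels):
    """Return a label standing in for the target power abnormality
    characterization data of one file (hypothetical helper)."""
    # Step 1: every analysis network mines its own initial feature
    # representation from the same input.
    initial = [net(file_vector) for net in networks]

    # Step 2: fuse the initial representations into one aggregated
    # representation (assumed fusion: element-wise mean).
    aggregated = np.mean(initial, axis=0)

    # Step 3: analyze the aggregated representation (assumed: linear
    # scores per abnormal state, highest score wins).
    scores = classifier_weights @ aggregated
    return labels[int(np.argmax(scores))]

# Stand-in "networks"; real ones would be trained models.
networks = [lambda x: x * 2.0, lambda x: x + 1.0]
weights = np.array([[1.0, 0.0], [0.0, 1.0]])
state = analyze_abnormality(
    np.array([1.0, 0.0]), networks, weights, ["normal", "overload"]
)
```

Averaging keeps the aggregated representation the same size as each initial one; concatenation would be an equally plausible fusion choice and would only change the shape expected by the final analysis step.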

8. A power data file processing system, characterized by including:

A data file marking module, configured to mark, for each of a plurality of power data files to be stored, the power data file to be stored as a power data file to be processed;

A power abnormality analysis module, configured to use multiple power data analysis networks to analyze the target power abnormality characterization data corresponding to the power data file to be processed, the target power abnormality characterization data being used to reflect the abnormal state of the power system corresponding to the power data file to be processed;

A first classification processing module, configured to perform first classification processing on the plurality of power data files to be stored based on the corresponding target power abnormality characterization data to form at least one first classification set, each of the first classification sets including at least one power data file to be stored;

A second classification processing module, configured to perform second classification processing within each of the first classification sets based on the similarity between the power data files to be stored, to form at least one second classification set corresponding to each of the first classification sets, each second classification set including at least one power data file to be stored;

A classification storage module, configured to classify and store each obtained second classification set respectively.
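How the five modules of this claim could chain together is sketched below as a single function. The names `process_files`, `analyze` and `similarity_bucket` are hypothetical placeholders: `analyze` stands in for the power abnormality analysis module, `similarity_bucket` for whatever similarity measure drives the second classification, and a plain dictionary stands in for classified storage.

```python
from collections import defaultdict

def process_files(files, analyze, similarity_bucket):
    """Two-level classification followed by classified storage.

    analyze(f): returns the abnormality characterization of file f.
    similarity_bucket(f, members): returns the similarity group of f
    within its first classification set.
    """
    # Marking + abnormality analysis + first classification:
    # files with the same characterization share a first classification set.
    first_sets = defaultdict(list)
    for f in files:
        first_sets[analyze(f)].append(f)

    # Second classification inside each first classification set,
    # then storage keyed by (first-level label, second-level bucket).
    storage = {}
    for label, members in first_sets.items():
        second_sets = defaultdict(list)
        for f in members:
            second_sets[similarity_bucket(f, members)].append(f)
        for bucket, group in second_sets.items():
            storage[(label, bucket)] = group
    return storage

# Toy stand-ins: first character = abnormal state, second = similarity group.
stored = process_files(
    ["a1", "a2", "b1"],
    analyze=lambda f: f[0],
    similarity_bucket=lambda f, members: f[1],
)
```

Keying storage by the pair of labels keeps every second classification set tied to the first classification set it came from.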

9. The power data file processing system according to claim 8, wherein the first classification processing module is specifically used for:

10. The power data file processing system according to claim 8, characterized in that the second classification processing module is specifically used to: