CN117909295A - Data processing method and device - Google Patents

Info

Publication number
CN117909295A
Authority
CN
China
Prior art keywords
data
target
index file
processing method
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410124911.9A
Other languages
Chinese (zh)
Inventor
吴宇盛
闫锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GD Midea Heating and Ventilating Equipment Co Ltd
Shanghai Meikong Smartt Building Co Ltd
Original Assignee
GD Midea Heating and Ventilating Equipment Co Ltd
Shanghai Meikong Smartt Building Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by GD Midea Heating and Ventilating Equipment Co Ltd, Shanghai Meikong Smartt Building Co Ltd filed Critical GD Midea Heating and Ventilating Equipment Co Ltd
Priority to CN202410124911.9A priority Critical patent/CN117909295A/en
Publication of CN117909295A publication Critical patent/CN117909295A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data processing method and device, and belongs to the field of buildings. The data processing method comprises the following steps: classifying the acquired initial data to obtain multiple types of first data; based on the message category corresponding to the target first data in the multiple types of first data, carrying out standardization processing on the target first data to obtain second data in a target format; and establishing an inverted index based on the second data to obtain and store an index file and the second data. According to the data processing method, the initial data is standardized into the second data with the uniform data format, then the inverted index is built for the second data, and the index file and the data are stored at the same time, so that the method can support the establishment of indexes for all fields and the rapid full-text retrieval, has strong data integration capability and rapid data query rate, and is suitable for application scenes such as intelligent buildings.

Description

Data processing method and device
Technical Field
The application belongs to the field of buildings, and particularly relates to a data processing method and device.
Background
Intelligent buildings contain a large number of intelligent devices, such as air conditioning hosts, modular controllers, and fire sensors. Monitoring 10,000 data points at minute-level granularity can produce more than about 14 million data records a day and about 2.5 billion data records in half a year. In the related art, building integration software based on an industrial personal computer mainly stores and integrates data in a traditional manner (such as a relational database or a NoSQL database). However, in the daily operation and maintenance of an intelligent building, abnormal values such as voltage and current jitter and abnormal equipment vibration need to be searched for periodically, a large amount of operation data needs to be integrated, and a large number of devices need to be monitored at second-level granularity. Searching or analyzing with the traditional approach is slow and can cover only a small amount of data, so it cannot meet the data processing requirements of intelligent buildings.
Disclosure of Invention
The present application aims to solve at least one of the technical problems existing in the prior art. Therefore, the application provides the data processing method and the data processing device, which can support the establishment of indexes for all fields and the rapid full-text retrieval, have strong data integration capability and faster data query rate, and are suitable for application scenes such as intelligent buildings.
In a first aspect, the present application provides a data processing method, the method comprising:
classifying the acquired initial data to obtain multiple types of first data;
based on the message category corresponding to the target first data in the multiple types of first data, carrying out standardization processing on the target first data to obtain second data in a target format;
and establishing an inverted index based on the second data to obtain and store an index file and the second data.
According to the data processing method, the initial data is standardized into the second data with the uniform data format, then the inverted index is built for the second data, and the index file and the data are stored at the same time, so that the method can support the establishment of indexes for all fields and the rapid full-text retrieval, has strong data integration capability and rapid data query rate, and is suitable for application scenes such as intelligent buildings.
According to an embodiment of the present application, establishing an inverted index based on the second data to obtain and store an index file includes:
splitting the second data to obtain a plurality of fields;
establishing an index file for each field based on the field value of each field and the document identifier of each field value; the index file comprises a field value corresponding to the index file and an array for storing the document identification;
and storing the index file.
According to one embodiment of the present application, storing the index file includes:
splitting the index file when the size of the index file exceeds a target range;
and storing the field value in memory and the array on disk.
According to an embodiment of the present application, the target format is json format, and standardizing the target first data based on the message category corresponding to the target first data in the multiple types of first data to obtain second data in the target format includes:
parsing the first data into second data in json format based on a parsing function corresponding to the message category, where the fields of the second data comprise a base field and a parameter field;
and storing the second data in a target subscription topic of the message queue.
According to one embodiment of the application, the base field includes: at least one of a message type before standardized processing, identification information of a device corresponding to the second data, a type of the device corresponding to the second data and an IP address of the device corresponding to the second data.
According to one embodiment of the application, the parameter field includes: at least one of a data point location identifier, a data point location name, a numerical value of the data point location, a unit of the data point location and a signal type of initial data corresponding to the second data.
According to one embodiment of the application, the initial data includes: at least one of heating and ventilation equipment data, fire-fighting equipment data, lighting equipment data, digital gas meter equipment data and elevator equipment data corresponding to the building system.
According to one embodiment of the present application, after establishing an inverted index based on the second data to obtain and store an index file and the second data, the method further includes:
querying at least one stored index file based on a target field value to obtain a first target index file;
and retrieving a target data record corresponding to the target field value based on a first target array in the first target index file.
According to one embodiment of the present application, after establishing an inverted index based on the second data to obtain and store an index file and the second data, the method further includes:
compressing the index file using a target compression algorithm based on the degree of dispersion of the initial data.
In a second aspect, the present application provides a data processing apparatus, the apparatus comprising:
a first processing module, used for classifying the acquired initial data to obtain multiple types of first data;
a second processing module, used for carrying out standardized processing on the target first data based on the message category corresponding to the target first data in the multiple types of first data to obtain second data in a target format;
and a third processing module, used for establishing an inverted index based on the second data to obtain and store an index file and the second data.
According to the data processing device, the initial data is standardized into the second data with the uniform data format, then the inverted index is built for the second data, and the index file and the data are stored at the same time, so that the data processing device can support the establishment of indexes for all fields and support the rapid full-text retrieval, has strong data integration capability and faster data query rate, and is suitable for application scenes such as intelligent buildings.
In a third aspect, the present application provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the data processing method according to the first aspect when executing the computer program.
In a fourth aspect, the present application provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a data processing method as described in the first aspect above.
In a fifth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, implements a data processing method as described in the first aspect above.
The above technical solutions in the embodiments of the present application have at least one of the following technical effects:
By standardizing the initial data into the second data with the uniform data format, then establishing an inverted index for the second data and simultaneously storing the index file and the data, the method can support the establishment of indexes for all fields and the rapid full-text retrieval, has strong data integration capability and rapid data query rate, and is suitable for application scenes such as intelligent buildings.
Further, the size of the index file is monitored, and when the index file is large, the index file is split, so that the field value is stored in the memory, the array is stored in the disk, and the storage pressure is effectively reduced.
Furthermore, the data is integrated and stored through the inverted index, the index table is firstly queried according to the key words during retrieval, and then the specific data record is retrieved according to the array in the index table, so that the quick full-text retrieval can be supported without full-table scanning, and the quick retrieval rate is realized.
Still further, by compressing the index table using the FOR compression algorithm or the RBM compression algorithm, space occupation can be effectively reduced, thereby accommodating more data.
Additional aspects and advantages of the application will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the application will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 2 is a second flowchart of a data processing method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions of the embodiments of the present application will be clearly described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which are obtained by a person skilled in the art based on the embodiments of the present application, fall within the scope of protection of the present application.
The terms first, second and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged, as appropriate, such that embodiments of the present application may be implemented in sequences other than those illustrated or described herein, and that the objects identified by "first," "second," etc. are generally of a type, and are not limited to the number of objects, such as the first object may be one or more. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/", generally means that the associated object is an "or" relationship.
The data processing method, the data processing device, the electronic device and the readable storage medium provided by the embodiments of the application are described in detail below through specific embodiments and their application scenarios, with reference to the accompanying drawings.
The data processing method can be applied to the terminal, and can be specifically executed by hardware or software in the terminal.
The terminal includes, but is not limited to, a portable communication device such as a mobile phone or tablet computer. It should also be appreciated that in some embodiments, the terminal may not be a portable communication device, but rather a desktop computer.
In the following various embodiments, a terminal including a display and a touch sensitive surface is described. However, it should be understood that the terminal may include one or more other physical user interface devices such as a physical keyboard, mouse, and joystick.
The data processing method provided by the embodiment of the application may be executed by an electronic device, or by a functional module or functional entity in the electronic device that is capable of implementing the data processing method. The electronic device provided by the embodiment of the application includes, but is not limited to, a mobile phone, a tablet computer, a camera, a wearable device and the like.
As shown in fig. 1, the data processing method includes: step 110, step 120 and step 130.
It should be noted that the data processing method can be applied to intelligent building scenes, and of course, can also be applied to other fields.
Step 110, classifying the acquired initial data to obtain multiple types of first data;
In this step, the initial data may be acquired by a sensor or may be acquired by manual uploading, which is not limited herein.
The initial data may correspond to different data formats.
The initial data is massive data, and the data format of the massive initial data is not uniform.
In some embodiments, the initial data may include data related to devices involved in the operation of the intelligent building.
For example, monitoring 10,000 data points at minute-level granularity during operation of an intelligent building can produce more than about 14 million data records a day and about 2.5 billion data records in half a year.
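The order of magnitude of these figures can be checked with a short calculation. The sketch below assumes one record per data point per minute and a half year of roughly 180 days; both are illustrative assumptions, not values stated in the application.

```python
# Assumption: every data point yields exactly one record per minute.
DATA_POINTS = 10_000
MINUTES_PER_DAY = 24 * 60
DAYS_PER_HALF_YEAR = 180  # rough assumption for "half a year"

records_per_day = DATA_POINTS * MINUTES_PER_DAY
records_per_half_year = records_per_day * DAYS_PER_HALF_YEAR

print(f"{records_per_day:,} records per day")
print(f"{records_per_half_year:,} records per half year")
```

Under these assumptions the daily volume is on the order of ten million records and the half-year volume is on the order of billions, which is consistent with the scale described above.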
As shown in fig. 2, in some embodiments, the initial data may include: at least one of heating and ventilation equipment data, fire-fighting equipment data, lighting equipment data, digital gas meter equipment data and elevator equipment data corresponding to the building system.
Of course, in other embodiments, the initial data may also include modular controller data, user-related data, and other types of initial data, and the application is not limited in this regard.
In some embodiments, step 110 may further comprise:
classifying the initial data based on the application scenario to obtain multiple types of third data;
and classifying each type of third data into service data and alarm data to obtain multiple types of first data.
In this embodiment, the application scenario is a scenario corresponding to a specific application system corresponding to the initial data.
Taking building data as an example, the obtained initial data can be transversely divided into third data corresponding to different application scenes such as fire-fighting equipment data, electric equipment data, water equipment data, air-conditioning equipment data, elevator equipment data and the like.
Each type of third data may then be further divided longitudinally.
Taking fire-fighting equipment data as an example, the fire-fighting equipment data can be further divided into service data, alarm data and the like, so that first data are obtained.
Of course, in other embodiments, other classification manners may be used to classify the initial data, and the method may be specifically selected based on practical situations, which is not limited by the present application.
In actual execution, this step can be performed by a data collection service module in the industrial personal computer: the data collection service distributes the received data packets or messages to different subscription topics of the message queue according to their types and records them in a log.
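The two-level classification and the distribution into subscription topics can be sketched as follows. This is a minimal illustration only: the message keys `scenario` and `is_alarm`, the topic naming scheme, and the `InMemoryQueue` class are hypothetical stand-ins, and a real deployment would use an actual message queue.

```python
from collections import defaultdict

# Hypothetical scenario labels; the application lists equipment categories
# but does not prescribe identifiers for them.
SCENARIOS = {"fire", "hvac", "lighting", "gas_meter", "elevator"}


def classify(message: dict) -> str:
    """Return a topic name of the form '<scenario>.<kind>'."""
    scenario = message.get("scenario")
    if scenario not in SCENARIOS:
        scenario = "other"
    # Second level: split each scenario into service data and alarm data.
    kind = "alarm" if message.get("is_alarm") else "service"
    return f"{scenario}.{kind}"


class InMemoryQueue:
    """Stand-in for a message queue: one list of messages per topic."""

    def __init__(self):
        self.topics = defaultdict(list)
        self.log = []

    def publish(self, topic: str, message: dict) -> None:
        self.topics[topic].append(message)
        self.log.append((topic, message))  # record every distribution


queue = InMemoryQueue()
for msg in [
    {"scenario": "fire", "is_alarm": True, "payload": "smoke detected"},
    {"scenario": "hvac", "is_alarm": False, "payload": "RA_T=26.5"},
]:
    queue.publish(classify(msg), msg)
```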
Step 120, based on the message category corresponding to the target first data in the multiple types of first data, performing standardization processing on the target first data to obtain second data in a target format;
In this step, the target first data may be any data of a plurality of types of first data.
The normalization process is used to convert first data of different data formats into the same format.
In the actual implementation process, the target first data may be normalized into the target format based on the message type corresponding to the target first data.
In some embodiments, the target format may be json format.
The implementation of step 120 is specifically described below using the json format as an example of the target format.
With continued reference to FIG. 2, in some embodiments, the target format may be json format, and step 120 may include:
parsing the first data into second data in json format based on the parsing function corresponding to the message category;
and storing the second data in a target subscription topic of the message queue.
In this embodiment, the fields of the second data may include a parameter field and a base field.
Wherein the base field is used to characterize the basic properties of the data.
The parameter fields may be represented as arrays, denoted herein by param.
Each data record may include base fields and a parameter field.
The target subscription topic may be user-defined, for example set to a "pre-handle" subscription topic.
For example, after the received data packets or messages are distributed to different subscription topics of the message queue according to their types and recorded in the log, a data standardization service module in the industrial personal computer can obtain the data messages from the message queue and call the parsing function corresponding to the message type to parse each message into standard json format data.
The processed json format data is then put into the pre-handle subscription topic of the message queue.
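Selecting a parsing function by message type can be sketched with a dispatch table. The raw message layouts, the parser names, and the `device_id` key below are hypothetical; real Modbus or BACnet payloads are binary protocol frames and would need proper protocol libraries.

```python
import json


def parse_modbus(raw: dict) -> dict:
    # Hypothetical raw layout: {"unit": ..., "registers": {name: value}}
    return {
        "data_type": "Modbus",
        "device_id": str(raw["unit"]),
        "param": [
            {"data_point_flag": name, "value": str(value)}
            for name, value in raw["registers"].items()
        ],
    }


def parse_bacnet(raw: dict) -> dict:
    # Hypothetical raw layout: {"device": ..., "object": ..., "present_value": ...}
    return {
        "data_type": "Bacnet",
        "device_id": str(raw["device"]),
        "param": [
            {"data_point_flag": raw["object"], "value": str(raw["present_value"])}
        ],
    }


# One parsing function per message category.
PARSERS = {"Modbus": parse_modbus, "Bacnet": parse_bacnet}


def standardize(message_type: str, raw: dict) -> str:
    """Parse a raw message into a json string in the uniform format."""
    parser = PARSERS[message_type]
    return json.dumps(parser(raw), ensure_ascii=False)


# A plain list stands in for the "pre-handle" subscription topic.
pre_handle_topic = []
pre_handle_topic.append(
    standardize("Modbus", {"unit": 7, "registers": {"RA_T": 26.5}})
)
```

Adding support for a new message type in this sketch only requires registering one more function in `PARSERS`.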
With continued reference to fig. 2, in some embodiments, the base fields may include: at least one of a message type before standardized processing, identification information of a device corresponding to the second data, an IP address of the device corresponding to the second data, and a type of the device corresponding to the second data.
In this embodiment, the identification information of the device may include an ID, a code, an identification, and the like of the device.
The type of device may include a device type code number, a chinese name of the device type, and the like.
For example, the base field name and its definition may be expressed as follows:
(1) ID: the unique identifier of the standardized data record, built from a 64-bit UUID and a hash code of the whole data record;
(2) data_type: the message types before standardized processing, such as Bacnet, OPC, modbus, etc.;
(3) Ch_flag: identification information of the device;
(4) type: a device type code;
(5) type_name: chinese name of the device type;
(6) device_id: a device ID;
(7) device_ip: the IP address of the device.
In some embodiments, the parameter fields may include: at least one of a data point location identifier, a data point location name, a value of the data point location, a signal type of initial data corresponding to the second data, and a unit of the data point location.
In this embodiment, the parameter field is an array comprising actual monitored values of 0 or more data points.
Each element in the array is a json object.
The parameter field may include the following fields:
(1) data_point_flag: a data point location identifier;
(2) data_point_name: a data point location name;
(3) value: the numerical value of the data point location is represented by a character string;
(4) unit: the unit of the data point; for data points without a unit, the value of this field may be denoted "-";
(5) object_type: the original signal type.
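A standardized record combining the base fields and the parameter field might look like the sketch below. All values are invented for illustration, `device_id` is an assumed spelling of the device ID field, and `is_well_formed` is a hypothetical helper rather than part of the described method.

```python
import json

BASE_FIELDS = ["ID", "data_type", "Ch_flag", "type", "type_name",
               "device_id", "device_ip"]
PARAM_FIELDS = ["data_point_flag", "data_point_name", "value", "unit",
                "object_type"]

# Illustrative record; every value here is made up.
record = {
    "ID": "example-id-0001",
    "data_type": "Modbus",
    "Ch_flag": "AHU-01",
    "type": "AHU",
    "type_name": "air handling unit",
    "device_id": "7",
    "device_ip": "192.0.2.10",
    "param": [
        {"data_point_flag": "RA_T", "data_point_name": "return air temperature",
         "value": "26.5", "unit": "℃", "object_type": "analog-input"},
        {"data_point_flag": "RUN", "data_point_name": "run status",
         "value": "1", "unit": "-", "object_type": "binary-input"},
    ],
}


def is_well_formed(rec: dict) -> bool:
    """Check that all base fields exist and each param element is complete."""
    if any(name not in rec for name in BASE_FIELDS):
        return False
    params = rec.get("param", [])
    return all(
        all(name in point for name in PARAM_FIELDS)
        and isinstance(point["value"], str)  # values are kept as strings
        for point in params
    )


print(json.dumps(record, ensure_ascii=False, indent=2))
```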
Step 130, establishing an inverted index based on the second data to obtain and store an index file and the second data.
In this step, an inverted index determines the position of a record from an attribute value.
The index file (inverted file) is a file containing an inverted index; in full-text search, it stores the mapping from a word to its storage locations in a document or a group of documents.
The second data is stored in the target format.
In some embodiments, the data is stored in json format.
In actual execution, taking the above embodiment as an example, after the parsing function corresponding to the message type has parsed the message into standard json format data and the processed data has been placed in the pre-handle subscription topic of the message queue, a data conversion service module in the industrial personal computer can read the standard json format data from the message queue, store the second data in json format directly, build an inverted index for the second data, and store the index.
FIG. 2 illustrates a storage format of an index file and a data document (i.e., second data).
In some embodiments, step 130 may include:
splitting the second data to obtain a plurality of fields;
establishing an index file for each field based on the field value of each field and the document identifier of each field value;
and storing the index file.
In this embodiment, the index file includes the field value (key) corresponding to the index file and an array (posting) holding the document identifiers.
The structure of the index file is key: posting.
The document identification may be a document ID.
For example, the data conversion service module reads the json format subscription data (i.e., second data) from the message queue and stores the second data onto the file system;
the data conversion service module splits the json data and, by default, builds an independent index file (that is, an inverted list) for each field;
the data conversion service module then saves the index file, for example to memory.
In some embodiments, the data processing thread may store the index file to disk at regular intervals.
In some embodiments, under the condition that the target field in the multiple fields does not need to build an index file, the configuration file corresponding to the target field can be read, and the target field is excluded according to the configuration information in the configuration file.
In some embodiments, an index file is built by default for all fields of the second data.
In actual implementation, the target fields for which no index file is built can be determined based on actual requirements, and the application is not limited in this regard.
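Building one inverted list per field, with optional exclusion of configured fields, can be sketched as follows. This is an in-memory illustration under assumed names: the flattening of the `param` array into `param.<name>` fields and the `excluded` parameter are choices made for the example, not details given in the application.

```python
from collections import defaultdict


def iter_fields(record: dict):
    """Yield (field_name, field_value) pairs, flattening the param array."""
    for name, value in record.items():
        if name == "param":
            for point in value:
                for pname, pvalue in point.items():
                    yield f"param.{pname}", pvalue
        else:
            yield name, value


def build_index(documents: dict, excluded=frozenset()) -> dict:
    """Return {field_name: {field_value: [doc_id, ...]}}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id in sorted(documents):
        for field, value in iter_fields(documents[doc_id]):
            if field in excluded:
                continue  # field configured as not indexed
            posting = index[field][value]
            if not posting or posting[-1] != doc_id:
                posting.append(doc_id)  # keep each document ID once
    return index


documents = {
    1: {"data_type": "Modbus",
        "param": [{"data_point_flag": "RA_T", "value": "26.5"}]},
    2: {"data_type": "Bacnet",
        "param": [{"data_point_flag": "RA_T", "value": "27.0"}]},
    3: {"data_type": "Modbus",
        "param": [{"data_point_flag": "SA_T", "value": "26.5"}]},
}
index = build_index(documents)
```

Here each inner dictionary plays the role of one index file: its keys are field values and each value is the array of document IDs that contain that field value.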
During research and development, the inventors found that building integration software based on an industrial personal computer mainly uses a relational database or a NoSQL database to store and integrate data. When a database is used to store data, however, indexes can be built on only a limited number of fields. The daily operation and maintenance of an intelligent building involves massive amounts of data, and the database cannot build indexes for all fields of the building equipment data, so data integration is inefficient and incomplete, the corresponding data cannot be queried quickly, and the requirements of the intelligent building field cannot be met.
In the application, the acquired massive heterogeneous initial data is standardized into the second data with the uniform data format, then the inverted index is built for the second data, and the index file and the data are stored at the same time, so that the establishment of indexes for all the involved fields can be supported, the data integration function is strong, and the integration is more comprehensive; in addition, based on the inverted index, the method can also support rapid full-text retrieval, remarkably improve the data query rate, and is suitable for the requirements of large data volume, high data monitoring precision and the like in the intelligent building field.
According to the data processing method provided by the embodiment of the application, the initial data is standardized into the second data with the uniform data format, then the inverted index is built for the second data, and the index file and the data are stored at the same time, so that the establishment of the index for all fields and the rapid full-text retrieval are supported, the data processing method has strong data integration capability and faster data query rate, and is suitable for application scenes such as intelligent buildings.
In some embodiments, storing the index file may further include:
splitting the index file when the size of the index file exceeds the target range;
and storing the field values in memory and the arrays on disk.
In this embodiment, the target range may be user-defined, and the application is not limited in this regard.
In actual execution, a data monitoring thread can be created after the system starts to monitor the size of the index file in real time. When the index file is determined to exceed the target range, it can be split into field values and arrays, with the keys stored in memory and the posting arrays stored on disk.
According to the data processing method provided by the embodiment of the application, the size of the index file is monitored, and the index file is split when the index file is large, so that the field value is stored in the memory, the array is stored in the disk, and the storage pressure is effectively reduced.
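One way to picture the split is to write each array to a file and keep only the field values, together with the position of their arrays, in memory. The sketch below assumes a json encoding for the arrays, a byte-offset lookup table, and a size threshold named `TARGET_MAX_BYTES`; none of these details come from the application.

```python
import json
import os
import tempfile

TARGET_MAX_BYTES = 16  # hypothetical threshold, deliberately tiny for the demo


def exceeds_target(field_index: dict) -> bool:
    """Approximate the index size by its json-encoded length."""
    return len(json.dumps(field_index).encode("utf-8")) > TARGET_MAX_BYTES


def split_index(field_index: dict, path: str) -> dict:
    """Write every array to `path`; return {key: (offset, length)} for memory."""
    keys_in_memory = {}
    with open(path, "wb") as f:
        for key, posting in field_index.items():
            blob = json.dumps(posting).encode("utf-8")
            keys_in_memory[key] = (f.tell(), len(blob))
            f.write(blob)
    return keys_in_memory


def read_posting(keys_in_memory: dict, path: str, key: str) -> list:
    """Look the key up in memory, then read only its array from disk."""
    offset, length = keys_in_memory[key]
    with open(path, "rb") as f:
        f.seek(offset)
        return json.loads(f.read(length))


field_index = {"26.5": [1, 3], "27.0": [2]}
path = os.path.join(tempfile.mkdtemp(), "value.postings")
keys = split_index(field_index, path)
```

With this layout a lookup touches memory for the key and reads a single contiguous byte range from disk for the array, so the full set of arrays never has to be resident in memory.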
In some embodiments, after step 130, the method may further comprise:
reading a second target array in a second target index file when data to be stored is acquired and the second target index file corresponding to the data to be stored exists;
and adding the document identifier corresponding to the data to be stored to the array included in the second target index file.
In this embodiment, when new data needs to be stored and an index file corresponding to that data already exists, the corresponding posting in the index file is read and the new document ID is added to it.
For example, with continued reference to FIG. 2, when new return air temperature data of 26.5 ℃ is acquired, the index file corresponding to the return air temperature type ra_t may be found by querying the index files, and a posting entry for document 06 may then be added to that index file.
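The incremental update can be sketched as below, assuming an index file modelled as a dictionary from field value to an array of document IDs. The existing document IDs and the helper name `add_document` are illustrative assumptions.

```python
def add_document(field_index: dict, field_value: str, doc_id: int) -> None:
    """Add doc_id to the array for field_value, creating the array if needed."""
    posting = field_index.get(field_value)
    if posting is None:
        field_index[field_value] = [doc_id]   # first record with this value
    elif doc_id not in posting:
        posting.append(doc_id)                # existing array: append only


# Hypothetical index file for the return air temperature field.
ra_t_index = {"26.5": [1, 3], "27.0": [2]}

# A new record (document 6) reports a return air temperature of 26.5.
add_document(ra_t_index, "26.5", 6)
```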
With continued reference to fig. 2, in some embodiments, after step 130, the method may further include:
inquiring from the stored at least one index file based on the target field value to obtain a first target index file;
And searching to obtain a target data record corresponding to the target field value based on the first target array in the first target index file.
In this embodiment, the target field value may be any data to be queried, and may be represented as a keyword or the like.
The first target index file is an index file to be queried.
In actual execution, when data is retrieved, the retrieval program can first query the index table by keyword and then retrieve the specific data records according to the posting in the index table.
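The two-stage lookup can be sketched as follows: first find the array of document IDs for the keyword, then fetch only those records. The sample index, the sample documents, and the function name `search` are assumptions made for the example.

```python
# Stored data documents, keyed by document ID (illustrative values).
documents = {
    1: {"data_point_flag": "RA_T", "value": "26.5"},
    2: {"data_point_flag": "RA_T", "value": "27.0"},
    3: {"data_point_flag": "SA_T", "value": "26.5"},
}

# One inverted list per field: field value -> array of document IDs.
index = {
    "data_point_flag": {"RA_T": [1, 2], "SA_T": [3]},
    "value": {"26.5": [1, 3], "27.0": [2]},
}


def search(field: str, keyword: str) -> list:
    """Return the records whose `field` equals `keyword`."""
    doc_ids = index.get(field, {}).get(keyword, [])   # stage 1: index lookup
    return [documents[doc_id] for doc_id in doc_ids]  # stage 2: fetch records


hits = search("value", "26.5")
```

No record outside the matching array is read, which is why a full scan of the stored data is not needed.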
During research and development, the inventors also found that, in building operation monitoring, a large number of devices need second-level monitoring, abnormal values such as voltage and current jitter and abnormal equipment vibration need to be searched for periodically, and a large amount of operation data needs to be integrated for aggregate analysis of air conditioning loads. Querying a database in these cases requires a full-table scan, which is slow; in addition, full-text search in the database does not use the index, so the corresponding data cannot be queried quickly.
In the application, based on the inverted index, the quick full text retrieval can be supported without full-table scanning, and the query speed is effectively improved.
According to the data processing method provided by the embodiment of the application, the inverted index is used for integrating and storing data, the index table is firstly queried according to the key words during retrieval, and then the specific data record is retrieved according to the array in the index table, so that the quick full-text retrieval can be supported without full-table scanning, and the quick retrieval rate is realized.
In some embodiments, after step 130, the method may further comprise:
compressing the index file using a target compression algorithm based on the degree of dispersion of the initial data.
In this embodiment, the target compression algorithm may include the FOR (Frame Of Reference) compression algorithm or the RBM (Roaring Bitmaps) compression algorithm.
The core idea of the FOR compression algorithm is to use subtraction to reduce the size of the stored values.
The core idea of the RBM compression algorithm is to use division to reduce the size of the stored values; it is suitable for compressing discrete data.
In actual implementation, the index table occupies more and more space as the index data grows. When the space occupied by the index table exceeds a certain threshold, an optimal compression algorithm can be selected based on the degree of dispersion of the data to compress the index table, which effectively reduces the space occupied and allows more data to be accommodated.
According to the data processing method provided by the embodiment of the application, the index table is compressed by adopting the FOR compression algorithm or the RBM compression algorithm, so that the space occupation can be effectively reduced, and more data can be accommodated.
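One possible way to select between the two algorithms is sketched below. The application does not define how the degree of dispersion is measured, so the average gap between sorted identifiers, the gap threshold of 256 and the size threshold are all hypothetical choices made only to keep the example concrete.

```python
def average_gap(doc_ids):
    """Hypothetical dispersion measure: mean gap between sorted IDs."""
    ordered = sorted(doc_ids)
    if len(ordered) < 2:
        return 0.0
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def choose_compression(doc_ids, gap_threshold=256.0):
    """Pick RBM for widely scattered IDs and FOR for closely clustered ones."""
    return "RBM" if average_gap(doc_ids) > gap_threshold else "FOR"


def maybe_compress(doc_ids, index_bytes, size_threshold_bytes):
    """Compress only once the index exceeds the (assumed) size threshold."""
    if index_bytes <= size_threshold_bytes:
        return None
    return choose_compression(doc_ids)


clustered = choose_compression([10, 11, 12, 15])
scattered = choose_compression([5, 70000, 131080])
```

The returned label would then be used to dispatch to the corresponding encoder; any other dispersion statistic, such as the variance of the gaps, could be substituted without changing the structure.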
According to the data processing method provided by the embodiment of the application, the execution main body can be a data processing device. In the embodiment of the present application, a data processing device is described by taking a data processing method performed by the data processing device as an example.
The embodiment of the application also provides a data processing device.
As shown in fig. 3, the data processing apparatus includes: a first processing module 310, a second processing module 320, and a third processing module 330.
The first processing module 310 is configured to perform classification processing on the acquired initial data to obtain multiple types of first data;
The second processing module 320 is configured to perform standardization processing on the target first data based on the message category corresponding to the target first data in the multiple types of first data, to obtain second data in a target format;
The third processing module 330 is configured to establish an inverted index based on the second data, and to obtain and store an index file and the second data.
According to the data processing device provided by the embodiment of the application, the initial data is standardized into the second data with the uniform data format, then the inverted index is built for the second data, and the index file and the data are stored at the same time, so that the data processing device can support the establishment of indexes for all fields and the rapid full-text retrieval, has strong data integration capability and faster data query rate, and is suitable for application scenes such as intelligent buildings.
In some embodiments, the third processing module 330 may also be configured to:
splitting the second data to obtain a plurality of fields;
Establishing index files for each field based on the field value of each field and the document identifier of each field value; the index file comprises a field value corresponding to the index file and an array for storing the document identification;
The index file is stored.
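The field splitting and per-field index construction listed above might look like the following minimal sketch. It assumes the second data is a nested JSON-like dictionary and that nested keys are flattened into dotted field names; the class `IndexFile` and the helper names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class IndexFile:
    """One index per field: each field value maps to an array of document IDs."""
    field_name: str
    postings: dict = field(default_factory=dict)

    def add(self, value, doc_id):
        self.postings.setdefault(value, []).append(doc_id)


def split_fields(document, prefix=""):
    """Flatten nested JSON-like data into (dotted field name, value) pairs."""
    for key, value in document.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from split_fields(value, prefix=f"{name}.")
        else:
            yield name, value


def build_index_files(documents):
    """Build one IndexFile per field; documents maps document ID -> record."""
    index_files = {}
    for doc_id, document in documents.items():
        for name, value in split_fields(document):
            index_files.setdefault(name, IndexFile(name)).add(value, doc_id)
    return index_files


documents = {
    1: {"base": {"device_id": "chiller-1"}, "param": {"unit": "kW"}},
    2: {"base": {"device_id": "chiller-1"}, "param": {"unit": "A"}},
}
index_files = build_index_files(documents)
```

Because every field receives its own index, any field of the standardized record can be queried without scanning the records themselves.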
In some embodiments, the third processing module 330 may also be configured to:
Splitting the index file under the condition that the size of the index file exceeds the target range;
the field values are stored in memory and the array is stored in disk.
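A hedged sketch of this split is given below: the field values stay in an in-memory dictionary together with the byte offset and length of their arrays, while the arrays themselves are written to a file on disk. The fixed 4-byte little-endian layout, the size estimate and the function names are assumptions for illustration only.

```python
import os
import struct
import tempfile


def write_split_index(postings, path):
    """Write every ID array to disk; return the in-memory value dictionary.

    The dictionary maps each field value to (byte offset, ID count).
    """
    directory = {}
    with open(path, "wb") as handle:
        for value, doc_ids in postings.items():
            directory[value] = (handle.tell(), len(doc_ids))
            handle.write(struct.pack(f"<{len(doc_ids)}I", *doc_ids))
    return directory


def read_array(directory, path, value):
    """Look the value up in memory, then read only its array from disk."""
    if value not in directory:
        return []
    offset, count = directory[value]
    with open(path, "rb") as handle:
        handle.seek(offset)
        return list(struct.unpack(f"<{count}I", handle.read(4 * count)))


def store(postings, path, max_bytes):
    """Split the index only when its estimated size exceeds the limit."""
    estimated = sum(4 * len(ids) for ids in postings.values())
    if estimated <= max_bytes:
        return postings, False
    return write_split_index(postings, path), True


postings = {"chiller-1": [1, 2, 9], "pump-3": [4]}
with tempfile.TemporaryDirectory() as tmp:
    array_path = os.path.join(tmp, "device_id.arr")
    directory = write_split_index(postings, array_path)
    found = read_array(directory, array_path, "chiller-1")
```

Only the small dictionary of field values has to remain resident; a lookup costs one in-memory probe followed by a single seek and read on disk.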
In some embodiments, the target format may be json format, and the second processing module 320 may be further configured to:
Parsing the first data into second data in json format based on the parsing function corresponding to the message category; the fields of the second data include a base field and a parameter field;
And storing the second data in a target subscription topic of the message queue.
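The standardization step can be illustrated with the following sketch, in which a parsing function is selected by message category, the result is serialized as JSON with a base field and a parameter field, and the JSON is appended to a topic. The two raw message layouts, the category names `csv_reading` and `kv_reading`, the key names inside the JSON, and the in-memory `topics` dictionary standing in for a real message queue are all assumptions.

```python
import json
from collections import defaultdict, deque

# Stand-in for a message queue: one FIFO per subscription topic.
topics = defaultdict(deque)


def parse_csv_reading(raw):
    """Hypothetical parser for 'device_id,ip,point_id,value,unit' text."""
    device_id, ip, point_id, value, unit = raw.split(",")
    return {
        "base": {"message_type": "csv_reading", "device_id": device_id, "ip": ip},
        "param": {"point_id": point_id, "value": float(value), "unit": unit},
    }


def parse_kv_reading(raw):
    """Hypothetical parser for 'key=value;key=value' text."""
    pairs = dict(item.split("=", 1) for item in raw.split(";"))
    return {
        "base": {"message_type": "kv_reading", "device_id": pairs["id"],
                 "ip": pairs["ip"]},
        "param": {"point_id": pairs["point"], "value": float(pairs["value"]),
                  "unit": pairs["unit"]},
    }


# One parsing function per message category.
PARSERS = {"csv_reading": parse_csv_reading, "kv_reading": parse_kv_reading}


def standardize(category, raw, topic):
    """Pick the parser for the category, emit JSON, append it to the topic."""
    second_data = json.dumps(PARSERS[category](raw), sort_keys=True)
    topics[topic].append(second_data)
    return second_data


first = standardize("csv_reading", "ahu-01,10.0.0.5,p1,21.5,C", "hvac")
second = standardize("kv_reading",
                     "id=ahu-02;ip=10.0.0.6;point=p1;value=22;unit=C", "hvac")
```

Both raw layouts end up with the same JSON shape, so downstream indexing does not need to know which category a record came from.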
In some embodiments, the apparatus may further include a fourth processing module to:
after the inverted index is established based on the second data and the index file and the second data are obtained and stored, querying at least one stored index file based on a target field value to obtain a first target index file;
and retrieving a target data record corresponding to the target field value based on the first target array in the first target index file.
In some embodiments, the apparatus may further include a fifth processing module for:
after the inverted index is established based on the second data and the index file and the second data are obtained and stored, compressing the index file using a target compression algorithm based on the degree of dispersion of the initial data.
The data processing device in the embodiment of the application can be an electronic device, or can be a component in the electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal, or may be a device other than a terminal. The electronic device may be a mobile phone, a tablet computer, a notebook computer, a handheld computer, a vehicle-mounted electronic device, a mobile internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a personal digital assistant (PDA), etc., and may also be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, a self-service machine, etc., which are not particularly limited in the embodiments of the present application.
The data processing device in the embodiment of the present application may be a device having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, and the embodiment of the present application is not specifically limited in this respect.
The data processing device provided in the embodiment of the present application can implement each process implemented by the embodiments of the methods of fig. 1 to fig. 2, and in order to avoid repetition, a description is omitted here.
In some embodiments, as shown in fig. 4, an electronic device 400 is further provided in the embodiments of the present application, which includes a processor 401, a memory 402, and a computer program stored in the memory 402 and capable of running on the processor 401, where the program, when executed by the processor 401, implements the respective processes of the embodiments of the data processing method, and the same technical effects can be achieved, so that repetition is avoided, and no further description is given here.
The electronic device in the embodiment of the application includes the mobile electronic device and the non-mobile electronic device.
The embodiment of the application also provides a non-transitory computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the processes of the above-mentioned data processing method embodiment, and can achieve the same technical effects, and in order to avoid repetition, the description is omitted here.
Wherein the processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The embodiment of the application also provides a computer program product, which comprises a computer program, and the computer program realizes the data processing method when being executed by a processor.
Wherein the processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The embodiment of the application further provides a chip, which comprises a processor and a communication interface, wherein the communication interface is coupled with the processor, and the processor is used for running programs or instructions to realize the processes of the data processing method embodiment, and can achieve the same technical effects, so that repetition is avoided, and the description is omitted here.
It should be understood that the chips referred to in the embodiments of the present application may also be referred to as system-level chips, system chips, chip systems, or system-on-chip chips, etc.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in an opposite order depending on the functions involved, e.g., the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a computer software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those having ordinary skill in the art without departing from the spirit of the present application and the scope of the claims, which are to be protected by the present application.
In the description of the present specification, reference to the terms "one embodiment," "some embodiments," "illustrative embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present application have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the application, the scope of which is defined by the claims and their equivalents.

Claims (13)

1. A method of data processing, comprising:
classifying the acquired initial data to obtain multiple types of first data;
Based on the message category corresponding to the target first data in the multiple types of first data, carrying out standardization processing on the target first data to obtain second data in a target format;
and establishing an inverted index based on the second data to obtain and store an index file and the second data.
2. The data processing method according to claim 1, wherein the establishing an inverted index based on the second data to obtain and store an index file comprises:
splitting the second data to obtain a plurality of fields;
establishing an index file for each field based on the field value of each field and the document identifier of each field value; the index file comprises a field value corresponding to the index file and an array for storing the document identification;
and storing the index file.
3. The data processing method according to claim 2, wherein the storing the index file includes:
Splitting the index file under the condition that the size of the index file exceeds a target range;
and storing the field value in a memory, and storing the array in a disk.
4. A data processing method according to any one of claims 1 to 3, wherein the target format is json format, and the carrying out standardization processing on the target first data, based on the message category corresponding to the target first data in the multiple types of first data, to obtain second data in the target format comprises:
parsing the first data into second data in json format based on a parsing function corresponding to the message category; the fields of the second data comprise a base field and a parameter field;
and storing the second data into a target subscription topic of the message queue.
5. The data processing method of claim 4, wherein the base field comprises: at least one of a message type before standardized processing, identification information of a device corresponding to the second data, a type of the device corresponding to the second data and an IP address of the device corresponding to the second data.
6. The data processing method of claim 4, wherein the parameter field comprises: at least one of a data point location identifier, a data point location name, a numerical value of the data point location, a unit of the data point location and a signal type of initial data corresponding to the second data.
7. A data processing method according to any one of claims 1 to 3, wherein the initial data comprises: at least one of heating and ventilation equipment data, fire-fighting equipment data, lighting equipment data, digital gas meter equipment data and elevator equipment data corresponding to the building system.
8. A data processing method according to any one of claims 1 to 3, wherein, after the establishing an inverted index based on the second data to obtain and store an index file and the second data, the method further comprises:
querying at least one stored index file based on a target field value to obtain a first target index file;
and retrieving a target data record corresponding to the target field value based on a first target array in the first target index file.
9. A data processing method according to any one of claims 1 to 3, wherein, after the establishing an inverted index based on the second data to obtain and store an index file and the second data, the method further comprises:
compressing the index file using a target compression algorithm based on the degree of dispersion of the initial data.
10. A data processing apparatus, comprising:
The first processing module is used for classifying the acquired initial data to obtain multiple types of first data;
the second processing module is used for carrying out standardized processing on the target first data based on the message category corresponding to the target first data in the multiple types of first data to obtain second data in a target format;
And the third processing module is used for establishing an inverted index based on the second data, obtaining and storing an index file and the second data.
11. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the data processing method according to any of claims 1-9 when executing the program.
12. A non-transitory computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the data processing method according to any one of claims 1-9.
13. A computer program product comprising a computer program which, when executed by a processor, implements the data processing method according to any one of claims 1-9.
CN202410124911.9A 2024-01-29 2024-01-29 Data processing method and device Pending CN117909295A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410124911.9A CN117909295A (en) 2024-01-29 2024-01-29 Data processing method and device

Publications (1)

Publication Number Publication Date
CN117909295A true CN117909295A (en) 2024-04-19

Family

ID=90697427

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410124911.9A Pending CN117909295A (en) 2024-01-29 2024-01-29 Data processing method and device

Country Status (1)

Country Link
CN (1) CN117909295A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination