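WO2023029853A1 - Model training method, data processing method, electronic device, and computer-readable storage medium - Google Patents

Model training method, data processing method, electronic device, and computer-readable storage medium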

Info

Publication number
WO2023029853A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
historical
network behavior
real
model
Prior art date
Application number
PCT/CN2022/109443
Other languages
English (en)
French (fr)
Inventor
连超
江舟
赵军锋
张平荣
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2023029853A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The embodiments of the present application relate to the technical field of communications, and in particular to a model training method, a data processing method, an electronic device, and a computer-readable storage medium.
  • An embodiment of the present application provides a model training method, including: obtaining network information and historical target industry data, where the network information is obtained according to first historical network behavior data; and performing model training according to the network information and the historical target industry data to obtain a data processing model.
  • An embodiment of the present application provides a data processing method, including: acquiring real-time network behavior data; and processing the real-time network behavior data using the data processing model trained by the above model training method to obtain a processing result.
  • An embodiment of the present application provides an electronic device, including: at least one processor; and a memory storing at least one computer program which, when executed by the at least one processor, implements the above model training method or the above data processing method.
  • An embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above model training method or the above data processing method.
  • FIG. 1 is a flowchart of the model training method provided by an embodiment of the present application.
  • FIG. 2 is a flowchart of the data processing method provided by an embodiment of the present application.
  • FIG. 3 is a block diagram of a data processing system provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of modules of an electronic device provided by an embodiment of the present application.
  • The network intelligence of power transmission equipment has also improved greatly.
  • In the past, power transmission was coarse: power could only be delivered continuously to a large area over long periods, and could not be adjusted dynamically and intelligently.
  • Intelligent networking of power transmission equipment has made it possible to deploy intelligent power supply by region.
  • Another example is manual meter reading, which yields only the total power consumption over a period of time. Such data is coarse, non-real-time, and non-continuous, which makes abnormal power consumption behavior difficult to identify.
  • FIG. 1 is a flow chart of a model training method provided by an embodiment of the present application.
  • The embodiment of the present application provides a model training method applied to a target industry server. The model training method includes steps 100 and 101.
  • Step 100: Obtain network information and historical target industry data.
  • the network information is obtained based on the first historical network behavior data.
  • a communication server is set in the communication system, and the communication server can obtain the first historical network behavior data from the communication system.
  • the target industry server can directly obtain the first historical network behavior data from the communication server. That is to say, in some exemplary implementations, the network information includes first historical network behavior data.
  • The communication server can obtain the first historical network behavior data, encrypt it, and then send it to the target industry server; or, the target industry server can obtain one part of the first historical network behavior data from the communication server, while the other part is used for training on the communication server, which then provides the training result to the target industry server.
  • The first historical network behavior data includes second historical network behavior data and third historical network behavior data, and the network information includes the second historical network behavior data and a first training result obtained by performing model training according to the third historical network behavior data.
  • The communication server can obtain the first historical network behavior data, encrypt it, and then send it to the target industry server; or, the first historical network behavior data is used for training on the communication server, and the training result is then provided to the target industry server. That is to say, in some exemplary implementations, the network information includes a second training result obtained by performing model training according to the first historical network behavior data.
  • the second historical network behavior data includes data that can be directly provided to the target industry server, and the third historical network behavior data includes data that cannot be provided to the target industry server for some reason.
  • The second historical network behavior data includes the non-private data in the first historical network behavior data, and the third historical network behavior data includes the private data in the first historical network behavior data.
  • Alternatively, the second historical network behavior data includes data in the first historical network behavior data whose data amount is greater than or equal to a preset threshold, and the third historical network behavior data includes data in the first historical network behavior data whose data amount is less than the preset threshold.
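  • A minimal sketch of the two splitting rules above, assuming each record carries a privacy flag and a category label. The `BehaviorRecord` type, its fields, and both function names are hypothetical and do not come from the application; counting the "amount of data" per category is only one possible reading of the threshold rule.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class BehaviorRecord:
    """Hypothetical record of first historical network behavior data."""
    user_id: str
    category: str      # assumed grouping key, e.g. application type
    is_private: bool   # assumed flag set by the communication server


def split_by_privacy(records):
    """Return (second, third): non-private records and private records."""
    second = [r for r in records if not r.is_private]
    third = [r for r in records if r.is_private]
    return second, third


def split_by_amount(records, threshold):
    """Return (second, third) by comparing each category's record count
    with a preset threshold (count >= threshold goes to `second`)."""
    counts = Counter(r.category for r in records)
    second = [r for r in records if counts[r.category] >= threshold]
    third = [r for r in records if counts[r.category] < threshold]
    return second, third


records = [
    BehaviorRecord("u1", "video", False),
    BehaviorRecord("u1", "video", False),
    BehaviorRecord("u2", "banking", True),
]
shareable, retained = split_by_privacy(records)
```

  • In this sketch, `shareable` would be sent to the target industry server, while `retained` would stay on the communication server and contribute only through a training result.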
  • The network information can be obtained from the communication server, and the communication server can obtain the first historical network behavior data from an Authentication, Authorization, and Accounting (AAA) authentication server and a Deep Packet Inspection (DPI) device.
  • the communication server obtains user identity information and user private network address information from the AAA authentication server, and collects the first historical network behavior data of the corresponding user from the DPI device according to the user private network address information.
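  • The lookup described above can be sketched as a join on the private network address. The record layouts (`user_id`, `private_ip`, `bytes`, `app`) and the function name are assumptions made for illustration; real AAA and DPI interfaces are not specified in the application, and this sketch ignores the fact that address assignments change over time.

```python
from collections import defaultdict


def collect_behavior(aaa_sessions, dpi_flows):
    """Group DPI flow records by user, using the private address from AAA."""
    ip_to_user = {s["private_ip"]: s["user_id"] for s in aaa_sessions}
    per_user = defaultdict(list)
    for flow in dpi_flows:
        user_id = ip_to_user.get(flow["private_ip"])
        if user_id is not None:          # skip flows with no known user
            per_user[user_id].append(flow)
    return dict(per_user)


# Hypothetical sample data standing in for AAA and DPI exports.
aaa_sessions = [
    {"user_id": "imsi-001", "private_ip": "10.0.0.5"},
    {"user_id": "imsi-002", "private_ip": "10.0.0.6"},
]
dpi_flows = [
    {"private_ip": "10.0.0.5", "bytes": 1200, "app": "video"},
    {"private_ip": "10.0.0.6", "bytes": 300, "app": "chat"},
    {"private_ip": "10.0.0.9", "bytes": 50, "app": "dns"},
]
behavior = collect_behavior(aaa_sessions, dpi_flows)
```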
  • the historical target industry data includes any one of the following:
  • Historical electricity data, historical tap water data, historical natural gas data, historical advertising data, and historical express delivery data.
  • the historical target industry data can be obtained from a dedicated network corresponding to the target industry, can also be manually collected, or can be obtained in any other manner.
  • the first historical network behavior data and the historical target industry data may be different data, and which data is required for model training may be determined according to the actual application scenario of the industry.
  • the first historical network behavior data includes the historical network behavior data in the target area
  • the historical target industry data includes the historical total electricity consumption in the target area.
  • the communication server obtains user identity information and user private network address information in the target area from the AAA authentication server, and collects the first historical network behavior data of the corresponding user from the DPI device according to the user private network address information.
  • the first historical network behavior data includes the user's historical network behavior data
  • the historical target industry data includes the user's historical power data.
  • the communication server obtains the user identity information and the user's private network address information from the AAA authentication server, and collects the first historical network behavior data of the corresponding user from the DPI device according to the user's private network address information.
  • the historical power data may be historical power consumption.
  • The user identity information may include information in one-to-one correspondence with the user, such as the number of the mobile terminal, the International Mobile Equipment Identity (IMEI), and the International Mobile Subscriber Identity (IMSI).
  • the first historical network behavior data may include network behavior data in any one or more communication systems.
  • The communication system can be, for example, a mobile communication system (such as a mobile terminal communication system, an Internet of Vehicles communication system, or another Internet of Things communication system) or a fixed network communication system (such as home Wireless Fidelity (WiFi), commercial WiFi, or an enterprise network virtual private network (VPN)).
  • In a fixed network communication system, one Internet Protocol (IP) address often carries the data of multiple users.
  • various facilities in homes, businesses, and enterprises are gradually becoming intelligent, and user data and device data contain a wide range of feature information.
  • The first historical network behavior data may include historical data flow quintuples, such as time, traffic, number of packets, Uniform Resource Locator (URL), and application type.
  • the first historical network behavior data may also include data obtained by performing data processing on the data stream quintuple.
  • For example, feature data may be extracted from the data flow quintuples. For a model that predicts the total power consumption in the target area, the feature data can include at least one of: the number of users in the target area, the pattern of change in the number of users in the target area, and the behavior cycles of users in the target area (such as sleep cycle, leisure cycle, housework cycle, and work cycle).
  • The feature data can also include the inherent attribute of the user's location, the predicted attribute of the user's location, the actual number of users corresponding to the user, and the operating characteristics of the equipment.
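  • One of the features named above, the number of users in the target area over time, could be derived from flow records roughly as follows. This is an illustrative sketch only: the record fields `time` and `user_id` and the function name are hypothetical, and the application does not prescribe an extraction method.

```python
from collections import defaultdict
from datetime import datetime


def users_per_hour(flows):
    """Count distinct users seen in each hour of the day (0-23)."""
    seen = defaultdict(set)
    for flow in flows:
        hour = datetime.fromisoformat(flow["time"]).hour
        seen[hour].add(flow["user_id"])
    return {hour: len(users) for hour, users in sorted(seen.items())}


# Hypothetical flow records already restricted to one target area.
flows = [
    {"time": "2022-08-01T08:15:00", "user_id": "u1"},
    {"time": "2022-08-01T08:40:00", "user_id": "u2"},
    {"time": "2022-08-01T08:55:00", "user_id": "u1"},
    {"time": "2022-08-01T23:05:00", "user_id": "u1"},
]
counts = users_per_hour(flows)
```

  • Comparing such hourly counts across days gives a simple view of how the number of users changes and of recurring behavior cycles.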
  • For the number of users in the target area: since one IP address in a fixed network communication system corresponds to multiple users, the number of users corresponding to one IP address in the fixed network communication system needs to be estimated. For example, with home WiFi or enterprise WiFi, all users share one IP address to access the Internet.
  • Mobile terminals are often in one-to-one correspondence with users, and since mobile terminals are mobile, the number of users in the target area can be predicted according to the mobility characteristics of the mobile terminals.
  • The inherent attribute of the user's location refers to whether the user's location is supposed to be a working area or a living area.
  • The predicted attribute of the user's location refers to whether the user's location is predicted to be a working area or a living area based on analysis of network behavior data. For example, if a user regularly goes to a residence registered under his or her name in a residential area to work during the day but leaves at night, then the inherent attribute of this residence is living area and the predicted attribute is working area.
  • The operating characteristics of the equipment can be the device type, the habitual usage time period, the power consumption, and so on.
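  • The difference between the inherent and the predicted attribute of a location can be illustrated with a simple majority rule over the hours at which a user is observed there. The working-hours window, the labels, and the function name are assumptions chosen for illustration; the application does not state how the prediction is made.

```python
def predict_location_attribute(presence_hours, work_start=9, work_end=18):
    """Label a location from the hours (0-23) at which the user was seen there."""
    if not presence_hours:
        raise ValueError("no observations for this location")
    daytime = sum(1 for h in presence_hours if work_start <= h < work_end)
    # A majority of observations in working hours is treated as a working area.
    if daytime * 2 > len(presence_hours):
        return "working area"
    return "living area"


inherent = "living area"   # e.g. a registered residence
predicted = predict_location_attribute([9, 10, 11, 14, 15, 16, 17])
mismatch = predicted != inherent
```

  • A mismatch like the one in this example corresponds to the residence that is used as a workplace during the day.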
  • Target areas can be divided according to the characteristics of users in different areas. For example, communities with the same work and rest habits can be assigned to the same target area. A large enterprise park occupied mainly by white-collar workers during the day has high power demand during the day and low power demand at night. As another example, in a large residential community where many people tend to leave 1 to 2 hours earlier and return 1 to 2 hours later than people in other areas, the power demand differs from that of other areas, so such areas need to be divided into different target areas for model training.
  • Step 101: Perform model training according to the network information and the historical target industry data to obtain a data processing model.
  • All of the source data can be obtained and used for model training on the target industry server; or part of the source data can be obtained for model training on the target industry server while the other part is used for model training on the communication server; or the source data is not obtained at all, and the training result produced by model training on the source data in the communication server is obtained directly.
  • When the network information includes the first historical network behavior data, the above step 101 includes: determining a first training sample according to the first historical network behavior data and the historical target industry data, and performing model training according to the first training sample to obtain a data processing model.
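  • Determining a training sample from the two data sources can be sketched as pairing network-derived features with industry data for the same time period. The period keys, feature layout, and function name below are hypothetical; the application leaves the sample construction open.

```python
def build_training_samples(features_by_period, target_by_period):
    """Pair network features with industry data for periods present in both."""
    common = sorted(set(features_by_period) & set(target_by_period))
    return [(features_by_period[p], target_by_period[p]) for p in common]


# Hypothetical hourly features: [user count, share of active users].
features_by_period = {
    "2022-08-01T08": [120, 0.8],
    "2022-08-01T09": [150, 0.9],
    "2022-08-01T10": [160, 0.7],
}
# Hypothetical historical total electricity consumption (kWh) per hour.
target_by_period = {
    "2022-08-01T08": 310.0,
    "2022-08-01T09": 395.5,
}
samples = build_training_samples(features_by_period, target_by_period)
```

  • Periods that appear in only one source are dropped, so every sample has both an input and a label.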
  • the first training samples may be updated according to user characteristics input by the user.
  • When the first historical network behavior data includes second historical network behavior data and third historical network behavior data, and the network information includes the second historical network behavior data and the first training result, the above step 101 includes: determining a second training sample according to the second historical network behavior data and the historical target industry data, and performing model training according to the second training sample and the first training result to obtain a data processing model.
  • Federated learning methods, such as vertical federated learning, can be used by the communication server and the target industry server for model training.
  • the second training samples may be updated according to user characteristics input by the user.
  • When the network information includes the second training result obtained by performing model training based on the first historical network behavior data, the above step 101 includes: performing model training according to the second training result and the historical target industry data to obtain a data processing model.
  • Federated learning methods, such as vertical federated learning, can be used by the communication server and the target industry server for model training.
  • the data processing models are different models for different industries.
  • the model training method provided in the embodiment of the present application does not limit the data processing model.
  • the data processing model can be a total power consumption prediction model; or, the data processing model is a power consumption abnormal behavior detection model; or, the data processing model is a charging station power supply demand prediction model.
  • the data processing model can be a total water consumption prediction model.
  • When the data processing model is a total power consumption prediction model, the network information is used as the input of the data processing model, and the historical total power consumption in the target area is used as the output of the data processing model for model training. That is, in some implementations, the first historical network behavior data includes the historical network behavior data in the target area, the historical target industry data includes the historical total power consumption in the target area, and the data processing model includes the total power consumption prediction model.
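  • As a rough illustration of training with network-derived input and historical total power consumption as the label, the sketch below fits a one-variable least-squares line. Using ordinary least squares and a single feature (user count) is an assumption made for brevity; the application allows any learning method, and the function names and numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Input: historical user counts; label: historical total consumption (kWh).
user_counts = [100, 200, 300, 400]
total_kwh = [250.0, 450.0, 650.0, 850.0]
model = fit_line(user_counts, total_kwh)
forecast = predict(model, 350)
```

  • The same `predict` call would later be fed with a user count derived from real-time network behavior data.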
  • When the data processing model is an abnormal electricity consumption behavior detection model, the network information and the historical electricity consumption are used as the input of the data processing model, and whether the user has abnormal electricity consumption behavior is used as the output of the data processing model for model training. That is, in some implementations, the first historical network behavior data includes the user's historical network behavior data, the historical target industry data includes the user's historical power data, and the data processing model includes an abnormal electricity consumption behavior detection model.
  • The model training method provided in the embodiment of the present application can use any method, such as a machine learning algorithm, a neural network, or a Long Short-Term Memory (LSTM) network, to carry out model training.
  • the model training method provided by the embodiment of this application combines the historical network behavior data of the communication system with the historical target industry data for model training to obtain a data processing model, which greatly improves the accuracy and breadth of the data processing model of the target industry.
  • The related electric power industry has only electric power data and lacks user network behavior data, so the feature data available for model training is relatively limited.
  • The model training method provided in the embodiment of the present application incorporates the user's network behavior data into model training, which increases the dimensionality of the training feature data and effectively improves accuracy.
  • FIG. 2 is a flowchart of a data processing method provided by an embodiment of the present application.
  • The embodiment of the present application provides a data processing method applied to a target industry server. The data processing method includes steps 200 and 201.
  • Step 200: Acquire real-time network behavior data.
  • real-time network behavior data and real-time target industry data are acquired.
  • A communication server is set in the communication system, and the communication server can obtain real-time network behavior data from the communication system. After the communication server obtains the real-time network behavior data, the target industry server can obtain it directly from the communication server.
  • the real-time network behavior data can be obtained from the communication server, and the communication server can obtain the real-time network behavior data from the AAA authentication server and the DPI device.
  • the communication server obtains user identity information and user private network address information from the AAA authentication server, and collects real-time network behavior data of the corresponding user from the DPI device according to the user private network address information.
  • the real-time target industry data includes any one of the following:
  • Real-time electricity data, real-time tap water data, real-time natural gas data, real-time advertising data, and real-time express delivery data.
  • the real-time target industry data can be obtained from a dedicated network corresponding to the target industry, can also be manually collected, or can be obtained in any other way.
  • real-time network behavior data and real-time target industry data may be different data for different industries, and which data needs to be processed can be determined according to the actual application scenario of the industry.
  • the real-time network behavior data includes the real-time network behavior data in the target area.
  • the communication server obtains user identity information and user private network address information in the target area from the AAA authentication server, and collects real-time network behavior data of the corresponding user from the DPI device according to the user private network address information.
  • the real-time network behavior data includes the real-time network behavior data of users
  • the real-time target industry data includes real-time power data of users.
  • the communication server obtains user identity information and user private network address information from the AAA authentication server, and collects real-time network behavior data of the corresponding user from the DPI device according to the user private network address information.
  • the real-time power data may be real-time power consumption.
  • the user identity information may include information corresponding to the user one-to-one, such as the number of the mobile terminal, IMEI, IMSI, and the like.
  • the real-time network behavior data may include network behavior data in any one or more communication systems.
  • The communication system can be, for example, a mobile communication system (such as a mobile terminal communication system, an Internet of Vehicles communication system, or another Internet of Things communication system) or a fixed network communication system (such as home WiFi, commercial WiFi, or an enterprise network VPN).
  • Different communication systems have different coverage areas.
  • The mobile terminal communication system and the Internet of Vehicles communication system cover residential areas, commercial areas, and enterprise parks. Different communication systems also have different characteristics.
  • In a mobile communication system, the IP address of a terminal is often bound to a specific user; the terminal is mobile and shows specific characteristics in different regions and time periods. In a fixed network communication system, by contrast, one IP address often carries the data of multiple users.
  • The real-time network behavior data may include data flow quintuples, such as time, traffic, number of packets, URL, and application type.
  • the real-time network behavior data may also include data obtained by performing data processing on the data flow quintuple.
  • For example, feature data may be extracted from the data flow quintuples. For a model that predicts the total power consumption in the target area, the feature data can include at least one of: the number of users in the target area, the pattern of change in the number of users in the target area, and the behavior cycles of users in the target area (such as sleep cycle, leisure cycle, housework cycle, and work cycle).
  • The feature data can also include the inherent attribute of the user's location, the predicted attribute of the user's location, the actual number of users corresponding to the user, and the operating characteristics of the equipment.
  • For the number of users in the target area: since one IP address in a fixed network communication system corresponds to multiple users, the number of users corresponding to one IP address in the fixed network communication system needs to be estimated. For example, with home WiFi or enterprise WiFi, all users share one IP address to access the Internet.
  • Mobile terminals are often in one-to-one correspondence with users, and since mobile terminals are mobile, the number of users in the target area can be predicted according to the mobility characteristics of the mobile terminals.
  • The inherent attribute of the user's location refers to whether the user's location is supposed to be a working area or a living area.
  • The predicted attribute of the user's location refers to whether the user's location is predicted to be a working area or a living area based on analysis of network behavior data. For example, if a user regularly goes to a residence registered under his or her name in a residential area to work during the day but leaves at night, then the inherent attribute of this residence is living area and the predicted attribute is working area.
  • The operating characteristics of the equipment can be the device type, the habitual usage time period, the power consumption, and so on.
  • Target areas can be divided according to the characteristics of users in different areas. For example, communities with the same work and rest habits can be assigned to the same target area. A large enterprise park occupied mainly by white-collar workers during the day has high power demand during the day and low power demand at night. As another example, in a large residential community where many people tend to leave 1 to 2 hours earlier and return 1 to 2 hours later than people in other areas, the power demand differs from that of other areas, so such areas need to be divided into different target areas for model training.
  • Step 201: Use the data processing model trained by the above model training method to process the real-time network behavior data to obtain a processing result.
  • the data processing models are different models for different industries.
  • the embodiment of the present application does not limit the data processing model.
  • the data processing model may be a total power consumption prediction model; or, the data processing model may be a power consumption abnormal behavior detection model.
  • the data processing model can be a total water consumption prediction model.
  • The above step 201 includes: using the total power consumption prediction model to predict the total power consumption in the target area according to the real-time network behavior data.
  • the data processing model trained by the above model training method is used to process real-time network behavior data and real-time target industry data to obtain processing results.
  • The above step 201 includes: using the abnormal electricity consumption behavior detection model to determine whether the user has abnormal electricity consumption behavior according to the user's real-time network behavior data and the user's real-time target industry data.
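  • The application does not specify how the detection model decides. As one hedged illustration, the sketch below swaps in a simple residual-threshold rule: it fits a baseline relating a user's network activity to electricity use from history, then flags a real-time reading that deviates by more than `k` standard deviations. The feature (`active_devices`), the function names, and `k=3.0` are all assumptions.

```python
import math


def fit_baseline(history):
    """Fit kWh = a * active_devices + b and the spread of the residuals."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two historical observations")
    xs = [x for x, _ in history]
    ys = [y for _, y in history]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("history needs at least two distinct activity levels")
    a = sum((x - mean_x) * (y - mean_y) for x, y in history) / sxx
    b = mean_y - a * mean_x
    sigma = math.sqrt(sum((y - (a * x + b)) ** 2 for x, y in history) / n)
    return a, b, sigma


def is_abnormal(baseline, active_devices, kwh, k=3.0):
    """Flag a reading that deviates more than k * sigma from the baseline."""
    a, b, sigma = baseline
    return abs(kwh - (a * active_devices + b)) > k * sigma


# Hypothetical history of (active devices seen on the network, kWh used).
history = [(0, 1.0), (1, 2.1), (2, 2.9), (3, 4.0), (4, 5.0)]
baseline = fit_baseline(history)
```

  • With this baseline, a high reading while no device is active, such as `is_abnormal(baseline, 0, 6.0)`, is flagged, whereas a reading close to the fitted line is not.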
  • the data processing method further includes: after processing the real-time network behavior data and real-time target industry data to obtain the processing results, performing intelligent control according to the processing results.
  • Performing intelligent control according to the processing results includes: performing scheduling control on the power delivered to the target area according to the predicted total power consumption in the target area.
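  • A scheduling decision driven by the predicted total power consumption could look like the sketch below. The reserve margin, the capacity cap, and the function name are hypothetical parameters added for illustration; the application does not define a scheduling policy.

```python
def plan_dispatch(predicted_kwh, reserve_ratio=0.1, capacity_kwh=None):
    """Return the energy to schedule for a target area."""
    if predicted_kwh < 0 or reserve_ratio < 0:
        raise ValueError("prediction and reserve ratio must be non-negative")
    planned = predicted_kwh * (1.0 + reserve_ratio)
    if capacity_kwh is not None:
        planned = min(planned, capacity_kwh)   # never exceed the assumed capacity
    return planned


# Predicted demand plus a 10% reserve, limited by an assumed capacity.
plan = plan_dispatch(predicted_kwh=1000.0, reserve_ratio=0.1, capacity_kwh=1050.0)
```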
  • the data processing method provided in the embodiment of the present application uses a data processing model to process real-time network behavior data to obtain processing results, which greatly improves the processing accuracy of target industry data.
  • The embodiment of the present application provides an electronic device, as shown in FIG. 4, including:
  • at least one processor 401 (only one is shown in FIG. 4); and
  • a memory 402 storing at least one computer program which, when executed by the at least one processor 401, implements the above model training method or the above data processing method.
  • The processor 401 is a device with data processing capability, including but not limited to a central processing unit (CPU); the memory 402 is a device with data storage capability, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH).
  • The processor 401 and the memory 402 are connected to each other through a bus, and are further connected to other components of the electronic device.
  • the embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the above-mentioned model training method or the above-mentioned data processing method is implemented.
  • FIG. 3 is a block diagram of a data processing system provided by an embodiment of the present application.
  • An embodiment of the present application provides a data processing system, including: a target industry server 301 and a communication server 302.
  • The target industry server 301 includes a network docking module 3011, a target industry data training module 3012, an inference module 3013, a business decision module 3014, a target industry intelligent control module 3015, a target industry data collection module 3016, and a front-end module 3017.
  • The target industry data collection module 3016 is configured to collect target industry data; the collected target industry data includes historical target industry data, or historical target industry data and real-time target industry data.
  • The network docking module 3011 is configured to communicate with the target industry docking module 3021 to obtain network information and real-time network behavior data.
  • The front-end module 3017 is configured to accept user characteristics entered as required by the actual application scenario, as a supplement to the first historical network behavior data. In some exemplary implementations, the front-end module 3017 is also configured to display to the user the intelligent control strategy determined by the business decision module 3014, to send an intelligent control strategy input by the user to the target industry intelligent control module 3015, or to send strategy adjustment information input by the user to the business decision module 3014.
  • The target industry data training module 3012 is configured to perform model training according to the network information and the historical target industry data to obtain a data processing model.
  • The inference module 3013 is configured to use the trained data processing model to process real-time network behavior data to obtain a processing result, or to process real-time network behavior data and real-time target industry data to obtain a processing result.
  • The business decision module 3014 is configured to determine an intelligent control strategy according to the processing result. In some exemplary implementations, the business decision module 3014 is further configured to adjust the intelligent control strategy according to the strategy adjustment information input by the user.
  • The target industry intelligent control module 3015 is configured to perform intelligent control according to the intelligent control strategy determined by the business decision module 3014, or according to the intelligent control strategy input by the user.
  • The communication server 302 includes a target industry docking module 3021, a network data collection module 3022, and a network data training module 3023.
  • The network data collection module 3022 is configured to obtain user identity information and user private network address information from the AAA authentication server, and to collect the first historical network behavior data of the corresponding user from the DPI device according to the user private network address information.
  • The network data training module 3023 is configured to perform model training according to the third historical network behavior data to obtain the first training result and send the first training result to the target industry docking module 3021; or to perform model training according to the first historical network behavior data to obtain the second training result and send the second training result to the target industry docking module 3021.
  • The target industry docking module 3021 is configured to send the network information to the network docking module 3011. The network information is the first historical network behavior data; or the network information includes the first training result, obtained by model training on the third historical network behavior data in the first historical network behavior data, together with the second historical network behavior data in the first historical network behavior data; or the network information includes the second training result obtained by model training on the first historical network behavior data.
  • All or some of the steps in the methods disclosed above, and the functional modules/units in the systems and devices, can be implemented as software, firmware, hardware, and appropriate combinations thereof.
  • The division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be executed by several physical components in cooperation.
  • Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit.
  • Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storing information, such as computer-readable instructions, data structures, program modules, or other data.
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cartridges, magnetic tape, magnetic disk storage or other magnetic storage, or any other medium that can be used to store the desired information and can be accessed by a computer.
  • Communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Operations Research (AREA)
  • Development Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Medical Informatics (AREA)
  • Quality & Reliability (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

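The present application provides a model training method, a data processing method, an electronic device, and a computer-readable storage medium. The model training method includes: acquiring network information and historical target industry data, the network information being obtained from first historical network behavior data; and performing model training according to the network information and the historical target industry data to obtain a data processing model. The data processing method includes: acquiring real-time network behavior data; and processing the real-time network behavior data with the data processing model trained by the above model training method to obtain a processing result.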

Description

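Model training method, data processing method, electronic device, and computer-readable storage medium
Cross-Reference to Related Applications
This application claims priority to Chinese patent application No. 202111030195.0, filed on September 2, 2021, the content of which is incorporated herein by reference in its entirety.
Technical Field
Embodiments of the present application relate to the field of communication technology, and in particular to a model training method, a data processing method, an electronic device, and a computer-readable storage medium.
Background
With rapid economic development, many industries have become far more intelligent. However, when these industries perform intelligent control or intelligent data analysis, they usually rely on the industry's own historical data, and when the feature dimensions of that historical data are limited, the accuracy of the intelligent control or intelligent data analysis is low.
Summary
In a first aspect, an embodiment of the present application provides a model training method, including: acquiring network information and historical target industry data, the network information being obtained from first historical network behavior data; and performing model training according to the network information and the historical target industry data to obtain a data processing model.
In a second aspect, an embodiment of the present application provides a data processing method, including: acquiring real-time network behavior data; and processing the real-time network behavior data with the data processing model trained by the above model training method to obtain a processing result.
In a third aspect, an embodiment of the present application provides an electronic device, including: at least one processor; and a memory storing at least one computer program which, when executed by the at least one processor, implements the above model training method or the above data processing method.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above model training method or the above data processing method.
Brief Description of the Drawings
FIG. 1 is a flowchart of the model training method provided by an embodiment of the present application;
FIG. 2 is a flowchart of the data processing method provided by an embodiment of the present application;
FIG. 3 is a block diagram of the data processing system provided by an embodiment of the present application; and
FIG. 4 is a schematic block diagram of the electronic device provided by an embodiment of the present application.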
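Detailed Description
To help those skilled in the art better understand the technical solutions of the present application, the model training method, data processing method, electronic device, and computer-readable storage medium provided by the present application are described in detail below with reference to the accompanying drawings.
Example embodiments are described more fully below with reference to the accompanying drawings, but the example embodiments may be embodied in different forms, and the present application should not be construed as limited to the embodiments set forth herein. These embodiments are provided so that the present application will be thorough and complete and will fully convey the scope of the present application to those skilled in the art.
Where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with one another.
As used herein, the term "and/or" includes any and all combinations of at least one of the associated listed items.
The terminology used herein is for describing particular embodiments only and is not intended to limit the present application. As used herein, the singular forms "a" and "the" include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprising" and/or "made of", when used in this specification, specify the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of at least one other feature, integer, step, operation, element, component, and/or group thereof.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art and the present application, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
With rapid economic development, many industries have become far more intelligent. For example, with the development of smart cities, the network intelligence of power transmission equipment has also improved greatly. In the past, power transmission was coarse-grained: power could only be delivered continuously to large areas over long periods and could not be adjusted dynamically and intelligently, whereas the network intelligence of power transmission equipment makes region-level intelligent power dispatch possible. As another example, manual meter reading yields only the total electricity used over a period; the data is coarse, non-real-time, and non-continuous, so identifying abnormal electricity consumption behavior is difficult. With the development of the fifth-generation mobile communication system (5G, 5th Generation Mobile Communication Technology) and the Internet of Things, power operators have deployed automatic meter reading terminals on a large scale. These terminals hold detailed electricity usage data for each user, accurate to the day or even to the hour or minute. Power operators can perform big data analysis on users' electricity usage data to identify abnormal electricity consumption behavior.
However, when these industries perform intelligent control or intelligent data analysis, they usually rely on the industry's own historical data; because the feature dimensions of that historical data are limited, the accuracy of the intelligent control or intelligent data analysis is low. For example, with rapid economic development, global electricity consumption increases year by year and global environmental problems grow more serious. The key to solving these problems lies in broadening supply and reducing consumption on the power operation side, or in finding green new energy on the generation side, such as nuclear, wind, and hydro power, so as to improve the level of power utilization. Region-level intelligent power dispatch and user-level identification of abnormal electricity consumption behavior are two directions in which power operators are investing heavily. Region-level intelligent power dispatch can achieve effective energy savings at the macro level, but power companies rely only on historical power data for dynamic adjustment, which has limitations. User-level identification of abnormal electricity consumption behavior intervenes precisely in users' abnormal consumption at the micro level, but the common approach is to compare consumption across different time intervals, focusing mainly on sudden changes in consumption; relying only on power data collected by meter terminals to identify abnormal electricity consumption behavior is severely limited.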
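FIG. 1 is a flowchart of the model training method provided by an embodiment of the present application.
In a first aspect, referring to FIG. 1, an embodiment of the present application provides a model training method, applied to a target industry server, the model training method including steps 100 and 101.
Step 100: acquire network information and historical target industry data, the network information being obtained from first historical network behavior data.
In the model training method provided by the embodiments of the present application, a communication server is deployed in the communication system, and the communication server can obtain the first historical network behavior data from the communication system. After the communication server obtains the first historical network behavior data, the target industry server can obtain the first historical network behavior data directly from the communication server. That is, in some exemplary implementations, the network information includes the first historical network behavior data.
Alternatively, when the first historical network behavior data contains private data, large-volume data, or data that for other reasons is inconvenient to provide directly to the target industry server, the communication server may acquire the first historical network behavior data, encrypt it, and send it to the target industry server; or the target industry server may acquire one part of the first historical network behavior data from the communication server, while the other part is used for training on the communication server and the training result is provided to the target industry server. That is, in some exemplary implementations, the first historical network behavior data includes second historical network behavior data and third historical network behavior data, and the network information includes the second historical network behavior data and a first training result obtained by model training on the third historical network behavior data.
Alternatively, when the first historical network behavior data as a whole is private data, large-volume data, or data that for other reasons is inconvenient to provide directly to the target industry server, the communication server may acquire the first historical network behavior data, encrypt it, and send it to the target industry server; or the first historical network behavior data is used for training on the communication server and the training result is provided to the target industry server. That is, in some exemplary implementations, the network information includes a second training result obtained by model training on the first historical network behavior data.
In the model training method provided by the embodiments of the present application, the second historical network behavior data includes data that can be provided directly to the target industry server, and the third historical network behavior data includes data that for some reason is inconvenient to provide to the target industry server. For example, in some exemplary implementations, the second historical network behavior data includes the non-private data in the first historical network behavior data, and the third historical network behavior data includes the private data in the first historical network behavior data. As another example, in some exemplary implementations, the second historical network behavior data includes the data in the first historical network behavior data whose data volume is greater than or equal to a preset threshold, and the third historical network behavior data includes the data in the first historical network behavior data whose data volume is less than the preset threshold.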
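The privacy-based split of the first historical network behavior data can be pictured with a minimal Python sketch. The field names and the choice of which fields count as private are assumptions made for illustration only; the application does not define a record schema or a privacy rule.

```python
from typing import Dict, List, Tuple

# Hypothetical: which fields are treated as private is an assumption,
# not something the application specifies.
PRIVATE_FIELDS = {"url", "app_type"}


def split_record(record: Dict[str, object]) -> Tuple[Dict[str, object], Dict[str, object]]:
    """Split one record into (shareable part, private part)."""
    shareable = {k: v for k, v in record.items() if k not in PRIVATE_FIELDS}
    private = {k: v for k, v in record.items() if k in PRIVATE_FIELDS}
    return shareable, private


def split_dataset(records: List[Dict[str, object]]):
    """Return (second, third) data sets; rows stay aligned by list index."""
    second, third = [], []
    for record in records:
        shareable, private = split_record(record)
        second.append(shareable)   # may be sent to the target industry server
        third.append(private)      # stays on the communication server
    return second, third
```

Keeping the two lists aligned by index is one simple way to let a training result derived from the private part be matched back to the shareable part later.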
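In the model training method provided by the embodiments of the present application, the network information can be acquired from the communication server, and the communication server can obtain the first historical network behavior data from an authentication, authorization, and accounting (AAA, Authentication Authorization Accounting) authentication server and a deep packet inspection (DPI, Deep Packet Inspection) device. For example, the communication server obtains user identity information and user private network address information from the AAA authentication server, and collects the first historical network behavior data of the corresponding user from the DPI device according to the user private network address information.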
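A minimal sketch of this collection step, assuming the AAA records and DPI records are already available as plain dictionaries. The keys `user_id` and `private_ip` are hypothetical, and the sketch ignores the fact that a private address may be reassigned to different users over time.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def collect_behavior(aaa_sessions: Iterable[dict], dpi_records: Iterable[dict]) -> Dict[str, List[dict]]:
    """Attach DPI flow records to users via the private address from AAA.

    aaa_sessions: dicts with hypothetical keys 'user_id' and 'private_ip'.
    dpi_records:  dicts with 'private_ip' plus arbitrary flow fields.
    """
    ip_to_user = {s["private_ip"]: s["user_id"] for s in aaa_sessions}
    per_user = defaultdict(list)
    for record in dpi_records:
        user = ip_to_user.get(record["private_ip"])
        if user is not None:  # drop flows whose address is unknown to AAA
            per_user[user].append(record)
    return dict(per_user)
```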
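In some exemplary implementations, the historical target industry data includes any one of the following:
historical power data, historical tap water data, historical natural gas data, historical advertising data, and historical express delivery data.
In some exemplary implementations, the historical target industry data may be acquired from a dedicated network of the target industry, collected manually, or obtained in any other way.
In the model training method provided by the embodiments of the present application, the first historical network behavior data and the historical target industry data may differ from industry to industry, and the data needed for model training can be determined from the actual application scenario of the industry. For example, for the power grid industry, if the total power consumption in a target area is to be predicted, the first historical network behavior data includes the historical network behavior data in the target area, and the historical target industry data includes the historical total power consumption in the target area. In this case, the communication server obtains the user identity information and user private network address information in the target area from the AAA authentication server, and collects the first historical network behavior data of the corresponding users from the DPI device according to the user private network address information.
If abnormal electricity consumption behavior of a user is to be detected, the first historical network behavior data includes the user's historical network behavior data, and the historical target industry data includes the user's historical power data. In this case, the communication server obtains the user identity information and user private network address information from the AAA authentication server, and collects the first historical network behavior data of the corresponding user from the DPI device according to the user private network address information. The historical power data may be historical electricity consumption.
In some exemplary implementations, the user identity information may include information in one-to-one correspondence with the user, such as the number of the mobile terminal, the International Mobile Equipment Identity (IMEI), and the International Mobile Subscriber Identity (IMSI).
In some exemplary implementations, the first historical network behavior data may include network behavior data from any one or more communication systems. The communication system may be, for example, a mobile communication system (such as a mobile terminal communication system, an Internet of Vehicles communication system, or another Internet of Things communication system) or a fixed network communication system (such as home wireless fidelity (WiFi, Wireless Fidelity), commercial WiFi, or an enterprise virtual private network (VPN, Virtual Private Network)). Different communication systems have different coverage. For example, mobile terminal communication systems and Internet of Vehicles communication systems cover residential areas, commercial areas, and enterprise campuses. Different communication systems also have different characteristics. For example, in a mobile communication system, the Internet Protocol (IP) address of a terminal is usually bound to one specific user, is mobile, and shows particular characteristics in different areas and time periods; in a fixed network communication system, one IP address usually carries the data of multiple users. As networks develop, facilities in homes, businesses, and enterprises are gradually becoming intelligent, and user data and device data contain a wide range of feature information.
In some exemplary implementations, the first historical network behavior data may include historical data-flow five-tuples, such as time, traffic volume, packet count, uniform resource locator (URL), and application type.
In some exemplary implementations, the first historical network behavior data may also include data obtained by processing the data-flow five-tuples, for example feature data extracted from the data-flow five-tuples. For a model that predicts the total power consumption in a target area, the feature data may include at least one of: the number of users in the target area, the pattern of change in the number of users in the target area, and the behavior cycles of users in the target area (such as sleep cycle, leisure cycle, housework cycle, and work cycle). For a model that detects abnormal electricity consumption behavior of a user, the feature data may include the inherent attribute of the user's location, the predicted attribute of the user's location, the actual number of users corresponding to the user, and device operating characteristics.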
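As an illustration of extracting area-level feature data from flow records, the following Python sketch computes the number of users in a target area and an hourly activity profile. The record keys `user_id` and `hour` are hypothetical, and the two features are only examples of the kinds of features named in the text, not the application's own feature definitions.

```python
from collections import defaultdict
from typing import Dict, Iterable


def area_features(records: Iterable[dict]) -> Dict[str, object]:
    """Derive simple area-level features from flow records.

    Each record is a dict with hypothetical keys 'user_id' and 'hour' (0-23).
    """
    users_by_hour = defaultdict(set)
    for record in records:
        users_by_hour[record["hour"]].add(record["user_id"])
    # Distinct active users per hour: a rough proxy for the users' daily cycle.
    hourly_counts = [len(users_by_hour.get(h, ())) for h in range(24)]
    all_users = set().union(*users_by_hour.values()) if users_by_hour else set()
    return {"user_count": len(all_users), "hourly_active_users": hourly_counts}
```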
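Regarding the number of users in the target area: because one IP address in a fixed network communication system corresponds to multiple users, the number of users behind each IP address in the fixed network communication system needs to be estimated. With home WiFi and enterprise WiFi, for example, multiple users share one IP address to access the Internet. Mobile terminals usually correspond one-to-one with users, but because mobile terminals move, the number of users in the target area can be predicted from the mobility of the mobile terminals.
The inherent attribute of the user's location refers to whether the location is, by its original designation, a work area or a residential area. The predicted attribute of the user's location refers to whether the location is a work area or a residential area as predicted by analysis of the network behavior data. For example, if a user regularly goes to a residence registered in his name in some residential community during daytime working hours but leaves in the evening, the inherent attribute of that residence is residential area and its predicted attribute is work area.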
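The residence example can be sketched as a simple rule in Python. The daytime window (09:00 to 18:00) and the majority rule are assumptions chosen for illustration; the application does not state how the predicted attribute is derived from network behavior data.

```python
from typing import Iterable, Optional


def predicted_attribute(presence_hours: Iterable[int],
                        day_start: int = 9, day_end: int = 18) -> Optional[str]:
    """Guess 'work' or 'residence' from the hours (0-23) a user is seen at a location.

    The daytime window and the majority rule are illustrative assumptions.
    """
    hours = list(presence_hours)
    if not hours:
        return None
    daytime = sum(1 for h in hours if day_start <= h < day_end)
    return "work" if daytime * 2 > len(hours) else "residence"


def attribute_mismatch(inherent: str, presence_hours: Iterable[int]) -> bool:
    """True when the predicted attribute contradicts the inherent one."""
    predicted = predicted_attribute(presence_hours)
    return predicted is not None and predicted != inherent
```

A mismatch between the two attributes is the kind of signal that could then be supplied to the abnormal electricity consumption behavior detection model as a feature.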
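The device operating characteristics may be the device type, habitual usage periods, power consumption, and so on.
In some exemplary implementations, target areas may be divided according to the characteristics of users in different areas, for example by grouping communities with the same daily routine into the same target area. A large enterprise campus, for instance, is mainly used by office workers during the day, so its power demand is high in the daytime and low at night. In a large residential district where many people leave 1 to 2 hours earlier and return 1 to 2 hours later than people in other areas, the power demand differs from that of the other areas, so the district needs to be treated as a separate target area with its own model training.
Step 101: perform model training according to the network information and the historical target industry data to obtain a data processing model.
In the model training method provided by the embodiments of the present application, all of the source data may be acquired and used for model training on the target industry server; or part of the source data may be acquired and used for model training on the target industry server while the other part is used for model training on the communication server; or no source data is acquired, and the training result obtained by the communication server from the source data is acquired directly.
In some exemplary implementations, for the case where all of the source data is acquired and used for model training on the target industry server, the network information includes the first historical network behavior data, and step 101 above includes: determining a first training sample according to the first historical network behavior data and the historical target industry data, and performing model training according to the first training sample to obtain the data processing model.
In some exemplary implementations, the first training sample may be updated according to user characteristics input by the user.
In some exemplary implementations, for the case where part of the source data is acquired and used for model training on the target industry server while the other part is used for model training on the communication server, the first historical network behavior data includes the second historical network behavior data and the third historical network behavior data, the network information includes the second historical network behavior data and the first training result obtained by model training on the third historical network behavior data, and step 101 above includes: determining a second training sample according to the second historical network behavior data and the historical target industry data, and performing model training according to the second training sample and the first training result to obtain the data processing model. For example, a federated learning method, such as vertical federated learning, may be used for model training across the communication server and the target industry server.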
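The division of work in this partial-sharing case can be pictured with the Python sketch below. It is deliberately not a federated learning protocol: a fitted standardiser stands in for whatever model the communication server trains, and the function names are hypothetical. The only point it shows is that the private values stay on the communication server while a derived result is combined with the shareable features on the target industry server.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def private_side_result(private_values: Sequence[float]) -> List[float]:
    """Communication server: fit a standardiser on one private feature and
    export only the standardised scores (a stand-in for the first training result)."""
    mu = mean(private_values)
    sigma = pstdev(private_values) or 1.0  # avoid dividing by zero on constant data
    return [(v - mu) / sigma for v in private_values]


def build_training_rows(shareable_rows: Sequence[Sequence[float]],
                        first_result: Sequence[float],
                        industry_targets: Sequence[float]) -> List[Tuple[list, float]]:
    """Target industry server: join shareable features, the exported scores,
    and the historical target industry data into (features, target) rows."""
    if not (len(shareable_rows) == len(first_result) == len(industry_targets)):
        raise ValueError("inputs must be aligned sample by sample")
    return [
        (list(row) + [score], target)
        for row, score, target in zip(shareable_rows, first_result, industry_targets)
    ]
```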
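In some exemplary implementations, the second training sample may be updated according to user characteristics input by the user.
In some exemplary implementations, for the case where no source data is acquired and the training result obtained by the communication server from the source data is acquired directly, the network information includes the second training result obtained by model training on the first historical network behavior data, and step 101 above includes: performing model training according to the second training result and the historical target industry data to obtain the data processing model. For example, a federated learning method, such as vertical federated learning, may be used for model training across the communication server and the target industry server.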
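The three forms that the network information can take (raw data only, raw data plus a training result, or a training result only) can be modelled as one small data structure. The class and field names in this Python sketch are hypothetical and serve only to make the three cases explicit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NetworkInfo:
    """Hypothetical container for what the communication server sends."""
    raw_data: Optional[list] = None         # first or second historical network behavior data
    training_result: Optional[list] = None  # first or second training result

    def variant(self) -> str:
        """Name which of the three training arrangements this payload supports."""
        if self.raw_data is not None and self.training_result is None:
            return "raw"          # all training on the target industry server
        if self.raw_data is not None and self.training_result is not None:
            return "partial"      # second data + first training result
        if self.training_result is not None:
            return "result_only"  # second training result only
        raise ValueError("empty network information")
```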
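In the model training method provided by the embodiments of the present application, the data processing model differs from industry to industry, and the method places no limitation on the data processing model. For example, for the power grid industry, the data processing model may be a total power consumption prediction model, an abnormal electricity consumption behavior detection model, or a power supply demand prediction model for charging stations. For the tap water industry, the data processing model may be a total water consumption prediction model. When the data processing model is a total power consumption prediction model, model training uses the network information as the input of the data processing model and the historical total power consumption in the target area as its output; that is, in some implementations, the first historical network behavior data includes the historical network behavior data in the target area, the historical target industry data includes the historical total power consumption in the target area, and the data processing model includes the total power consumption prediction model. When the data processing model is an abnormal electricity consumption behavior detection model, model training uses the network information and the historical electricity consumption as the input of the data processing model and whether the user exhibits abnormal electricity consumption behavior as its output; that is, in some implementations, the first historical network behavior data includes the user's historical network behavior data, the historical target industry data includes the user's historical power data, and the data processing model includes the abnormal electricity consumption behavior detection model.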
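The input/output pairing for the total power consumption prediction model can be illustrated with a one-feature least-squares fit in Python. Ordinary linear regression is used here purely as a stand-in; the application leaves the model type open. The single input feature (number of active users in the target area) is an assumption.

```python
from typing import Sequence, Tuple


def fit_linear(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = slope * x + intercept.

    xs: a network-derived feature per period (assumed: active users in the area).
    ys: historical total power consumption for the same periods.
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two aligned samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature has no variation")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def predict(model: Tuple[float, float], x: float) -> float:
    """Apply the fitted model to a new (for example real-time) feature value."""
    slope, intercept = model
    return slope * x + intercept
```

The same `predict` call corresponds to the inference step of the data processing method, where the feature value comes from real-time rather than historical network behavior data.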
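The model training method provided by the embodiments of the present application may use any method for model training, such as a machine learning algorithm, a neural network, or a long short-term memory network (LSTM, Long Short-Term Memory).
The model training method provided by the embodiments of the present application combines the historical network behavior data of the communication system with the historical target industry data for model training to obtain the data processing model, which greatly improves the accuracy and breadth of the target industry's data processing model. For example, the power industry in the related art has only power data and lacks users' network behavior data, so the feature data available for model training is limited; the model training method provided by the embodiments of the present application brings users' network behavior data into model training, which adds dimensions to the feature data used for training and effectively improves accuracy.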
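FIG. 2 is a flowchart of the data processing method provided by an embodiment of the present application.
In a second aspect, referring to FIG. 2, an embodiment of the present application provides a data processing method, applied to a target industry server, the data processing method including steps 200 and 201.
Step 200: acquire real-time network behavior data.
In some exemplary implementations, real-time network behavior data and real-time target industry data are acquired.
In the data processing method provided by the embodiments of the present application, a communication server is deployed in the communication system, and the communication server can obtain the real-time network behavior data from the communication system. After the communication server obtains the real-time network behavior data, the target industry server can obtain the real-time network behavior data directly from the communication server.
In the data processing method provided by the embodiments of the present application, the real-time network behavior data can be acquired from the communication server, and the communication server can obtain the real-time network behavior data from the AAA authentication server and the DPI device. For example, the communication server obtains user identity information and user private network address information from the AAA authentication server, and collects the real-time network behavior data of the corresponding user from the DPI device according to the user private network address information.
In some exemplary implementations, the real-time target industry data includes any one of the following:
real-time power data, real-time tap water data, real-time natural gas data, real-time advertising data, and real-time express delivery data.
In some exemplary implementations, the real-time target industry data may be acquired from a dedicated network of the target industry, collected manually, or obtained in any other way.
In the data processing method provided by the embodiments of the present application, the real-time network behavior data and the real-time target industry data may differ from industry to industry, and the data needed for processing can be determined from the actual application scenario of the industry. For example, for the power grid industry, if the total power consumption in a target area is to be predicted, the real-time network behavior data includes the real-time network behavior data in the target area. In this case, the communication server obtains the user identity information and user private network address information in the target area from the AAA authentication server, and collects the real-time network behavior data of the corresponding users from the DPI device according to the user private network address information.
If abnormal electricity consumption behavior of a user is to be detected, the real-time network behavior data includes the user's real-time network behavior data, and the real-time target industry data includes the user's real-time power data. In this case, the communication server obtains the user identity information and user private network address information from the AAA authentication server, and collects the real-time network behavior data of the corresponding user from the DPI device according to the user private network address information. The real-time power data may be real-time electricity consumption.
In some exemplary implementations, the user identity information may include information in one-to-one correspondence with the user, such as the number of the mobile terminal, the IMEI, and the IMSI.
In some exemplary implementations, the real-time network behavior data may include network behavior data from any one or more communication systems. The communication system may be, for example, a mobile communication system (such as a mobile terminal communication system, an Internet of Vehicles communication system, or another Internet of Things communication system) or a fixed network communication system (such as home WiFi, commercial WiFi, or an enterprise VPN). Different communication systems have different coverage. For example, mobile terminal communication systems and Internet of Vehicles communication systems cover residential areas, commercial areas, and enterprise campuses. Different communication systems also have different characteristics. For example, in a mobile communication system, the IP address of a terminal is usually bound to one specific user, is mobile, and shows particular characteristics in different areas and time periods; in a fixed network communication system, one IP address usually carries the data of multiple users. As networks develop, facilities in homes, businesses, and enterprises are gradually becoming intelligent, and user data and device data contain a wide range of feature information.
In some exemplary implementations, the real-time network behavior data may include data-flow five-tuples, such as time, traffic volume, packet count, URL, and application type.
In some exemplary implementations, the real-time network behavior data may also include data obtained by processing the data-flow five-tuples, for example feature data extracted from the data-flow five-tuples. For a model that predicts the total power consumption in a target area, the feature data may include at least one of: the number of users in the target area, the pattern of change in the number of users in the target area, and the behavior cycles of users in the target area (such as sleep cycle, leisure cycle, housework cycle, and work cycle). For a model that detects abnormal electricity consumption behavior of a user, the feature data may include the inherent attribute of the user's location, the predicted attribute of the user's location, the actual number of users corresponding to the user, and device operating characteristics.
Regarding the number of users in the target area: because one IP address in a fixed network communication system corresponds to multiple users, the number of users behind each IP address in the fixed network communication system needs to be estimated. With home WiFi and enterprise WiFi, for example, multiple users share one IP address to access the Internet. Mobile terminals usually correspond one-to-one with users, but because mobile terminals move, the number of users in the target area can be predicted from the mobility of the mobile terminals.
The inherent attribute of the user's location refers to whether the location is, by its original designation, a work area or a residential area. The predicted attribute of the user's location refers to whether the location is a work area or a residential area as predicted by analysis of the network behavior data. For example, if a user regularly goes to a residence registered in his name in some residential community during daytime working hours but leaves in the evening, the inherent attribute of that residence is residential area and its predicted attribute is work area.
The device operating characteristics may be the device type, habitual usage periods, power consumption, and so on.
In some exemplary implementations, target areas may be divided according to the characteristics of users in different areas, for example by grouping communities with the same daily routine into the same target area. A large enterprise campus, for instance, is mainly used by office workers during the day, so its power demand is high in the daytime and low at night. In a large residential district where many people leave 1 to 2 hours earlier and return 1 to 2 hours later than people in other areas, the power demand differs from that of the other areas, so the district needs to be treated as a separate target area with its own model training.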
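Step 201: process the real-time network behavior data with the data processing model trained by the above model training method to obtain a processing result.
In the data processing method provided by the embodiments of the present application, the data processing model differs from industry to industry, and the embodiments of the present application place no limitation on the data processing model. For example, for the power grid industry, the data processing model may be a total power consumption prediction model or an abnormal electricity consumption behavior detection model. For the tap water industry, the data processing model may be a total water consumption prediction model.
In some exemplary implementations, when the real-time network behavior data includes the real-time network behavior data in the target area and the data processing model includes the total power consumption prediction model, step 201 above includes: using the total power consumption prediction model to predict the total power consumption in the target area from the real-time network behavior data.
In some exemplary implementations, the real-time network behavior data and the real-time target industry data are processed with the data processing model trained by the above model training method to obtain a processing result.
In some exemplary implementations, when the real-time network behavior data includes the user's real-time network behavior data, the real-time target industry data includes the user's real-time target industry data, and the data processing model includes the abnormal electricity consumption behavior detection model, step 201 above includes: using the abnormal electricity consumption behavior detection model to determine, from the user's real-time network behavior data and the user's real-time target industry data, whether the user exhibits abnormal electricity consumption behavior.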
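To show how network behavior data and power data can be combined at inference time, the Python sketch below uses a fixed rule in place of the trained abnormal electricity consumption behavior detection model. The per-user expected consumption and the tolerance factor are invented parameters for illustration, not values from the application.

```python
def is_abnormal(active_users: int, consumption_kwh: float,
                expected_kwh_per_user: float = 0.5, tolerance: float = 3.0) -> bool:
    """Flag consumption far above what the observed occupancy would suggest.

    active_users: users inferred from real-time network behavior data.
    consumption_kwh: real-time electricity consumption for the same period.
    The two keyword parameters are illustrative assumptions.
    """
    expected = max(active_users, 0) * expected_kwh_per_user
    # Keep a non-zero baseline so an empty home is still compared against
    # a small standby load rather than zero.
    baseline = max(expected, expected_kwh_per_user)
    return consumption_kwh > tolerance * baseline
```

A rule of this kind relies on the combination the text describes: consumption alone may look normal, while consumption relative to the occupancy implied by network behavior does not.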
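In some exemplary implementations, the data processing method further includes: after the real-time network behavior data and the real-time target industry data are processed to obtain the processing result, performing intelligent control according to the processing result.
In some exemplary implementations, when the data processing model is the total power consumption prediction model, performing intelligent control according to the processing result includes: performing scheduling control over the power delivered to the target area according to the predicted total power consumption in the target area.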
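One possible reading of scheduling control is sketched below in Python: each target area receives its predicted demand when supply is sufficient, and demand is scaled down proportionally otherwise. The proportional rule is an assumption; the application does not specify a scheduling policy.

```python
from typing import Dict


def schedule_power(predicted: Dict[str, float], available: float) -> Dict[str, float]:
    """Allocate available power across target areas.

    predicted: predicted total power consumption per target area.
    available: total power that can be delivered in the period.
    Proportional curtailment is an illustrative policy, not the application's.
    """
    total = sum(predicted.values())
    if total == 0 or total <= available:
        return dict(predicted)  # every area gets what it is predicted to need
    scale = available / total
    return {area: demand * scale for area, demand in predicted.items()}
```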
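The data processing method provided by the embodiments of the present application uses a data processing model to process real-time network behavior data and obtain a processing result, which greatly improves the accuracy with which target industry data is processed.
In a third aspect, an embodiment of the present application provides an electronic device, as shown in FIG. 4, including:
at least one processor 401 (only one is shown in FIG. 4); and
a memory 402 storing at least one computer program which, when executed by the at least one processor 401, implements the above model training method or the above data processing method.
The processor 401 is a device with data processing capability, including but not limited to a central processing unit (CPU); the memory 402 is a device with data storage capability, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH).
In some implementations, the processor 401 and the memory 402 are connected to each other through a bus, and are further connected to other components of the computing device.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above model training method or the above data processing method.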
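FIG. 3 is a block diagram of the data processing system provided by an embodiment of the present application.
In a fifth aspect, an embodiment of the present application provides a data processing system, including: a target industry server 301 and a communication server 302.
The target industry server 301 includes a network docking module 3011, a target industry data training module 3012, an inference module 3013, a business decision module 3014, a target industry intelligent control module 3015, a target industry data collection module 3016, and a front-end module 3017.
The functions of the modules are described below.
The target industry data collection module 3016 is configured to collect target industry data; the collected target industry data includes historical target industry data, or historical target industry data and real-time target industry data.
The network docking module 3011 is configured to communicate with the target industry docking module 3021 to obtain network information and real-time network behavior data.
The front-end module 3017 is configured to accept user characteristics entered as required by the actual application scenario, as a supplement to the first historical network behavior data. In some exemplary implementations, the front-end module 3017 is also configured to display to the user the intelligent control strategy determined by the business decision module 3014, to send an intelligent control strategy input by the user to the target industry intelligent control module 3015, or to send strategy adjustment information input by the user to the business decision module 3014.
The target industry data training module 3012 is configured to perform model training according to the network information and the historical target industry data to obtain a data processing model.
The inference module 3013 is configured to use the trained data processing model to process real-time network behavior data to obtain a processing result, or to process real-time network behavior data and real-time target industry data to obtain a processing result.
The business decision module 3014 is configured to determine an intelligent control strategy according to the processing result. In some exemplary implementations, the business decision module 3014 is further configured to adjust the intelligent control strategy according to the strategy adjustment information input by the user.
The target industry intelligent control module 3015 is configured to perform intelligent control according to the intelligent control strategy determined by the business decision module 3014, or according to the intelligent control strategy input by the user.
The communication server 302 includes a target industry docking module 3021, a network data collection module 3022, and a network data training module 3023.
The functions of the modules are described below.
The network data collection module 3022 is configured to obtain user identity information and user private network address information from the AAA authentication server, and to collect the first historical network behavior data of the corresponding user from the DPI device according to the user private network address information.
The network data training module 3023 is configured to perform model training according to the third historical network behavior data to obtain the first training result and send the first training result to the target industry docking module 3021; or to perform model training according to the first historical network behavior data to obtain the second training result and send the second training result to the target industry docking module 3021.
The target industry docking module 3021 is configured to send the network information to the network docking module 3011. The network information is the first historical network behavior data; or the network information includes the first training result, obtained by model training on the third historical network behavior data in the first historical network behavior data, together with the second historical network behavior data in the first historical network behavior data; or the network information includes the second training result obtained by model training on the first historical network behavior data.
The specific implementation of each of the above modules is the same as that of the model training method and the data processing method described above, and is not repeated here.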
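Those of ordinary skill in the art will understand that all or some of the steps in the methods disclosed above, and the functional modules/units in the systems and devices, can be implemented as software, firmware, hardware, and appropriate combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be executed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor (such as a central processing unit, digital signal processor, or microprocessor), or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cartridges, magnetic tape, magnetic disk storage or other magnetic storage, or any other medium that can be used to store the desired information and can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and are to be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, as will be apparent to those skilled in the art, features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics, and/or elements described in connection with other embodiments, unless expressly stated otherwise. Accordingly, those skilled in the art will understand that various changes in form and detail may be made without departing from the scope of the present application as set forth in the appended claims.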

Claims (16)

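  1. A model training method, comprising:
    acquiring network information and historical target industry data, wherein the network information is obtained from first historical network behavior data; and
    performing model training according to the network information and the historical target industry data to obtain a data processing model.
  2. The model training method according to claim 1, wherein the network information comprises the first historical network behavior data; and the performing model training according to the network information and the historical target industry data to obtain a data processing model comprises:
    determining a first training sample according to the first historical network behavior data and the historical target industry data, and performing model training according to the first training sample to obtain the data processing model.
  3. The model training method according to claim 1, wherein the first historical network behavior data comprises second historical network behavior data and third historical network behavior data, and the network information comprises the second historical network behavior data and a first training result obtained by model training on the third historical network behavior data;
    the performing model training according to the network information and the historical target industry data to obtain a data processing model comprises:
    determining a second training sample according to the second historical network behavior data and the historical target industry data, and performing model training according to the second training sample and the first training result to obtain the data processing model.
  4. The model training method according to claim 3, wherein the second historical network behavior data comprises non-private data in the first historical network behavior data, and the third historical network behavior data comprises private data in the first historical network behavior data.
  5. The model training method according to claim 1, wherein the network information comprises a second training result obtained by model training on the first historical network behavior data;
    the performing model training according to the network information and the historical target industry data to obtain a data processing model comprises:
    performing model training according to the second training result and the historical target industry data to obtain the data processing model.
  6. The model training method according to claim 1, wherein the historical target industry data comprises any one of the following:
    historical power data, historical tap water data, historical natural gas data, historical advertising data, and historical express delivery data.
  7. The model training method according to any one of claims 1 to 6, wherein the first historical network behavior data comprises historical network behavior data in a target area, the historical target industry data comprises historical total power consumption in the target area, and the data processing model comprises a total power consumption prediction model.
  8. The model training method according to any one of claims 1 to 6, wherein the first historical network behavior data comprises historical network behavior data of a user, the historical target industry data comprises historical power data of the user, and the data processing model comprises an abnormal electricity consumption behavior detection model.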
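  9. A data processing method, comprising:
    acquiring real-time network behavior data; and
    processing the real-time network behavior data with a data processing model trained by the model training method according to any one of claims 1 to 8 to obtain a processing result.
  10. The data processing method according to claim 9, further comprising: after the processing the real-time network behavior data to obtain a processing result, performing intelligent control according to the processing result.
  11. The data processing method according to claim 10, wherein the real-time network behavior data comprises real-time network behavior data in a target area, and the data processing model comprises a total power consumption prediction model;
    the processing the real-time network behavior data with a data processing model trained by the model training method according to any one of claims 1 to 8 to obtain a processing result comprises: using the total power consumption prediction model to predict the total power consumption in the target area according to the real-time network behavior data in the target area;
    the performing intelligent control according to the processing result comprises: performing scheduling control over the power delivered to the target area according to the predicted total power consumption in the target area.
  12. The data processing method according to claim 9, further comprising: before the processing the real-time network behavior data to obtain a processing result, acquiring real-time target industry data;
    the processing the real-time network behavior data to obtain a processing result comprises: processing the real-time network behavior data and the real-time target industry data to obtain the processing result.
  13. The data processing method according to claim 12, wherein the real-time network behavior data comprises real-time network behavior data of a user, the real-time target industry data comprises real-time target industry data of the user, and the data processing model comprises an abnormal electricity consumption behavior detection model;
    the processing the real-time network behavior data and the real-time target industry data with a data processing model trained by the model training method according to any one of claims 1 to 8 to obtain a processing result comprises: using the abnormal electricity consumption behavior detection model to determine, according to the real-time network behavior data of the user and the real-time target industry data of the user, whether the user exhibits abnormal electricity consumption behavior.
  14. The data processing method according to any one of claims 9 to 13, wherein the real-time target industry data comprises any one of the following:
    real-time power data, real-time tap water data, real-time natural gas data, real-time advertising data, and real-time express delivery data.
  15. An electronic device, comprising:
    at least one processor; and
    a memory storing at least one computer program which, when executed by the at least one processor, implements the model training method according to any one of claims 1 to 8 or the data processing method according to any one of claims 9 to 14.
  16. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the model training method according to any one of claims 1 to 8 or the data processing method according to any one of claims 9 to 14.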
PCT/CN2022/109443 2021-09-02 2022-08-01 Model training method, data processing method, electronic device, and computer-readable storage medium WO2023029853A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111030195.0A CN115759223A (zh) 2021-09-02 2021-09-02 Model training method, data processing method, electronic device, and readable storage medium
CN202111030195.0 2021-09-02

Publications (1)

Publication Number Publication Date
WO2023029853A1 true WO2023029853A1 (zh) 2023-03-09

Family

ID=85332897

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/109443 WO2023029853A1 (zh) 2021-09-02 2022-08-01 Model training method, data processing method, electronic device, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN115759223A (zh)
WO (1) WO2023029853A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107153887A (zh) * 2017-04-14 2017-09-12 华南理工大学 Mobile user behavior prediction method based on a convolutional neural network
CN111797858A (zh) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 Model training method, behavior prediction method, apparatus, storage medium, and device
US20210018347A1 (en) * 2019-07-17 2021-01-21 Exxonmobil Research And Engineering Company Intelligent system for identifying sensor drift
CN112801374A (zh) * 2021-01-29 2021-05-14 广东晨兴智能科技有限公司 Model training method, electricity load prediction method, apparatus, and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
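CN117041073A (zh) * 2023-09-05 2023-11-10 广州天懋信息系统股份有限公司 Network behavior prediction method, system, device, and storage medium
CN117041073B (zh) * 2023-09-05 2024-05-28 广州天懋信息系统股份有限公司 Network behavior prediction method, system, device, and storage medium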

Also Published As

Publication number Publication date
CN115759223A (zh) 2023-03-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22862994

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE