CN115759223A - Model training method, data processing method, electronic device and readable storage medium - Google Patents

Model training method, data processing method, electronic device and readable storage medium

Info

Publication number
CN115759223A
Authority
CN
China
Prior art keywords
data
historical
network behavior
real
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111030195.0A
Other languages
Chinese (zh)
Inventor
连超
江舟
赵军锋
张平荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN202111030195.0A (CN115759223A)
Priority to PCT/CN2022/109443 (WO2023029853A1)
Publication of CN115759223A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Operations Research (AREA)
  • Development Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Medical Informatics (AREA)
  • Quality & Reliability (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a model training method, a data processing method, an electronic device and a computer readable storage medium, wherein the model training method comprises the following steps: acquiring network information and historical target industry data; wherein the network information is obtained from first historical network behavior data; and performing model training according to the network information and the historical target industry data to obtain a data processing model. The data processing method comprises the following steps: acquiring real-time network behavior data; and processing the real-time network behavior data by adopting a data processing model obtained by training by any one of the model training methods to obtain a processing result.

Description

Model training method, data processing method, electronic device and readable storage medium
Technical Field
The embodiments of the present application relate to the technical field of communication, and in particular to a model training method, a data processing method, an electronic device and a computer-readable storage medium.
Background
With the rapid development of the economy, many industries have become considerably more intelligent. However, when these industries perform intelligent control or intelligent data analysis, they often rely only on the historical data of the industry itself. Because the feature dimension of that historical data is narrow, the accuracy of the intelligent control or intelligent data analysis is low.
Disclosure of Invention
The embodiments of the present application provide a model training method, a data processing method, an electronic device and a computer-readable storage medium.
In a first aspect, an embodiment of the present application provides a model training method, including: acquiring network information and historical target industry data; wherein the network information is obtained from first historical network behavior data; and performing model training according to the network information and the historical target industry data to obtain a data processing model.
In a second aspect, an embodiment of the present application provides a data processing method, including: acquiring real-time network behavior data; and processing the real-time network behavior data by adopting a data processing model obtained by training by any one of the model training methods to obtain a processing result.
In a third aspect, an embodiment of the present application provides an electronic device, including: at least one processor; a memory having at least one program stored thereon, the at least one program, when executed by the at least one processor, implementing any of the above-described model training methods, or any of the above-described data processing methods.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements any one of the above-mentioned model training methods, or any one of the above-mentioned data processing methods.
According to the model training method provided by the embodiments of the present application, the historical network behavior data of the communication system is combined with the historical target industry data for model training to obtain the data processing model, which improves the accuracy and breadth of the data processing model of the target industry. For example, the traditional power industry has only power data and lacks the network behavior data of users, so the feature data available for model training is limited to a narrow set of dimensions. The model training method of the embodiments of the present application additionally uses the network behavior data of users, which increases the dimensionality of the feature data used for model training and thereby improves accuracy.
According to the data processing method provided by the embodiment of the application, the data processing model is adopted to process the real-time network behavior data to obtain the processing result, and the processing accuracy of the target industry data is greatly improved.
Drawings
FIG. 1 is a flowchart of a model training method according to an embodiment of the present application;
FIG. 2 is a flow chart of a data processing method according to another embodiment of the present application;
FIG. 3 is a block diagram of a data processing system according to another embodiment of the present application.
Detailed Description
In order to make those skilled in the art better understand the technical solution of the present application, the model training method, the data processing method, the electronic device, and the computer readable storage medium provided in the present application are described in detail below with reference to the accompanying drawings.
Example embodiments will be described more fully hereinafter with reference to the accompanying drawings, but they may be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Where no conflict arises, the embodiments of the present application and the features of the embodiments can be combined with each other.
As used herein, the term "and/or" includes any and all combinations of at least one of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises", "comprising", "includes" and/or "made of ..." specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of at least one other feature, integer, step, operation, element, component, and/or group thereof.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present application and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
With the rapid development of the economy, many industries have become considerably more intelligent. For example, with the development of smart cities, the network intelligence of power transmission equipment has improved greatly. Traditional power transmission was coarse-grained: power could only be delivered continuously over a large area for long periods, and intelligent dynamic adjustment was not possible, whereas networked, intelligent power transmission equipment makes regional intelligent power supply allocation possible. As another example, traditional manual meter reading yields only the total electricity consumption over a period of time; the data is coarse, non-real-time and discontinuous, which makes abnormal electricity consumption behavior difficult to identify. With the development of the fifth generation mobile communication system (5G) and the Internet of Things, power operation enterprises have deployed automatic meter reading terminals on a large scale. These terminals record detailed power usage data of users, accurate to the day level or even to the hour or minute level. Power operation enterprises can perform big data analysis based on this power usage data to identify abnormal power consumption behaviors.
However, when these industries perform intelligent control or intelligent data analysis, they often rely only on the historical data of the industry itself. Because the feature dimension of that historical data is relatively narrow, the accuracy of the intelligent control or intelligent data analysis is relatively low. For example, with the rapid development of the economy, global power usage is increasing year by year and global environmental problems are becoming more serious. The key to solving the environmental problem lies in increasing supply and reducing consumption on the power operation side, or in seeking green new energy sources on the power generation side, such as nuclear power, wind power and hydropower, so as to raise the level of power utilization. Regional intelligent power supply allocation and user-level abnormal power consumption behavior identification are two key directions for power operation enterprises. Regional intelligent power supply allocation can achieve an effective energy-saving effect at the macroscopic level, but dynamic adjustment that relies only on historical power data has certain limitations. User-level abnormal power consumption behavior identification intervenes precisely in the abnormal power consumption behavior of individual users at the microscopic level; however, the common approach is to compare power consumption across different time periods and focuses mainly on sudden changes in consumption, so identification that relies only on the power data collected by meter terminals is severely limited.
Fig. 1 is a flowchart of a model training method according to an embodiment of the present application.
In a first aspect, referring to fig. 1, an embodiment of the present application provides a model training method applied to a target industry server, including:
step 100, acquiring network information and historical target industry data; wherein the network information is obtained from the first historical network behavior data.
In the embodiment of the application, a communication server is arranged in the communication system, and the communication server can obtain the first historical network behavior data from the communication system. After the communication server obtains the first historical network behavior data, the target industry server may directly obtain the first historical network behavior data from the communication server. That is, in some example embodiments, the network information is first historical network behavior data.
Alternatively, when the first historical network behavior data contains private data, large-volume data, or data that for other reasons is inconvenient to provide directly to the target industry server, the communication server may acquire the first historical network behavior data, encrypt it, and send the encrypted data to the target industry server. As a further alternative, the target industry server may obtain one part of the first historical network behavior data from the communication server, while the other part is trained on the communication server and only the training result is provided to the target industry server. That is, in other exemplary embodiments, the first historical network behavior data includes second historical network behavior data and third historical network behavior data, and the network information includes the second historical network behavior data and a first training result obtained by performing model training according to the third historical network behavior data.
Alternatively, when the first historical network behavior data as a whole is private data, large-volume data, or data that for other reasons is inconvenient to provide directly to the target industry server, the communication server may acquire the first historical network behavior data, encrypt it, and send the encrypted data to the target industry server; or the first historical network behavior data may be trained on the communication server, with only the training result provided to the target industry server. That is, in other exemplary embodiments, the network information is a second training result obtained by performing model training based on the first historical network behavior data.
In the embodiment of the application, the second historical network behavior data is data which can be directly provided for the target industry server, and the third historical network behavior data is data which is inconvenient to provide for the target industry server for some reason. For example, in some example embodiments, the second historical network behavior data is non-private data in the first historical network behavior data, and the third historical network behavior data is private data in the first historical network behavior data. For another example, in other exemplary embodiments, the second historical network behavior data is data in which the data amount in the first historical network behavior data is greater than or equal to a preset threshold, and the third historical network behavior data is data in which the data amount in the first historical network behavior data is less than the preset threshold.
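The privacy-based split described above can be illustrated with a minimal sketch. The per-record `is_private` flag and the function name are hypothetical; the source does not specify how private data is marked.

```python
from typing import Dict, List, Tuple


def split_network_behavior_data(records: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Partition first historical network behavior data into (second, third).

    `second` may be provided to the target industry server directly;
    `third` stays on the communication server and is only trained on there.
    The `is_private` key is an assumed marker, not part of the source.
    """
    second: List[Dict] = []
    third: List[Dict] = []
    for record in records:
        # Records without the flag are treated as non-private in this sketch.
        if record.get("is_private", False):
            third.append(record)
        else:
            second.append(record)
    return second, third
```

A threshold on data volume, as in the other example embodiment, could replace the flag test without changing the structure of the split.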
In this embodiment of the present application, the network information may be obtained from a communication server, and the communication server may obtain the first historical network behavior data from an Authentication, Authorization, and Accounting (AAA) server and a Deep Packet Inspection (DPI) device. Specifically, the communication server acquires user identity information and user private network address information from the AAA server, and acquires the first historical network behavior data of the corresponding user from the DPI device according to the user private network address information.
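As a rough illustration of this two-step lookup, the sketch below joins AAA records to DPI flow records on the private network address. The record layouts and the keys `user_id` and `private_ip` are assumptions made for the example; real AAA and DPI interfaces are not described in the source.

```python
from typing import Dict, List


def collect_network_behavior(aaa_records: List[Dict], dpi_records: List[Dict]) -> Dict[str, List[Dict]]:
    """Group DPI flow records by user identity via the private network address."""
    # Map private network address -> user identity, as reported by the AAA server.
    address_to_user = {r["private_ip"]: r["user_id"] for r in aaa_records}
    behavior: Dict[str, List[Dict]] = {}
    for flow in dpi_records:
        user_id = address_to_user.get(flow["private_ip"])
        if user_id is None:
            continue  # address not reported by the AAA server; skip the flow
        behavior.setdefault(user_id, []).append(flow)
    return behavior
```

Restricting `aaa_records` to users in a target area yields the area-level variant of the same lookup.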
In some exemplary embodiments, the historical target industry data includes any one of:
historical power data, historical tap water data, historical natural gas data, historical advertisement data and historical express data.
In some exemplary embodiments, the historical target industry data may be obtained from a private network corresponding to the target industry, may also be obtained by manual collection, or may also be obtained by any other manner.
In the embodiments of the present application, the first historical network behavior data and the historical target industry data may differ from industry to industry, and which data is required for model training can be determined according to the actual application scenario of the industry. For example, for the power grid industry, if the total power consumption in a target area needs to be predicted, the first historical network behavior data is the historical network behavior data in the target area, and the historical target industry data is the historical total power consumption in the target area. In this case, the communication server acquires the user identity information and the user private network address information in the target area from the AAA server, and acquires the first historical network behavior data of the corresponding users from the DPI device according to the user private network address information.
If abnormal electricity consumption behavior of a user needs to be detected, the first historical network behavior data is the historical network behavior data of the user, and the historical target industry data is the historical power data of the user, which may be the historical power consumption. In this case, the communication server acquires the user identity information and the user private network address information from the AAA server, and acquires the first historical network behavior data of the corresponding user from the DPI device according to the user private network address information.
In some exemplary embodiments, the user identity information may be information that corresponds to users one to one, such as the number of the mobile terminal, an International Mobile Equipment Identity (IMEI), an International Mobile Subscriber Identity (IMSI), and the like.
In some example embodiments, the first historical network behavior data may be network behavior data in any one or more communication systems. The communication system may be, for example, a mobile communication system (e.g., a mobile terminal communication system, an Internet of Vehicles communication system, or another Internet of Things communication system) or a fixed network communication system (e.g., home Wireless Fidelity (WiFi), business WiFi, or a Virtual Private Network (VPN)). Different communication systems have different coverage areas. For example, mobile terminal communication systems and Internet of Vehicles communication systems cover residential areas, business areas, enterprise parks and the like. Different communication systems also have different characteristics. For example, in a mobile communication system, the Internet Protocol (IP) address of a terminal is often bound to a specific user, is mobile, and shows particular characteristics in different areas and time periods. In a fixed network communication system, by contrast, one IP address usually carries the data of a plurality of users; as the various facilities of families, businesses and enterprises become increasingly intelligent, the user data and device data contain a wide range of feature information.
In some example embodiments, the first historical network behavior data may be a historical data flow five-tuple, such as time, traffic, number of packets, Uniform Resource Locator (URL), application type, and the like.
In other exemplary embodiments, the first historical network behavior data may also be data obtained by processing the data flow five-tuple, for example feature data extracted from the data flow five-tuple. For a prediction model of the total power consumption in a target area, the feature data may be at least one of the number of users in the target area, the change pattern of the number of users in the target area, and the behavior periods (e.g., a sleep period, a leisure period, a housework period, and a work period) of the users in the target area. For a user abnormal electricity consumption behavior detection model, the feature data may be the inherent attribute of the location of the user, the predicted attribute of the location of the user, the actual number of users corresponding to the user, and the device operation characteristics.
Regarding the number of users: since one IP address in a fixed network communication system corresponds to a plurality of users, the number of users behind one IP address in the fixed network communication system needs to be estimated. For example, with home WiFi and enterprise WiFi, multiple users share one IP address for Internet access. A mobile terminal usually corresponds one to one with a user, and because the mobile terminal moves, the number of users in the target area can be predicted from its movement characteristics.
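One of the features named above, the number of users in the target area over time, could be derived from flow records roughly as follows. This is a minimal sketch: the `time` (ISO 8601 string) and `user_id` fields are assumed record keys, and counting distinct users per hour of day is only one possible definition of the feature.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List


def hourly_user_counts(flows: List[Dict]) -> Dict[int, int]:
    """Count distinct users observed in each hour of the day (0-23)."""
    users_by_hour = defaultdict(set)
    for flow in flows:
        hour = datetime.fromisoformat(flow["time"]).hour
        users_by_hour[hour].add(flow["user_id"])
    # Return hours in ascending order; hours with no traffic are omitted.
    return {hour: len(users) for hour, users in sorted(users_by_hour.items())}
```

The resulting per-hour counts also expose the change pattern of the user count, and their peaks and troughs hint at behavior periods such as sleep or work.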
The inherent attribute of the user location indicates whether the location is registered as a work area or a residential area. The predicted attribute of the user location indicates whether the location is a work area or a residential area according to analysis and prediction based on the network behavior data. For example, if a user stays during the day in a house in a residential area without leaving, the inherent attribute of the house is residential area while its predicted attribute is work area.
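A minimal sketch of how a predicted attribute might be derived from network activity is shown below. The rule (share of activity falling within assumed working hours) and the names `WORK_HOURS` and `predict_location_attribute` are illustrative assumptions; the source does not state the prediction method.

```python
from typing import List

WORK_HOURS = range(9, 18)  # assumed working period, 09:00 to 17:59


def predict_location_attribute(active_hours: List[int], threshold: float = 0.5) -> str:
    """Return "work" if most observed activity falls in working hours, else "residential".

    `active_hours` lists the hour of day (0-23) of each observed network activity
    at the location.
    """
    if not active_hours:
        raise ValueError("no activity observed for this location")
    daytime = sum(1 for hour in active_hours if hour in WORK_HOURS)
    return "work" if daytime / len(active_hours) > threshold else "residential"
```

Comparing the returned value with the inherent attribute flags locations such as a home that is used as a workplace.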
The device operation characteristics may be a device type, a habitual use period, power consumption, and the like.
In some exemplary embodiments, the target area may be divided according to the characteristics of users in different areas; for example, small areas whose users have the same work and rest habits are assigned to the same target area. A large enterprise campus, for instance, is dominated by white-collar daytime work and therefore has a large power demand during the day and a small power demand at night. If the residents of a large residential community in a district leave home 1-2 hours earlier and return 1-2 hours later than people in other areas, the power demand of that community differs from that of other areas, so it needs to be assigned to a different target area and model training is performed for each target area separately.
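The grouping idea can be sketched as follows, assuming each small area has already been summarised by a typical (leave hour, return hour) pair. That summary and the exact-match grouping rule are assumptions for illustration; a real system would more likely cluster similar habits.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_into_target_areas(habits: Dict[str, Tuple[int, int]]) -> List[List[str]]:
    """Group small areas whose residents share the same (leave_hour, return_hour)."""
    groups = defaultdict(list)
    for area, habit in habits.items():
        groups[habit].append(area)
    # Each inner list is one target area, to be trained with its own model.
    return [sorted(areas) for _, areas in sorted(groups.items())]
```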
Step 101, performing model training according to network information and historical target industry data to obtain a data processing model.
In the embodiments of the present application, there are three options: all source data may be acquired and model training performed in the target industry server; part of the source data may be acquired for model training in the target industry server while the other part is used for model training in the communication server; or the training result obtained by model training on the source data in the communication server may be acquired directly, without acquiring the source data itself.
In some exemplary embodiments, for the case where all source data is acquired and model training is performed in the target industry server, the network information is the first historical network behavior data, and performing model training according to the network information and the historical target industry data to obtain the data processing model includes: determining a first training sample according to the first historical network behavior data and the historical target industry data, and performing model training according to the first training sample to obtain the data processing model.
In some exemplary embodiments, the first training sample may be updated according to user characteristics input by the user.
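One plausible way to determine the first training sample is to align network behavior features with industry data on a shared key such as a time slot or a user identifier. The sketch below assumes both inputs are dictionaries keyed that way; the key format and function name are hypothetical.

```python
from typing import Dict, List, Tuple


def build_training_samples(
    network_features: Dict[str, List[float]],
    industry_targets: Dict[str, float],
) -> List[Tuple[List[float], float]]:
    """Pair feature vectors with targets on the keys present in both inputs."""
    samples = []
    # Keys missing from either side are dropped rather than imputed.
    for key in sorted(network_features.keys() & industry_targets.keys()):
        samples.append((network_features[key], industry_targets[key]))
    return samples
```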
In further exemplary embodiments, for the case where part of the source data is acquired for model training in the target industry server and the other part is used for model training in the communication server, the first historical network behavior data includes second historical network behavior data and third historical network behavior data, and the network information includes the second historical network behavior data and a first training result obtained by performing model training according to the third historical network behavior data. Performing model training according to the network information and the historical target industry data to obtain the data processing model then includes: determining a second training sample according to the second historical network behavior data and the historical target industry data, and performing model training according to the second training sample and the first training result to obtain the data processing model. Specifically, a federated learning method, such as a vertical federated learning method, can be used for model training in the communication server and the target industry server.
In some exemplary embodiments, the second training sample may be updated according to user characteristics input by the user.
In other exemplary embodiments, for the case where the training result obtained by model training on the source data in the communication server is acquired directly, without acquiring the source data, the network information is a second training result obtained by performing model training according to the first historical network behavior data, and performing model training according to the network information and the historical target industry data to obtain the data processing model includes: performing model training according to the second training result and the historical target industry data to obtain the data processing model. Specifically, a federated learning method, such as a vertical federated learning method, can be used for model training in the communication server and the target industry server.
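The vertical federated learning mentioned above can be illustrated with a deliberately simplified sketch: two parties hold different feature columns for the same users and jointly fit a logistic regression by exchanging only partial scores and residuals. Everything here is an assumption for illustration, including the class and function names. In particular, the sketch omits the encryption and secure aggregation that practical vertical federated learning uses to protect the exchanged values.

```python
import math
from typing import List


class Party:
    """Holds one vertical slice of the features and the weights for that slice."""

    def __init__(self, features: List[List[float]]):
        self.features = features
        self.weights = [0.0] * len(features[0])

    def partial_scores(self) -> List[float]:
        # Linear score computed from this party's own features only.
        return [sum(w * x for w, x in zip(self.weights, row)) for row in self.features]

    def update(self, residuals: List[float], lr: float) -> None:
        # Gradient step of the logistic loss with respect to this party's weights.
        n = len(residuals)
        for j in range(len(self.weights)):
            grad = sum(r * row[j] for r, row in zip(residuals, self.features)) / n
            self.weights[j] -= lr * grad


def train_vertical(comm: Party, industry: Party, labels: List[int],
                   epochs: int = 200, lr: float = 0.5) -> None:
    """Jointly train; raw features never leave the party that owns them."""
    for _ in range(epochs):
        scores = [a + b for a, b in zip(comm.partial_scores(), industry.partial_scores())]
        probs = [1.0 / (1.0 + math.exp(-s)) for s in scores]
        residuals = [p - y for p, y in zip(probs, labels)]  # sent unencrypted in this sketch
        comm.update(residuals, lr)
        industry.update(residuals, lr)


def predict(comm: Party, industry: Party) -> List[int]:
    scores = [a + b for a, b in zip(comm.partial_scores(), industry.partial_scores())]
    return [1 if s > 0 else 0 for s in scores]
```

Here `comm` would stand for the communication server (network behavior features) and `industry` for the target industry server (for example, power consumption features and the labels).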
In the embodiment of the application, the data processing model is different for different industries. The data processing model is not limited in the embodiments of the present application. For example, for the power grid industry, the data processing model may be a total power consumption prediction model; or the data processing model is an electricity utilization abnormal behavior detection model; or the data processing model is a power supply demand prediction model of the charging station. For the tap water industry, the data processing model can be a total water consumption prediction model.
In the embodiments of the present application, when the data processing model is a total power consumption prediction model, model training is performed with the network information as the input of the data processing model and the historical total power consumption in the target area as its output. When the data processing model is an abnormal electricity consumption behavior detection model, model training is performed with the network information and the historical power consumption as the input of the data processing model and whether the user exhibits abnormal electricity consumption behavior as its output.
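The input/output arrangement for the total power consumption prediction model can be illustrated with the simplest possible learner. The sketch below uses ordinary least squares on a single feature (for example, the number of users in the target area) purely as a stand-in; the source leaves the model type open, and the function name is hypothetical.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Fit y = slope * x + intercept by ordinary least squares.

    x: one network-derived feature per time slot (model input).
    y: historical total power consumption per time slot (model output).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0:
        raise ValueError("x must contain at least two distinct values")
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x
```

Once fitted, `slope * users + intercept` gives a predicted total consumption for a new user count; richer models follow the same input and output pairing.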
The embodiment of the application can adopt any one of methods such as a machine learning algorithm, a neural network, a Long Short Term Memory (LSTM) network and the like to train the model.
According to the model training method provided by the embodiments of the present application, the historical network behavior data of the communication system is combined with the historical target industry data for model training to obtain the data processing model, which improves the accuracy and breadth of the data processing model of the target industry. For example, the traditional power industry has only power data and lacks the network behavior data of users, so the feature data available for model training is limited to a narrow set of dimensions. The model training method of the embodiments of the present application additionally uses the network behavior data of users, which increases the dimensionality of the feature data used for model training and thereby improves accuracy.
Fig. 2 is a flowchart of a data processing method according to another embodiment of the present application.
In a second aspect, referring to fig. 2, another embodiment of the present application provides a data processing method applied to a target industry server, where the method includes:
Step 200, acquiring real-time network behavior data.
In some exemplary embodiments, real-time network behavior data and real-time targeted industry data are obtained.
In the embodiment of the application, the communication server is arranged in the communication system, and the communication server can obtain real-time network behavior data from the communication system. After the communication server obtains the real-time network behavior data, the target industry server can obtain the real-time network behavior data directly from the communication server.
In this embodiment of the present application, the real-time network behavior data may be obtained from a communication server, and the communication server may obtain the real-time network behavior data from an AAA authentication server and a DPI device. Specifically, the communication server acquires user identity information and user private network address information from the AAA authentication server, and acquires real-time network behavior data of a corresponding user from the DPI equipment according to the user private network address information.
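The two-stage acquisition described above (identity and private network address from the AAA authentication server, then behavior data from the DPI equipment keyed by that address) can be sketched as a simple lookup and grouping. The record types, field names, and function name below are hypothetical and are not interfaces of any real AAA or DPI product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AaaRecord:
    user_id: str      # hypothetical identity field, e.g. an IMSI
    private_ip: str   # user private network address


@dataclass(frozen=True)
class FlowRecord:
    private_ip: str
    timestamp: int
    bytes_total: int
    app_type: str


def collect_user_behavior(aaa_records, dpi_flows):
    """Group DPI flow records by user, using the AAA address mapping."""
    ip_to_user = {r.private_ip: r.user_id for r in aaa_records}
    per_user = {}
    for flow in dpi_flows:
        user = ip_to_user.get(flow.private_ip)
        if user is None:
            continue  # address not known to the AAA server: skip the flow
        per_user.setdefault(user, []).append(flow)
    return per_user
```

For target-area prediction, the same routine would simply be given only the AAA records that belong to the target area.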
In some exemplary embodiments, the real-time targeted industry data includes any one of:
real-time electric power data, real-time tap water data, real-time natural gas data, real-time advertisement data and real-time express delivery data.
In some exemplary embodiments, the real-time target industry data may be obtained from a private network corresponding to the target industry, may be obtained by manual collection, or may be obtained by any other method.
In the embodiment of the application, the real-time network behavior data and the real-time target industry data can differ between industries, and which data need to be processed can be determined according to the actual application scenario of the industry. For example, for the power grid industry, if the total power consumption in the target area is to be predicted, the real-time network behavior data is the real-time network behavior data in the target area. In this case, the communication server acquires the user identity information and the user private network address information in the target area from the AAA authentication server, and acquires the real-time network behavior data of the corresponding users from the DPI equipment according to the user private network address information.
If abnormal electricity consumption behavior of a user is to be detected, the real-time network behavior data is the real-time network behavior data of the user, and the real-time target industry data is the real-time power data of the user. In this case, the communication server acquires the user identity information and the user private network address information from the AAA authentication server, and acquires the real-time network behavior data of the corresponding user from the DPI equipment according to the user private network address information. The real-time power data may be the real-time power consumption.
In some exemplary embodiments, the user identity information may be information corresponding to the user one to one, such as the number, IMEI, IMSI, etc. of the mobile terminal.
In some exemplary embodiments, the real-time network behavior data may be network behavior data in any one or more communication systems. The communication system may be, for example, a mobile communication system (e.g., a mobile terminal communication system, a car networking communication system, or another Internet of Things communication system) or a fixed network communication system (e.g., home WiFi, business WiFi, or an enterprise network VPN). Different communication systems have different coverage areas. For example, mobile terminal communication systems and car networking communication systems cover residential areas, business areas, enterprise parks and the like. Different communication systems also have different characteristics. For example, in a mobile communication system, the IP address of a terminal is usually bound to a specific user, and the terminal is mobile and exhibits specific characteristics in different areas and time periods. In a fixed network communication system, by contrast, one IP address usually carries the data of multiple users; as the facilities of homes, businesses and enterprises become increasingly intelligent, the user data and device data contain a wide range of feature information.
In some exemplary embodiments, the real-time network behavior data may be data flow quintuple records, with fields such as time, traffic volume, number of packets, URL, application type, and the like.
In other exemplary embodiments, the real-time network behavior data may also be data obtained by processing the data flow quintuple, for example, feature data extracted from the data flow quintuple. For the total power consumption prediction model for the target area, the feature data may be at least one of: the number of users in the target area, the change pattern of the number of users in the target area, and the behavior periods (e.g., sleep period, leisure period, housework period) of the users in the target area. For the user electricity consumption abnormal behavior detection model, the feature data may be the inherent attribute of the user's location, the predicted attribute of the user's location, the actual number of users corresponding to the user, and the device operation characteristics.
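One of the features named above, the number of users in the target area and how it changes over time, can be derived from flow records roughly as follows. This is a sketch under the assumption that each record has already been reduced to a hypothetical `(user_id, hour, byte_count)` tuple; the application does not prescribe this representation.

```python
from collections import defaultdict


def users_per_hour(flows):
    """Count distinct active users in each hour of the day.

    `flows` is an iterable of (user_id, hour, byte_count) tuples,
    a hypothetical simplification of the data flow records.
    """
    seen = defaultdict(set)
    for user_id, hour, _byte_count in flows:
        seen[hour].add(user_id)
    return {hour: len(users) for hour, users in sorted(seen.items())}


# Hypothetical sample: two users active at 08:00, one at 22:00.
sample_flows = [("u1", 8, 100), ("u2", 8, 50), ("u1", 8, 10), ("u1", 22, 5)]
hourly_users = users_per_hour(sample_flows)
```

The resulting hour-to-count mapping can serve directly as an input feature vector, and differences between consecutive hours give a crude view of the change pattern.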
For the number of users, since one IP address in the fixed network communication system corresponds to a plurality of users, the number of users corresponding to one IP address in the fixed network communication system needs to be evaluated. For example, home WiFi, enterprise WiFi, etc. all use one IP address for multiple users to access the internet. Although the mobile terminal is usually in one-to-one correspondence with the user, the number of users in the target area can be predicted according to the movement characteristic of the mobile terminal because the mobile terminal has the movement characteristic.
The inherent attribute of the user's location indicates whether the location is designated as a work area or a residential area. The predicted attribute of the user's location indicates whether the location is a work area or a residential area as predicted by analyzing the network behavior data. For example, if a user stays in a house in a residential area during the daytime and leaves it afterwards, the inherent attribute of the house is residential area, while its predicted attribute is work area.
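The application does not state how the predicted attribute is computed. One possible heuristic, shown purely as an assumption, labels a location as a work area when most of the observed network activity falls within daytime hours; the hour boundaries and the 50% threshold are arbitrary illustrative values.

```python
def predict_location_attribute(active_hours, day_start=9, day_end=18):
    """Guess 'work' or 'residential' from the hours with network activity.

    `active_hours` is a list of hours (0-23) in which activity was seen
    at the location. Returns None when there is no activity to judge by.
    """
    if not active_hours:
        return None
    daytime = sum(1 for h in active_hours if day_start <= h < day_end)
    share = daytime / len(active_hours)
    return "work" if share > 0.5 else "residential"
```

A mismatch between this predicted attribute and the inherent attribute (for example, a house in a residential block predicted as `"work"`) is the kind of signal the detection model can use as a feature.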
The device operation characteristics may be a device type, a habitual use period, power consumption, and the like.
In some exemplary embodiments, the target area may be divided according to the characteristics of users in different areas; for example, small areas with the same work-and-rest habits are grouped into the same target area. For instance, a large enterprise campus is dominated by white-collar daytime work, so its power demand is high in the daytime and low at night. If the residents of a large residential block in a community leave home 1-2 hours earlier and return 1-2 hours later than people in other areas, the block's demand for electric power differs from that of other areas, and it needs to be treated as a separate target area and trained separately.
Step 201, processing the real-time network behavior data by using a data processing model obtained by training any one of the above model training methods to obtain a processing result.
In the embodiment of the application, the data processing model is different for different industries. The embodiment of the present application does not limit the data processing model. For example, for the power grid industry, the data processing model may be a total power consumption prediction model; or the data processing model is an electricity abnormal behavior detection model. For the tap water industry, the data processing model can be a total water consumption prediction model.
In some exemplary embodiments, when the data processing model is a total power consumption prediction model, processing the real-time network behavior data to obtain the processing result includes: predicting the total power consumption in the target area according to the real-time network behavior data by using the total power consumption prediction model.
In some exemplary embodiments, the data processing model obtained by training with any one of the above-mentioned model training methods is used to process the real-time network behavior data and the real-time target industry data to obtain a processing result.
In some exemplary embodiments, when the data processing model is an abnormal electricity consumption behavior detection model, processing the real-time network behavior data and the real-time target industry data to obtain the processing result includes: determining whether the user exhibits abnormal electricity consumption behavior according to the real-time network behavior data and the real-time target industry data by using the abnormal electricity consumption behavior detection model.
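The detection criterion itself is not specified in the application. A minimal sketch, assuming that a model has already produced an expected consumption from the network behavior data, is to flag a user when the measured consumption deviates from that expectation by more than a relative tolerance. The function name and the default tolerance of 0.5 are hypothetical.

```python
def is_power_use_abnormal(expected_kwh, measured_kwh, tolerance=0.5):
    """Flag consumption that deviates too far from the expected value.

    `expected_kwh` stands for the consumption implied by the network
    behavior data; `measured_kwh` is the real-time power data.
    """
    if expected_kwh <= 0:
        # No consumption expected: any measured consumption is suspicious.
        return measured_kwh > 0
    deviation = abs(measured_kwh - expected_kwh) / expected_kwh
    return deviation > tolerance
```

A trained classifier would replace this fixed rule in practice; the sketch only shows how the two real-time inputs are combined into a yes-or-no result.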
In some exemplary embodiments, after the real-time network behavior data and the real-time target industry data are processed to obtain the processing result, the method further includes: performing intelligent control according to the processing result.
In some exemplary embodiments, when the data processing model is a total power consumption prediction model, performing the intelligent control according to the processing result includes: performing scheduling control on the electric power distributed to the target area according to the predicted total power consumption in the target area.
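How the scheduling control is carried out is left open by the application. As one hedged illustration, the available supply could be shared among target areas in proportion to their predicted total consumption whenever supply falls short of demand. The function name, the area names, and the proportional policy are assumptions of this sketch.

```python
def schedule_power(predicted_demand_kwh, available_supply_kwh):
    """Allocate supply to target areas based on predicted demand.

    If supply covers total demand, each area receives its predicted
    demand; otherwise every area is scaled down by the same factor.
    """
    total = sum(predicted_demand_kwh.values())
    if total <= available_supply_kwh:
        return dict(predicted_demand_kwh)
    scale = available_supply_kwh / total
    return {area: demand * scale
            for area, demand in predicted_demand_kwh.items()}


# Hypothetical predictions for two target areas.
demand = {"enterprise_park": 60.0, "community": 40.0}
allocation = schedule_power(demand, available_supply_kwh=50.0)
```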
According to the data processing method provided by the embodiment of the application, the data processing model is adopted to process the real-time network behavior data to obtain the processing result, and the processing accuracy of the target industry data is greatly improved.
In a third aspect, an embodiment of the present application provides an electronic device, including:
at least one processor;
a memory having at least one program stored thereon, the at least one program, when executed by the at least one processor, implementing any of the above-described model training methods, or any of the above-described data processing methods.
The processor is a device with data processing capability, including but not limited to a central processing unit (CPU); the memory is a device with data storage capability, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH).
In some embodiments, the processor, memory, and the like are interconnected by a bus, which in turn connects with other components of the computing device.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements any of the above-mentioned model training methods, or any of the above-mentioned data processing methods.
FIG. 3 is a block diagram of a data processing system according to another embodiment of the present application.
In a fifth aspect, another embodiment of the present application provides a data processing system, including: a target industry server 301 and a communication server 302.
The target industry server 301 includes: a network docking module 3011, a target industry data training module 3012, an inference module 3013, a business decision module 3014, a target industry intelligent control module 3015, a target industry data acquisition module 3016 and a foreground module 3017.
The functions of the respective modules are described below.
The target industry data acquisition module 3016 is configured to acquire target industry data, where the acquired target industry data includes: historical target industry data, or historical target industry data and real-time target industry data.
The network docking module 3011 is configured to communicate with the target industry docking module 3021 to obtain network information and real-time network behavior data.
The foreground module 3017 is configured to input a user characteristic according to a requirement of an actual application scenario, as a supplement to the first historical network behavior data. In some exemplary embodiments, the foreground module 3017 is further configured to display the intelligent control policy determined by the business decision module 3014 to a user, send the intelligent control policy input by the user to the target industry intelligent control module 3015, or send the policy adjustment information input by the user to the business decision module 3014.
The target industry data training module 3012 is configured to perform model training according to the network information and the historical target industry data to obtain a data processing model.
The inference module 3013 is configured to process the real-time network behavior data to obtain a processing result, or process the real-time network behavior data and the real-time target industry data to obtain a processing result, using the trained data processing model.
The business decision module 3014 is configured to determine an intelligent control policy according to the processing result. In some exemplary embodiments, the business decision module 3014 is further configured to adjust the intelligent control policy according to the policy adjustment information input by the user.
The target industry intelligent control module 3015 is configured to perform intelligent control according to the intelligent control policy determined by the business decision module 3014, or perform intelligent control according to the intelligent control policy input by the user.
The communication server 302 includes: a target industry docking module 3021, a network data collection module 3022, and a network data training module 3023.
The functions of the respective modules are described below.
The network data collection module 3022 is configured to obtain user identity information and user private network address information from the AAA authentication server, and collect first historical network behavior data of a corresponding user from the DPI device according to the user private network address information.
The network data training module 3023 is configured to perform model training according to the third historical network behavior data to obtain a first training result, and send the first training result to the target industry docking module 3021; or to perform model training according to the first historical network behavior data to obtain a second training result, and send the second training result to the target industry docking module 3021.
The target industry docking module 3021 is configured to send network information to the network docking module 3011. The network information is the first historical network behavior data; alternatively, the network information includes the second historical network behavior data in the first historical network behavior data and a first training result obtained by performing model training according to the third historical network behavior data in the first historical network behavior data; alternatively, the network information includes a second training result obtained by performing model training according to the first historical network behavior data.
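The three alternative forms of the network information can be pictured as one container whose fields are filled differently in each case. In the sketch below, the class, the mode names, the `train` callable, and the `is_private` predicate are all hypothetical; the split into private (third) and non-private (second) data follows claim 4, and what a "training result" actually contains is left to the embodiment.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class NetworkInfo:
    raw_data: Optional[List[dict]] = None   # behavior data sent as-is
    training_result: Optional[Any] = None   # result trained on the communication side


def build_network_info(first_data: List[dict], mode: str,
                       train: Callable[[List[dict]], Any],
                       is_private: Callable[[dict], bool] = lambda r: False):
    """Assemble the network information in one of three hypothetical modes."""
    if mode == "raw":
        # Send the first historical network behavior data directly.
        return NetworkInfo(raw_data=list(first_data))
    if mode == "split":
        # Private (third) data stays local and is only trained on;
        # non-private (second) data is sent alongside the first training result.
        third = [r for r in first_data if is_private(r)]
        second = [r for r in first_data if not is_private(r)]
        return NetworkInfo(raw_data=second, training_result=train(third))
    if mode == "result_only":
        # Only the second training result leaves the communication server.
        return NetworkInfo(training_result=train(first_data))
    raise ValueError(f"unknown mode: {mode}")
```

The "split" and "result_only" modes keep some or all of the raw behavior data on the communication server, which is the apparent motivation for offering them.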
The specific implementation process of each module is the same as that of the model training method and the data processing method in the foregoing embodiment, and is not described here again.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.
In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as is well known to those skilled in the art.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, features, characteristics and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics and/or elements described in connection with other embodiments, unless expressly stated otherwise, as would be apparent to one skilled in the art. Accordingly, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the application as set forth in the appended claims.

Claims (16)

1. A model training method, comprising:
acquiring network information and historical target industry data; wherein the network information is obtained from first historical network behavior data;
and performing model training according to the network information and the historical target industry data to obtain a data processing model.
2. The model training method of claim 1, wherein the network information is the first historical network behavior data; the step of performing model training according to the network information and the historical target industry data to obtain a data processing model comprises the following steps:
and determining a first training sample according to the first historical network behavior data and the historical target industry data, and performing model training according to the first training sample to obtain the data processing model.
3. The model training method of claim 1, wherein the first historical network behavior data comprises: second historical network behavior data and third historical network behavior data, the network information comprising: the second historical network behavior data and a first training result obtained by performing model training according to the third historical network behavior data;
the obtaining of the data processing model by performing model training according to the network information and the historical target industry data comprises:
and determining a second training sample according to the second historical network behavior data and the historical target industry data, and performing model training according to the second training sample and the first training result to obtain the data processing model.
4. The model training method of claim 3, wherein the second historical network behavior data is non-private data of the first historical network behavior data, and the third historical network behavior data is private data of the first historical network behavior data.
5. The model training method according to claim 1, wherein the network information is a second training result obtained by performing model training according to the first historical network behavior data;
the obtaining of the data processing model by performing model training according to the network information and the historical target industry data comprises:
and performing model training according to the second training result and the historical target industry data to obtain a data processing model.
6. The model training method of claim 1, wherein the historical target industry data comprises any one of:
historical power data, historical tap water data, historical natural gas data, historical advertisement data and historical express delivery data.
7. The model training method according to any one of claims 1 to 6, wherein the first historical network behavior data is historical network behavior data in a target area, the historical target industry data is historical total power consumption in the target area, and the data processing model is a total power consumption prediction model.
8. The model training method according to any one of claims 1 to 6, wherein the first historical network behavior data is historical network behavior data of a user, the historical target industry data is historical power data of the user, and the data processing model is a power abnormal behavior detection model.
9. A method of data processing, comprising:
acquiring real-time network behavior data;
and processing the real-time network behavior data by adopting a data processing model obtained by training the model training method according to any one of claims 1 to 8 to obtain a processing result.
10. The data processing method of claim 9, wherein after the processing the real-time network behavior data to obtain the processing result, the method further comprises: and carrying out intelligent control according to the processing result.
11. The data processing method of claim 10, wherein the real-time network behavior data is real-time network behavior data within a target area, and the data processing model is a total power consumption prediction model;
the data processing model obtained by training with the model training method according to any one of claims 1 to 8, wherein the processing of the real-time network behavior data to obtain a processing result comprises: predicting the total power consumption in the target area according to the real-time network behavior data by adopting the total power consumption prediction model;
the performing intelligent control according to the processing result comprises: and carrying out scheduling control on the electric power distributed to the target area according to the predicted total electric power consumption in the target area.
12. The data processing method of claim 9, wherein before the processing the real-time network behavior data to obtain the processing result, the method further comprises: acquiring real-time target industry data;
the processing the real-time network behavior data to obtain a processing result comprises: and processing the real-time network behavior data and the real-time target industry data to obtain a processing result.
13. The data processing method of claim 12, wherein the real-time network behavior data is real-time network behavior data of a user, the real-time target industry data is real-time target industry data of the user, and the data processing model is a power consumption abnormal behavior detection model;
the processing of the real-time network behavior data and the real-time target industry data to obtain a processing result by using the data processing model trained by the model training method according to any one of claims 1 to 8 comprises: and determining whether the user has the abnormal electricity utilization behavior according to the real-time network behavior data and the real-time target industry data by adopting the abnormal electricity utilization behavior detection model.
14. The data processing method of any of claims 9 to 13, wherein the real-time target industry data comprises any of:
real-time electric power data, real-time tap water data, real-time natural gas data, real-time advertisement data and real-time express delivery data.
15. An electronic device, comprising:
at least one processor;
memory having stored thereon at least one program which, when executed by the at least one processor, implements the model training method of any one of claims 1-8 or the data processing method of any one of claims 9-14.
16. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the model training method of any one of claims 1-8, or the data processing method of any one of claims 9-14.
CN202111030195.0A 2021-09-02 2021-09-02 Model training method, data processing method, electronic device and readable storage medium Pending CN115759223A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111030195.0A CN115759223A (en) 2021-09-02 2021-09-02 Model training method, data processing method, electronic device and readable storage medium
PCT/CN2022/109443 WO2023029853A1 (en) 2021-09-02 2022-08-01 Model training method, data processing method, electronic device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111030195.0A CN115759223A (en) 2021-09-02 2021-09-02 Model training method, data processing method, electronic device and readable storage medium

Publications (1)

Publication Number Publication Date
CN115759223A true CN115759223A (en) 2023-03-07

Family

ID=85332897

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111030195.0A Pending CN115759223A (en) 2021-09-02 2021-09-02 Model training method, data processing method, electronic device and readable storage medium

Country Status (2)

Country Link
CN (1) CN115759223A (en)
WO (1) WO2023029853A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117041073B (en) * 2023-09-05 2024-05-28 广州天懋信息系统股份有限公司 Network behavior prediction method, system, equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107153887A (en) * 2017-04-14 2017-09-12 华南理工大学 A kind of mobile subscriber's behavior prediction method based on convolutional neural networks
CN111797858A (en) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 Model training method, behavior prediction method, device, storage medium and equipment
US11415438B2 (en) * 2019-07-17 2022-08-16 ExxonMobil Technology and Engineering Company Intelligent system for identifying sensor drift
CN112801374B (en) * 2021-01-29 2023-01-13 广东晨兴智能科技有限公司 Model training method, power load prediction method, device and equipment

Also Published As

Publication number Publication date
WO2023029853A1 (en) 2023-03-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination