CN114862488A

CN114862488A - Identification method of resource consumption abnormal object and related device

Info

Publication number: CN114862488A
Application number: CN202111676222.1A
Authority: CN
Inventors: 容汉铿; 曾凡; 聂利权
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2021-02-04
Filing date: 2021-12-31
Publication date: 2022-08-05

Abstract

The application discloses a method and a related device for identifying a resource consumption abnormal object, applied to the field of artificial intelligence. The method obtains unlabeled samples and positive samples; determines a target feature dimension; inputs the unlabeled samples and the positive samples into a semi-supervised learning framework to iteratively train a classification model; and obtains a plurality of identification feature values, output by the classification model during the iterative training, for the unlabeled samples in order to determine the resource consumption abnormal object. Resource consumption abnormal objects can thus be identified from a small number of positive samples. During training, positive samples are continuously mined from the unlabeled samples in a heuristic manner and added to the next iteration, which effectively addresses the sample imbalance in this identification scenario and improves the accuracy of identifying resource consumption abnormal objects.

Description

Identification method of resource consumption abnormal object and related device

The present application claims priority to Chinese patent application No. 202110154406.5, entitled "A semi-supervised learning based group rental housing identification method and related apparatus", filed with the Chinese Patent Office on February 4, 2021, which is incorporated herein by reference in its entirety.

Technical Field

The present application relates to the field of computer technologies, and in particular, to a method and a related apparatus for identifying an abnormal object of resource consumption.

Background

With the rapid movement of population, population concentration in cities is becoming more and more serious. Group rental housing brings serious hazards such as house reconstruction, complex electrical wiring and fire risk. How to identify group rental housing based on limited data has become a difficult problem; a group rental housing object is a resource consumption abnormal object.

Generally, a machine learning-based group rental housing identification method can be adopted; that is, a machine learning classification algorithm is fitted to group rental housing data so as to distinguish group rental housing samples from non-group rental housing samples.

However, machine learning-based group rental housing identification requires a large amount of data. Because publicly available data is limited, group rental housing data often contains only a small number of positive samples and a large amount of unlabeled data, which reduces the accuracy and effectiveness of group rental housing identification and thus affects the accuracy of identifying resource consumption abnormal objects.

Disclosure of Invention

In view of this, the present application provides a method for identifying an abnormal object of resource consumption, which can effectively improve the accuracy of identifying the abnormal object of resource consumption.

A first aspect of the present application provides a method for identifying an abnormal object of resource consumption, which can be applied to a system or a program including a group rental housing identification function in a terminal device, and specifically includes:

acquiring resource usage data of a target object set as unlabeled samples, and calling resource usage data corresponding to objects which are verified as group rental housing as positive samples, wherein the number of the positive samples is less than that of the unlabeled samples;

performing feature extraction on the positive sample based on at least one feature item to obtain a target feature dimension, wherein the feature item is set based on resource use features under a preset time granularity;

inputting the unlabeled samples and the positive samples into a semi-supervised learning framework to perform iterative training on a classification model in the semi-supervised learning framework based on the target feature dimension, wherein the unlabeled samples meeting preset conditions in the iterative training process are labeled as supplementary samples, and the supplementary samples are used for updating the positive samples;

and acquiring a plurality of identification feature values corresponding to the unlabeled samples output by the classification model in the iterative training process, and fusing the identification feature values to determine the group rental housing objects in the target object set.

Optionally, in some possible implementations of the present application, the obtaining resource usage data of the target object set as an unlabeled sample includes:

determining candidate data corresponding to the target object set in a data statistical range;

dividing the candidate data based on the preset time granularity to obtain granularity data;

counting resource characteristic items in the granularity data to obtain resource use data corresponding to the target object set;

pre-processing the resource usage data to obtain the unlabeled sample.
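As a non-limiting illustration, the division of candidate data by time granularity and the counting of a resource feature item can be sketched as follows. The record layout (object identifier, timestamp, amount), the function name `aggregate_usage` and the three granularities are assumptions made for this sketch; the application does not prescribe them.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical granularity names mapped to timestamp bucket formats.
BUCKET_FORMATS = {"hour": "%Y-%m-%d %H", "day": "%Y-%m-%d", "month": "%Y-%m"}


def aggregate_usage(readings, granularity="day"):
    """Sum raw (object_id, timestamp, amount) readings per object and time bucket."""
    fmt = BUCKET_FORMATS[granularity]
    totals = defaultdict(float)
    for object_id, timestamp, amount in readings:
        totals[(object_id, timestamp.strftime(fmt))] += amount
    return dict(totals)


# Candidate data inside an assumed statistical range (two days in January 2021).
readings = [
    ("unit_a", datetime(2021, 1, 1, 8), 1.5),
    ("unit_a", datetime(2021, 1, 1, 20), 2.5),
    ("unit_a", datetime(2021, 1, 2, 8), 1.0),
    ("unit_b", datetime(2021, 1, 1, 8), 0.5),
]
daily = aggregate_usage(readings, "day")
monthly = aggregate_usage(readings, "month")
```

The resulting per-bucket totals would then be passed to whichever preprocessing is selected.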

Optionally, in some possible implementations of the present application, the preprocessing the resource usage data to obtain the unlabeled sample includes:

determining the minimum granularity in the preset time granularities;

traversing the resource usage data based on the minimum granularity to determine null items and negative-value items;

and calling a replacement value to replace the null items and the negative-value items, so as to preprocess the resource usage data to obtain the unlabeled sample.
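For illustration only, the replacement of null items and negative-value items can be sketched as below. The series is assumed to be already expressed at the minimum granularity, and the default replacement value of 0.0 is an assumption; the application leaves the replacement value open.

```python
def replace_invalid(series, replacement=0.0):
    """Replace missing (None or NaN) and negative readings with a replacement value."""
    cleaned = []
    for value in series:
        is_missing = value is None or value != value  # NaN is the only value unequal to itself
        if is_missing or value < 0:
            cleaned.append(replacement)
        else:
            cleaned.append(value)
    return cleaned


hourly = [1.0, None, -2.0, float("nan"), 3.0]
cleaned = replace_invalid(hourly)
```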

Optionally, in some possible implementations of the present application, the preprocessing the resource usage data to obtain the unlabeled sample includes:

acquiring the average value of the resource usage data;

determining salient items in the resource usage data that exceed the average value;

and replacing the values of the salient items with the average value, so as to preprocess the resource usage data to obtain the unlabeled sample.
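A minimal sketch of this mean-based replacement follows. The `factor` parameter is a hypothetical addition: with the default of 1.0 the sketch follows the text literally (every value above the average is replaced by the average), while a larger factor would only cap stronger outliers.

```python
from statistics import mean


def cap_salient(series, factor=1.0):
    """Replace readings above factor * mean with the mean of the series."""
    if not series:
        return []
    average = mean(series)
    return [average if value > factor * average else value for value in series]


usage = [1.0, 1.0, 1.0, 9.0]  # mean is 3.0, so only 9.0 is salient
capped = cap_salient(usage)
```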

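Optionally, in some possible implementations of the present application, the preprocessing the resource usage data to obtain the unlabeled sample includes:

determining a data correspondence relationship in the resource usage data;

extracting abnormal items in the data correspondence relationship;

and screening the coinciding values in the abnormal items, so as to preprocess the resource usage data to obtain the unlabeled sample.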

Optionally, in some possible implementations of the present application, the performing feature extraction on the positive sample based on at least one feature item to obtain a target feature dimension includes:

determining numerical characteristics corresponding to the characteristic items;

performing feature extraction on the positive sample based on the numerical features to obtain numerical span information;

and determining the target feature dimension according to the numerical span information.
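By way of example, numerical span information for one sample could be computed as the minimum, maximum and their difference. The choice of these three statistics and the name `span_features` are assumptions of this sketch rather than features recited by the application.

```python
def span_features(series):
    """Return the minimum, maximum and span (max - min) of a usage series."""
    lowest, highest = min(series), max(series)
    return {"min": lowest, "max": highest, "span": highest - lowest}


daily_usage = [2.0, 5.0, 3.0]
features = span_features(daily_usage)
```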

Optionally, in some possible implementations of the present application, the method further includes:

correlating the numerical characteristics within a preset time range to obtain fluctuation characteristics;

performing feature extraction on the positive sample based on the fluctuation features to obtain a feature fluctuation range;

and determining the target feature dimension according to the feature fluctuation range.
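One possible reading of the fluctuation features is sketched below: the values inside a preset time range (here, the last `window` readings) are related to one another through their standard deviation, mean absolute step-to-step change and range. The window length and the three statistics are assumptions chosen for illustration.

```python
from statistics import pstdev


def fluctuation_features(series, window=7):
    """Describe how much a non-empty usage series fluctuates over its last `window` readings."""
    recent = series[-window:]
    steps = [abs(later - earlier) for earlier, later in zip(recent, recent[1:])]
    return {
        "std": pstdev(recent),
        "mean_abs_change": sum(steps) / len(steps) if steps else 0.0,
        "range": max(recent) - min(recent),
    }


daily_usage = [5.0, 1.0, 3.0, 1.0, 3.0]
fluctuation = fluctuation_features(daily_usage, window=4)  # uses [1.0, 3.0, 1.0, 3.0]
```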

Optionally, in some possible implementations of the present application, the method further includes:

determining a feature time interval corresponding to the numerical features;

performing feature extraction on the positive sample based on the feature time interval to obtain time interval resource usage information;

and determining the target feature dimension according to the time interval resource usage information.
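As an illustrative sketch, time interval resource usage information could be expressed as the share of a day's usage that falls inside a chosen interval. The 24-value hourly profile, the half-open interval convention and the night-time example (22:00 to 06:00) are assumptions of this sketch.

```python
def interval_usage_share(hourly_profile, start_hour, end_hour):
    """Fraction of total usage in [start_hour, end_hour), wrapping past midnight if needed."""
    if start_hour <= end_hour:
        hours = list(range(start_hour, end_hour))
    else:
        hours = list(range(start_hour, 24)) + list(range(0, end_hour))
    total = sum(hourly_profile)
    if total == 0:
        return 0.0
    return sum(hourly_profile[hour] for hour in hours) / total


flat_profile = [1.0] * 24  # same usage in every hour of the day
night_share = interval_usage_share(flat_profile, 22, 6)
morning_share = interval_usage_share(flat_profile, 0, 12)
```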

Optionally, in some possible implementations of the present application, the method further includes:

comparing adjacent numerical features to obtain periodic features;

analyzing the positive sample based on the periodic features to obtain a feature period;

and determining the target feature dimension according to the feature period.
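The application does not specify how the feature period is computed. The sketch below substitutes a standard technique, autocorrelation, and reports the lag at which a series most resembles a shifted copy of itself; this is an assumed stand-in, not the method of the application, and the name `dominant_period` is hypothetical.

```python
def dominant_period(series, max_lag=None):
    """Return the lag with the highest positive autocorrelation, or None if there is none."""
    n = len(series)
    average = sum(series) / n
    centered = [value - average for value in series]
    energy = sum(c * c for c in centered)
    if energy == 0:
        return None  # a constant series has no period
    if max_lag is None:
        max_lag = n // 2
    best_lag, best_corr = None, 0.0
    for lag in range(1, max_lag + 1):
        corr = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / energy
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag


# Four weeks of daily usage: low on five days, high on two days.
weekly_pattern = ([1.0] * 5 + [5.0] * 2) * 4
period = dominant_period(weekly_pattern)
```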

Optionally, in some possible implementations of the present application, the inputting the unlabeled sample and the positive sample into a semi-supervised learning framework to iteratively train a classification model in the semi-supervised learning framework based on the target feature dimension includes:

generating a training set based on the unlabeled sample and the positive sample;

inputting the training set into the semi-supervised learning framework, and randomly drawing a part of samples from the unlabeled samples as negative samples;

training a preset model based on the positive sample and the negative sample to obtain the classification model;

identifying the non-extracted unlabeled samples according to the classification model to obtain an identification characteristic value corresponding to each sample in the non-extracted unlabeled samples;

screening the identification characteristic value based on the preset condition to extract a supplementary sample from the unlabeled sample, and updating the positive sample based on the supplementary sample;

repeating the process of random extraction to perform the iterative training on the classification model in the semi-supervised learning framework based on the target feature dimension.
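The iterative procedure above can be sketched as follows, under clearly stated assumptions. The preset model is replaced by a toy centroid-distance scorer so that the sketch needs only the standard library; the preset condition is assumed to be a score threshold of 0.9; and the names `train_scorer` and `pu_iterative_training`, the round count and the sample size are all hypothetical.

```python
import random


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_scorer(positives, negatives):
    """Toy stand-in classifier: relative closeness to the positive centroid."""
    pos_c, neg_c = centroid(positives), centroid(negatives)

    def score(x):
        d_pos, d_neg = squared_distance(x, pos_c), squared_distance(x, neg_c)
        total = d_pos + d_neg
        return 0.5 if total == 0 else d_neg / total  # in [0, 1]; higher is more positive-like

    return score


def pu_iterative_training(positives, unlabeled, rounds=5, sample_size=2,
                          threshold=0.9, seed=0):
    """Return per-sample scores from every round and the updated positive set."""
    rng = random.Random(seed)
    positives = [list(p) for p in positives]
    remaining = dict(enumerate(unlabeled))   # index -> features, still unlabeled
    scores = {index: [] for index in remaining}
    for _ in range(rounds):
        if len(remaining) < 2:
            break
        k = max(1, min(sample_size, len(remaining) - 1))
        drawn = set(rng.sample(sorted(remaining), k))  # treated as negatives this round
        score = train_scorer(positives, [remaining[i] for i in drawn])
        promoted = []
        for index, features in remaining.items():
            if index in drawn:
                continue  # only samples that were not drawn are scored
            value = score(features)
            scores[index].append(value)
            if value >= threshold:
                promoted.append(index)
        for index in promoted:  # supplementary samples update the positive set
            positives.append(list(remaining.pop(index)))
    return scores, positives


known_positives = [[10.0, 10.0], [11.0, 9.0]]
unlabeled = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [10.0, 11.0], [0.5, 0.5]]
scores, final_positives = pu_iterative_training(known_positives, unlabeled)
```

In this toy data the unlabeled sample at index 4 lies close to the known positives, so it is promoted to a supplementary sample in the first round in which it is not drawn as a negative. In practice the scorer would be replaced by whatever classification model is chosen.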

Optionally, in some possible implementations of the present application, the fusing the identification feature values to determine the group rental housing objects in the target object set includes:

determining special data of the resource usage data corresponding to the target object set under different data dimensions;

respectively obtaining a plurality of predicted values corresponding to the special data;

performing weighted calculation on the plurality of predicted values to obtain a target feature value;

and determining a group rental housing object in the target object set based on the target feature value.
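A minimal sketch of the weighted calculation follows. The weights, the decision threshold of 0.5 and the function names are assumptions; the application only states that the predicted values are weighted to obtain a target feature value.

```python
def fuse_predictions(predictions, weights=None):
    """Weighted average of several predicted values for one object."""
    if weights is None:
        weights = [1.0] * len(predictions)
    total_weight = sum(weights)
    return sum(p * w for p, w in zip(predictions, weights)) / total_weight


def identify_objects(predictions_by_object, weights=None, threshold=0.5):
    """Return the objects whose fused target feature value reaches the threshold."""
    return sorted(
        object_id
        for object_id, predictions in predictions_by_object.items()
        if fuse_predictions(predictions, weights) >= threshold
    )


predictions_by_object = {
    "unit_a": [0.75, 0.25],  # e.g. predictions from two data dimensions
    "unit_b": [0.25, 0.25],
}
flagged = identify_objects(predictions_by_object, weights=[3.0, 1.0])
```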

Optionally, in some possible implementations of the present application, the target object set is a set of users in a residential community (cell), the resource usage data is electricity consumption, the positive samples are obtained from an executable third-party platform, and the third-party platform is used for monitoring group rental housing objects.

A second aspect of the present application provides an apparatus for identifying an abnormal object of resource consumption, including:

the acquiring unit is used for acquiring the resource usage data of the target object set as unlabeled samples and calling the resource usage data corresponding to objects verified as resource consumption abnormal objects as positive samples, wherein the number of the positive samples is less than that of the unlabeled samples;

the extraction unit is used for extracting the features of the positive sample based on at least one feature item to obtain a target feature dimension, wherein the feature item is set based on the resource use features under the preset time granularity;

a training unit, configured to input the unlabeled samples and the positive samples into a semi-supervised learning framework, so as to perform iterative training on a classification model in the semi-supervised learning framework based on the target feature dimension, where unlabeled samples meeting preset conditions in the iterative training process are labeled as supplementary samples, and the supplementary samples are used to update the positive samples;

and the identification unit is used for acquiring a plurality of identification characteristic values corresponding to the unlabeled samples output by the classification model in the iterative training process, and fusing based on the identification characteristic values to determine the resource consumption abnormal object in the target object set.

Optionally, in some possible implementations of the present application, the acquiring unit is specifically configured to determine candidate data corresponding to the target object set within a data statistics range;

the acquiring unit is specifically configured to divide the candidate data based on the preset time granularity to obtain granularity data;

the acquiring unit is specifically configured to count resource feature items in the granularity data to obtain resource usage data corresponding to the target object set;

the acquiring unit is specifically configured to preprocess the resource usage data to obtain the unlabeled sample.

Optionally, in some possible implementations of the present application, the acquiring unit is specifically configured to determine a minimum granularity in the preset time granularities;

the acquiring unit is specifically configured to traverse the resource usage data based on the minimum granularity to determine null items and negative-value items;

the acquiring unit is specifically configured to call a replacement value to replace the null items and the negative-value items, so as to preprocess the resource usage data to obtain the unlabeled sample.

Optionally, in some possible implementations of the present application, the acquiring unit is specifically configured to acquire the average value of the resource usage data;

the acquiring unit is specifically configured to determine salient items in the resource usage data that exceed the average value;

the acquiring unit is specifically configured to replace the values of the salient items with the average value, so as to preprocess the resource usage data to obtain the unlabeled sample.

Optionally, in some possible implementations of the present application, the acquiring unit is specifically configured to determine a data correspondence relationship in the resource usage data;

the acquiring unit is specifically configured to extract abnormal items in the data correspondence relationship;

the acquiring unit is specifically configured to screen the coinciding values in the abnormal items, so as to preprocess the resource usage data to obtain the unlabeled sample.

Optionally, in some possible implementation manners of the present application, the extracting unit is specifically configured to determine a numerical feature corresponding to the feature item;

the extraction unit is specifically configured to perform feature extraction on the positive sample based on the numerical features to obtain numerical span information;

the extraction unit is specifically configured to determine the target feature dimension according to the numerical span information.

Optionally, in some possible implementation manners of the present application, the extracting unit is specifically configured to correlate the numerical features within a preset time range to obtain a fluctuation feature;

the extraction unit is specifically configured to perform feature extraction on the positive sample based on the fluctuation feature to obtain a feature fluctuation range;

the extracting unit is specifically configured to determine the target feature dimension according to the feature fluctuation range.

Optionally, in some possible implementations of the present application, the extracting unit is specifically configured to determine a feature time period corresponding to the numerical feature;

the extraction unit is specifically configured to perform feature extraction on the positive sample based on the feature time interval to obtain time interval resource usage information;

the extracting unit is specifically configured to determine the target feature dimension according to the time interval resource usage information.

Optionally, in some possible implementation manners of the present application, the extracting unit is specifically configured to compare adjacent numerical features to obtain a periodic feature;

the extraction unit is specifically configured to analyze the positive sample based on the period feature to obtain a feature period;

the extracting unit is specifically configured to determine the target feature dimension according to the feature period.

Optionally, in some possible implementations of the present application, the training unit is specifically configured to generate a training set based on the unlabeled sample and the positive sample;

the training unit is specifically used for inputting the training set into the semi-supervised learning framework and randomly extracting a part of samples from the unlabeled samples as negative samples;

the training unit is specifically configured to train a preset model based on the positive sample and the negative sample to obtain the classification model;

the training unit is specifically configured to identify an unextracted unlabeled sample according to the classification model to obtain an identification feature value corresponding to each sample in the unextracted unlabeled sample;

the training unit is specifically configured to screen the identification feature value based on the preset condition, to extract a supplementary sample from the unlabeled sample, and to update the positive sample based on the supplementary sample;

the training unit is specifically configured to repeat the process of random extraction, so as to perform the iterative training on the classification model in the semi-supervised learning framework based on the target feature dimension.

Optionally, in some possible implementation manners of the present application, the identification unit is specifically configured to determine special data of the resource usage data corresponding to the target object set in different data dimensions;

the identification unit is specifically configured to obtain a plurality of predicted values corresponding to the special data respectively;

the identification unit is specifically configured to perform weighted calculation on the plurality of predicted values to obtain a target feature value;

the identification unit is specifically configured to determine, based on the target feature value, a resource consumption abnormal object in the target object set.

A third aspect of the present application provides a computer device comprising: a memory, a processor, and a bus system; the memory is used for storing program code; the processor is configured to execute, according to instructions in the program code, the group rental housing identification method according to the first aspect or any implementation of the first aspect.

A fourth aspect of the present application provides a computer-readable storage medium having stored therein instructions which, when run on a computer, cause the computer to execute the group rental housing identification method according to the first aspect or any implementation of the first aspect.

According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. A processor of the computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, so that the computer device executes the group rental housing identification method provided in the first aspect or the various alternative implementations of the first aspect.

According to the technical scheme, the embodiment of the application has the following advantages:

Resource usage data of a target object set is acquired as unlabeled samples, and resource usage data corresponding to objects verified as resource consumption abnormal objects is called as positive samples, wherein the number of positive samples is less than the number of unlabeled samples. Feature extraction is then performed on the positive samples based on at least one feature item to obtain a target feature dimension, wherein the feature item is set based on resource usage features under a preset time granularity. The unlabeled samples and the positive samples are input into a semi-supervised learning framework, and a classification model in the framework is iteratively trained based on the target feature dimension, wherein unlabeled samples meeting a preset condition during the iterative training are labeled as supplementary samples, and the supplementary samples are used for updating the positive samples. A plurality of identification feature values corresponding to the unlabeled samples, output by the classification model during the iterative training, are then acquired and fused to determine the resource consumption abnormal objects in the target object set.
In this way, resource consumption abnormal objects can be identified with only a small number of positive samples. Because the features of resource consumption abnormal objects are extracted using a plurality of target feature dimensions, the sensitivity of the classification model to those features can be ensured. Positive samples are continuously mined from the unlabeled samples in a heuristic manner during training and added to the next iteration, which effectively addresses the sample imbalance in this identification scenario and improves the accuracy of identifying resource consumption abnormal objects.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.

Fig. 1 is a network architecture diagram of a group rental housing identification system;

Fig. 2 is a flowchart of group rental housing identification according to an embodiment of the present application;

Fig. 3 is a flowchart of a method for identifying an abnormal object of resource consumption according to an embodiment of the present application;

Fig. 4 is a scene schematic diagram of a method for identifying an abnormal object of resource consumption according to an embodiment of the present application;

Fig. 5 is a flowchart of another method for identifying an abnormal object of resource consumption according to an embodiment of the present application;

Fig. 6 is a flowchart of another method for identifying an abnormal object of resource consumption according to an embodiment of the present application;

Fig. 7 is a flowchart of another method for identifying an abnormal object of resource consumption according to an embodiment of the present application;

Fig. 8 is a scene schematic diagram of another method for identifying an abnormal object of resource consumption according to an embodiment of the present application;

Fig. 9 is a schematic structural diagram of an apparatus for identifying an abnormal object of resource consumption according to an embodiment of the present application;

Fig. 10 is a schematic structural diagram of a terminal device according to an embodiment of the present application;

Fig. 11 is a schematic structural diagram of a server according to an embodiment of the present application.

Detailed Description

The embodiments of the present application provide a method and a related device for identifying an abnormal object of resource consumption, which can be applied to a system or a program including a group rental housing identification function in a terminal device. Resource usage data of a target object set is acquired as unlabeled samples, and resource usage data corresponding to objects verified as resource consumption abnormal objects is called as positive samples, wherein the number of positive samples is less than the number of unlabeled samples. Feature extraction is then performed on the positive samples based on at least one feature item to obtain a target feature dimension, wherein the feature item is set based on resource usage features under a preset time granularity. The unlabeled samples and the positive samples are input into a semi-supervised learning framework, and a classification model in the framework is iteratively trained based on the target feature dimension, wherein unlabeled samples meeting a preset condition during the iterative training are labeled as supplementary samples, and the supplementary samples are used for updating the positive samples. A plurality of identification feature values corresponding to the unlabeled samples, output by the classification model during the iterative training, are then acquired and fused to determine the resource consumption abnormal objects in the target object set.
In this way, resource consumption abnormal objects can be identified with only a small number of positive samples. Because the features of resource consumption abnormal objects are extracted using a plurality of target feature dimensions, the sensitivity of the classification model to those features can be ensured. Positive samples are continuously mined from the unlabeled samples in a heuristic manner during training and added to the next iteration, which effectively addresses the sample imbalance in this identification scenario and improves the accuracy of identifying resource consumption abnormal objects.

The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "corresponding" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

First, some terms that may appear in the embodiments of the present application are explained.

Semi-Supervised Learning (SSL): a key research topic in the field of pattern recognition and machine learning, and a learning method that combines supervised learning and unsupervised learning. Semi-supervised learning uses a large amount of unlabeled data together with labeled data to perform pattern recognition.

PU Learning (Positive-Unlabeled Learning): a research direction in semi-supervised learning that refers to training a binary classifier with only positive and unlabeled data.

Group rental housing: a rental mode in which a house is divided and converted into a plurality of rooms by changing the house structure and floor plan, and the rooms are rented out separately by room or by bed; that is, a rental mode in which many people are concentrated in a small area.

It should be understood that the group rental housing identification method provided by the present application may be applied to a system or a program including a group rental housing identification function in a terminal device, for example a house rental application. Specifically, the group rental housing identification system may operate in the network architecture shown in fig. 1, which is a network architecture diagram of the group rental housing identification system. As can be seen from the figure, the system can provide group rental housing identification for a plurality of information sources: corresponding house information is sent to a server through a trigger operation on the terminal side, and the server performs semi-supervised-learning-based group rental housing identification according to the resource usage data of the cell corresponding to the house information, so as to obtain the corresponding house type. It is to be understood that fig. 1 shows several kinds of terminal devices, which may be computer devices; in an actual scenario, more or fewer types of terminal devices may participate in group rental housing identification, and the specific number and types depend on the actual scenario and are not limited herein. In addition, fig. 1 shows one server, but in an actual scenario multiple servers may participate, and the specific number of servers depends on the actual scenario.

In this embodiment, the server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the terminal and the server may be connected to form a blockchain network, which is not limited herein.

It is understood that the group rental housing identification system may run on a personal mobile terminal, for example as a house rental application; it may also run on a server, or on a third-party device to provide group rental housing identification, so as to obtain a group rental housing identification result for an information source. The group rental housing identification system may run in the above devices in the form of a program, may run as a system component in the above devices, or may be provided as a cloud service program; the specific operation mode depends on the actual scenario and is not limited herein.

With the rapid movement of population, population concentration in cities is becoming more and more serious. Group rental housing brings serious hazards such as house reconstruction, complex electrical wiring and fire risk, and how to identify group rental housing based on limited data has become a difficult problem.

Therefore, the above problems can be solved by Artificial Intelligence (AI), which is a theory, method, technique and application system that simulates, extends and expands human Intelligence by using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.

Machine Learning (ML) is a multi-domain interdisciplinary subject that involves probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how a computer simulates or realizes human learning behavior so as to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence, is the fundamental way to make computers intelligent, and is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.

However, machine-learning-based group rental housing identification requires a large amount of data. Because publicly available data is limited, group rental data often contains only a small number of positive samples and a large amount of unlabeled data, which reduces the accuracy and effectiveness of group rental housing identification.

In order to solve the above problem, the present application provides a method for identifying a resource consumption abnormal object, which is applied to the flow framework for group rental housing identification shown in fig. 2. Fig. 2 is a flow framework diagram for group rental housing identification provided in an embodiment of the present application: a user sends house information to a server through an interactive operation on a terminal; the server then retrieves the resource usage information of the region (residential community or unit) corresponding to the house information, performs identification based on a semi-supervised framework to obtain a house type prediction result, and feeds the result back to the terminal.

It can be understood that the method provided by the present application may be written as a program that serves as processing logic in a hardware system, or may be implemented as a device for identifying a resource consumption abnormal object, in which the processing logic is realized in an integrated or external manner. As one implementation, the identification device introduces the PU Learning semi-supervised learning method into the group rental housing identification scenario. Based on the daily-granularity and monthly-granularity electricity consumption data of residential communities provided by a power grid company, a small number of positive samples and a large number of unlabeled samples are randomly combined to construct a training set, effective group rental housing identification features are extracted, and the LightGBM algorithm is used to train a binary classification model. Positive samples among the unlabeled samples are heuristically mined and added to the next iteration, which effectively addresses the sample imbalance problem in the group rental housing identification scenario and greatly improves the accuracy of the identification algorithm. The method can rapidly output highly suspicious group rental households, which helps law enforcement personnel carry out targeted on-site sampling and investigation and manage the group rental phenomenon effectively.

The scheme provided by the embodiment of the application relates to an artificial intelligence machine learning technology, and is specifically explained by the following embodiment:

With reference to the above flow architecture, the group rental housing identification method in the present application is described below. Referring to fig. 3, fig. 3 is a flowchart of a method for identifying a resource consumption abnormal object according to an embodiment of the present application. The identification method may be executed by a terminal, by a server, or by both together; the following description takes execution by a terminal as an example. The embodiment of the application at least comprises the following steps:

301. Acquire the resource usage data of the target object set as unlabeled samples, and retrieve the resource usage data of objects verified as resource consumption abnormal objects as positive samples.

In this embodiment, the number of positive samples is less than the number of unlabeled samples; that is, the positive sample is a small amount of data, the unlabeled sample is a large amount of data, and the resource usage data may be data reflecting the daily life expenditure of the household, such as power consumption, water consumption, or network resource consumption.

In addition, the resource consumption abnormal object can be a main body corresponding to resource consumption, such as a tenant user, a user owner, a resource account number and the like; in a possible scenario, since the resource consumption of the group tenant user is different from the resource consumption of the general residents and is likely to cause a potential safety hazard, the resource consumption abnormal object in the embodiment may also be a group tenant user (object).

In the following, the embodiment is described by taking the resource consumption abnormal object as a group rental user and the resource usage data as electricity consumption. That is, the daily-granularity and monthly-granularity electricity consumption data of residential community unit users provided by a power grid company is integrated, and a small number of positive samples marked as group rental housing provided by an urban management law enforcement department are collected. The specific form of the resource consumption abnormal object and of the resource usage data depends on the actual scenario; the forms given here are only illustrative.

Specifically, the resource usage data may be organized at a given time granularity: first, candidate data corresponding to the target object set within a data statistics range is determined; the candidate data is then divided based on the preset time granularity to obtain granularity data; resource feature items in the granularity data are counted to obtain the resource usage data corresponding to the target object set; and the resource usage data is then preprocessed to obtain the unlabeled samples. For example, if the data statistics range is two years, the preset time granularities are day and month, and the resource feature items include total electricity, valley-period electricity, peak-period electricity, and the like, then the resource usage data includes monthly granularity data, namely each household's monthly total electricity, monthly flat-period electricity, and monthly valley-period electricity over the last two years; and daily granularity data, namely each household's daily total electricity, daily flat-period electricity, daily peak-period electricity, and daily valley-period electricity over the last two years. The specific data division depends on the actual scenario and is not limited here.
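The division by time granularity described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that daily records arrive as tuples of (household, day, total, flat, valley) and that monthly feature items are summed from them; the record layout and the function name `monthly_totals` are hypothetical, and in the embodiment the power grid company may supply both granularities directly.

```python
from collections import defaultdict
from datetime import date

# Hypothetical record layout: (household_id, day, total_kwh, flat_kwh, valley_kwh).
def monthly_totals(daily_records):
    """Sum daily readings into per-household, per-month resource feature items."""
    totals = defaultdict(lambda: {"total": 0.0, "flat": 0.0, "valley": 0.0})
    for household, day, total, flat, valley in daily_records:
        key = (household, day.year, day.month)
        totals[key]["total"] += total
        totals[key]["flat"] += flat
        totals[key]["valley"] += valley
    return dict(totals)

records = [
    ("A", date(2021, 1, 1), 10.0, 6.0, 4.0),
    ("A", date(2021, 1, 2), 12.0, 7.0, 5.0),
    ("A", date(2021, 2, 1), 8.0, 5.0, 3.0),
]
monthly = monthly_totals(records)
```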

Optionally, since abnormal items may arise during data collection, the data may be preprocessed as follows: the minimum granularity among the preset time granularities is determined; the resource usage data is traversed at that minimum granularity to find missing items and negative items; and a replacement value is substituted for the missing and negative items, so that the resource usage data is preprocessed into the unlabeled samples. For example, missing and negative electricity values in the daily electricity consumption are uniformly replaced by 0, which ensures data integrity.
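As a small illustration of this replacement step, the sketch below treats `None` as a missing reading and uses 0 as the replacement value, as in the example above. The function name and the choice of `None` to represent a missing item are assumptions made for illustration.

```python
def fill_missing_and_negative(readings, replacement=0.0):
    """Replace missing (None) and negative readings with the replacement value."""
    return [replacement if (v is None or v < 0) else v for v in readings]

cleaned = fill_missing_and_negative([3.2, None, -1.5, 0.0, 7.1])
```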

Optionally, the mean value of the resource usage data may also be obtained; outlier items that far exceed the mean value are then determined in the resource usage data; and the value of each outlier item is replaced with the mean value, so that the resource usage data is preprocessed into the unlabeled samples. For example, mean filling is applied to data whose electricity value exceeds 100 times the mean of the user's total electricity consumption, which prevents abnormal data from distorting the data as a whole.
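A minimal sketch of the mean-filling rule follows. It assumes the mean is taken over the same household's series of readings, which the text does not spell out; the function name and the `factor` parameter are illustrative.

```python
def mean_fill_outliers(readings, factor=100.0):
    """Replace readings greater than factor * mean with the mean of the series."""
    if not readings:
        return []
    mean = sum(readings) / len(readings)
    return [mean if v > factor * mean else v for v in readings]

series = [1.0] * 200 + [100000.0]   # one reading far above the rest
filled = mean_fill_outliers(series)
```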

Optionally, during preprocessing, the data correspondence in the resource usage data may also be determined; abnormal items in that correspondence are then extracted; and the duplicate values in the abnormal items are screened, so that the resource usage data is preprocessed into the unlabeled samples. This handles the case where a change of storage format causes two electricity values to be stored for one time point: one of the two values is randomly selected as the electricity value of that time point, which keeps a one-to-one correspondence between time points and values.
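The duplicate-resolution step can be sketched as below, assuming readings are given as (timestamp, value) pairs; the names are hypothetical and the optional `rng` argument is only there to make the random choice reproducible.

```python
import random
from collections import defaultdict

def resolve_duplicate_timestamps(rows, rng=None):
    """Keep one randomly chosen value for every time point that has several."""
    rng = rng or random.Random()
    grouped = defaultdict(list)
    for timestamp, value in rows:
        grouped[timestamp].append(value)
    return {ts: rng.choice(values) for ts, values in sorted(grouped.items())}

rows = [("2021-01-01", 5.0), ("2021-01-01", 5.5), ("2021-01-02", 4.0)]
deduplicated = resolve_duplicate_timestamps(rows, random.Random(0))
```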

Meanwhile, based on the small number of positive samples marked as group rental housing provided by the urban management law enforcement department, the corresponding samples in the user electricity consumption data are labeled 1 and all other, unlabeled samples are uniformly labeled 0, thereby constructing a group rental housing identification data set.

In one possible scenario, the target object set is the set of users in a residential community, or a user group of larger (regional) or smaller (building) scope, and the resource usage data is electricity consumption. The positive samples come from a third-party enforcement platform that supervises group rental housing objects, for example open data from a government regulatory department.

302. Perform feature extraction on the positive samples based on at least one feature item to obtain a target feature dimension.

In this embodiment, the feature item is set based on the resource usage features of the resource consumption abnormal object at the preset time granularity, that is, the resource usage features of group rental objects. Because resource usage data changes over time in a way that reflects the characteristics of group rental housing, the target feature dimension is the feature dimension used to identify group rental housing.

Specifically, the target feature dimension is obtained by extracting a series of fine-grained features aiming at the difference of power consumption modes of group renting users and normal users, and feature learning is carried out based on the target feature dimension in the subsequent model training process, so that the sensitivity of the classification model to group renting house data is improved.

Optionally, the target feature dimension may be derived from a numerical dimension: first, the numerical feature corresponding to the feature item is determined; feature extraction is then performed on the positive samples based on the numerical feature to obtain numerical span information; and the target feature dimension is determined according to the numerical span information. The reason is that group rental housing is altered so that more people live in one dwelling, more electrical appliances are used, and electricity consumption is correspondingly high. The corresponding target feature dimensions are therefore: the ranking of the user's electricity consumption within the corresponding residential community; statistical features of the user's flat-period and valley-period values; statistical features of the user's flat-period, valley-period, and total electricity over the last year; the number of months in which the user's flat value is below 50 and valley value is below 10; and the like. The specific numerical composition depends on the actual scenario and is not limited herein.
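Two of the numerical features listed above, the ranking within a residential community and the count of low-usage months, can be sketched as follows. The function names, the input layout (a mapping of household to usage, and per-month lists), and the tie-breaking order of the ranking are assumptions for illustration; the thresholds 50 and 10 are the ones named in the text.

```python
def rank_in_cell(usage_by_household):
    """Map each household to its rank within the community (1 = highest usage)."""
    ordered = sorted(usage_by_household, key=usage_by_household.get, reverse=True)
    return {household: position + 1 for position, household in enumerate(ordered)}

def low_usage_month_count(flat_by_month, valley_by_month, flat_limit=50, valley_limit=10):
    """Count months whose flat value is below flat_limit and valley value below valley_limit."""
    return sum(
        1
        for flat, valley in zip(flat_by_month, valley_by_month)
        if flat < flat_limit and valley < valley_limit
    )

ranks = rank_in_cell({"A": 300.0, "B": 520.0, "C": 110.0})
low_months = low_usage_month_count([40, 60, 30], [5, 5, 20])
```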

Optionally, numerical fluctuation may also be determined based on the numerical feature: first, the numerical features within a preset time range (for example, one year of data) are associated to obtain a fluctuation feature; feature extraction is then performed on the positive samples based on the fluctuation feature to obtain a feature fluctuation range; and the target feature dimension is determined according to the feature fluctuation range. The reason is that tenants of group rental housing are highly mobile and the dwelling has vacancy periods, so the user's electricity consumption fluctuates relatively strongly. The corresponding target feature dimensions are therefore: the change in the user's flat-period, valley-period, and total electricity between corresponding months of the current year and the previous year; and the user's fluctuation between adjacent months in the last year (for example, the difference between electricity usage in month 1 and month 2).
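The two fluctuation features can be written down directly. The sketch below assumes monthly usage is available as ordered lists; the function names are illustrative.

```python
def adjacent_month_differences(monthly_usage):
    """Difference between each month and the month before it."""
    return [later - earlier for earlier, later in zip(monthly_usage, monthly_usage[1:])]

def year_over_year_change(current_year, last_year):
    """Change between corresponding months of the current year and the previous year."""
    return [current - previous for current, previous in zip(current_year, last_year)]

month_deltas = adjacent_month_differences([100.0, 80.0, 120.0])
yearly_deltas = year_over_year_change([100.0, 90.0], [80.0, 95.0])
```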

Optionally, group rental housing may also be judged based on feature segments in the numerical feature: first, the feature time period corresponding to the numerical feature is determined; feature extraction is then performed on the positive samples based on the feature time period to obtain time-period resource usage information; and the target feature dimension is determined according to the time-period resource usage information. The reason is that group rental tenants are more likely to be office workers, who use little electricity during the day, so the difference between flat-period and valley-period electricity consumption is larger. The corresponding feature is: the ratio of the electricity usage from 8 am to 5 pm to the electricity usage from 6 pm to 12 pm.
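A sketch of this ratio is given below under two assumptions that go beyond the text: hourly readings are available (the embodiment itself describes daily and monthly data split into flat, peak, and valley periods), and "6 pm to 12 pm" is read as 6 pm to midnight. The function name and the `None` return for an evening total of zero are illustrative choices.

```python
def day_to_evening_ratio(hourly_kwh):
    """hourly_kwh holds 24 values; index h covers the hour starting at h:00."""
    day = sum(hourly_kwh[8:17])        # 8 am up to 5 pm
    evening = sum(hourly_kwh[18:24])   # 6 pm up to midnight (assumed reading of "12 pm")
    if evening == 0:
        return None                    # ratio undefined without evening usage
    return day / evening

hourly = [0.0] * 24
hourly[8:17] = [1.0] * 9
hourly[18:24] = [3.0] * 6
ratio = day_to_evening_ratio(hourly)
```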

In addition, a feature segment in the numerical feature may be caused by tenants moving out: a group rental dwelling may have a vacancy period during which the user consumes little electricity. The corresponding features are: the proportion of zero electricity values within the month, and the proportion of electricity outliers (for example, values below the mean minus three times the variance).
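The two vacancy features can be sketched as below, assuming the daily readings of one month are given as a list. The outlier threshold follows the wording of the text literally (mean minus three times the variance, computed here as the population variance); a practitioner might prefer the standard deviation, which is a different rule from the one stated.

```python
from statistics import mean, pvariance

def zero_usage_ratio(daily_kwh):
    """Share of readings in the period that are exactly zero."""
    return sum(1 for v in daily_kwh if v == 0) / len(daily_kwh)

def low_outlier_ratio(daily_kwh):
    """Share of readings below mean - 3 * variance, as worded in the text."""
    threshold = mean(daily_kwh) - 3 * pvariance(daily_kwh)
    return sum(1 for v in daily_kwh if v < threshold) / len(daily_kwh)

zero_share = zero_usage_ratio([0, 0, 5.0, 5.0])
outlier_share = low_outlier_ratio([1.0] * 9 + [0.0])
```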

Optionally, adjacent numerical features may be compared to obtain a periodic feature; the positive samples are then analyzed based on the periodic feature to obtain a feature period; and the target feature dimension is determined according to the feature period. The reason is that the electricity consumption of ordinary residents shows a certain periodicity and stability, so the electricity values of adjacent months are similar, whereas for group rental users this similarity is unstable. Group rental housing can therefore be identified through the periodic feature.
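The text does not fix a particular similarity measure for adjacent months. One possible choice, shown here only as an assumed example, is the average ratio of the smaller to the larger of each adjacent pair, where 1.0 means perfectly stable usage; the function name is hypothetical and non-negative usage values are assumed.

```python
def adjacent_month_similarity(monthly_usage):
    """Average min/max ratio over adjacent months; 1.0 means perfectly stable usage."""
    ratios = []
    for earlier, later in zip(monthly_usage, monthly_usage[1:]):
        larger = max(earlier, later)
        ratios.append(min(earlier, later) / larger if larger > 0 else 1.0)
    return sum(ratios) / len(ratios) if ratios else 1.0

stable = adjacent_month_similarity([100.0, 100.0, 100.0])
unstable = adjacent_month_similarity([100.0, 50.0, 100.0])
```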

It can be understood that, since the target feature dimension is extracted from positive samples that have been confirmed as group rental housing, the feature dimension is representative of group rental housing.

303. Input the unlabeled samples and the positive samples into a semi-supervised learning framework to iteratively train a classification model in the semi-supervised learning framework based on the target feature dimension.

In this embodiment, unlabeled samples that meet a preset condition during the iterative training process are labeled as supplementary samples, and the supplementary samples are used to update the positive samples. The iterative training process may also be called a heuristic training process and is shown in the scenario architecture of fig. 4, where fig. 4 is a scenario diagram of another method for identifying a resource consumption abnormal object provided in the embodiment of the present application. The unlabeled user samples are given the label 0 and randomly divided into N parts; in each round, one part is combined with the verified user samples as the training set for LightGBM (the classification model) training, the remaining N-1 parts are predicted, and the samples predicted as positive are added to the positive samples of the training set for the next round of training. Proceeding in this way yields the prediction probabilities of all unverified user samples.

It is understood that more than one part may be used in each round; however, to keep an excess of negative samples from affecting the identification result, a single part may be selected for each round of iterative labeling. The specific number depends on the actual scenario.

Specifically, the training process is as follows: a training set is generated based on the unlabeled samples and the positive samples; the training set is input into the semi-supervised learning framework, and a portion of the unlabeled samples is randomly extracted as negative samples; a preset model is trained on the positive and negative samples to obtain the classification model; the unlabeled samples that were not extracted are then identified with the classification model to obtain an identification feature value for each of them; the identification feature values are screened against a preset condition (the feature value exceeds a specific value, or ranks near the top when sorted by size), supplementary samples are extracted from the unlabeled samples, and the positive samples are updated with the supplementary samples. For example, the classification model is applied to the out-of-bag (OOB) unlabeled samples that are not in the training set, their scores are recorded, and the samples with higher probability values are added to the positive samples. The random extraction is then repeated, the classification model in the semi-supervised learning framework is iteratively trained based on the target feature dimension, and N-1 predicted probability values are finally obtained for each unlabeled sample suspected to be a group rental housing sample.
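The iterative procedure above can be sketched as follows. This is a minimal illustration under several assumptions, not the embodiment's implementation: `centroid_scorer` is a toy stand-in for the LightGBM classifier (it scores a sample by its relative distance to the class centroids), the promotion threshold of 0.9 is arbitrary, and samples already promoted to positive are left out of later negative sets, which the text does not specify. All function and parameter names are hypothetical.

```python
import random

def centroid_scorer(positives, negatives):
    """Toy stand-in for the classifier: a score near 1 means closer to the positives."""
    def centroid(rows):
        return [sum(column) / len(rows) for column in zip(*rows)]

    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    c_pos, c_neg = centroid(positives), centroid(negatives)

    def score(x):
        d_pos, d_neg = distance(x, c_pos), distance(x, c_neg)
        return 0.5 if d_pos + d_neg == 0 else d_neg / (d_pos + d_neg)

    return score

def pu_iterative_training(positives, unlabeled, n_parts, threshold=0.9,
                          seed=0, fit=centroid_scorer):
    """Return, per unlabeled index, the n_parts - 1 scores gathered over all rounds."""
    if n_parts < 2 or len(unlabeled) < n_parts:
        raise ValueError("need at least 2 parts and one unlabeled sample per part")
    rng = random.Random(seed)
    order = list(range(len(unlabeled)))
    rng.shuffle(order)
    parts = [order[i::n_parts] for i in range(n_parts)]  # random split into N parts

    positive_pool = list(positives)
    promoted = set()
    scores = {i: [] for i in range(len(unlabeled))}

    for part in parts:
        # One part acts as the negative class for this round.
        negatives = [unlabeled[i] for i in part if i not in promoted]
        if not negatives:
            negatives = [unlabeled[i] for i in part]
        model = fit(positive_pool, negatives)
        in_part = set(part)
        for i in order:
            if i in in_part:
                continue  # only the other N-1 parts are predicted
            probability = model(unlabeled[i])
            scores[i].append(probability)
            if probability > threshold and i not in promoted:
                promoted.add(i)                      # supplementary sample
                positive_pool.append(unlabeled[i])   # used from the next round on
    return scores

positives = [[10.0, 10.0], [9.0, 11.0]]
unlabeled = [[0, 0], [1, 0], [0, 1], [10, 9], [1, 1], [0, 2]]
scores = pu_iterative_training(positives, unlabeled, n_parts=3, threshold=0.8)
```

In this toy data the fourth unlabeled sample lies close to the verified positives, so it collects the highest scores. Swapping in a real classifier only requires passing a different `fit` callable that returns a scoring function.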

It can be understood that the classification model here is the lightweight LightGBM model. Since the electricity consumption data of residential community unit users has time-series characteristics, a sequence model such as LSTM may also be used for binary classification; such a model extracts features automatically to capture changes in users' electricity consumption patterns, which removes the need to design statistical features by hand. The specific classification model depends on the actual scenario.

304. Acquire a plurality of identification feature values corresponding to the unlabeled samples output by the classification model during the iterative training process, and fuse the identification feature values to determine the resource consumption abnormal object in the target object set.

In this embodiment, based on step 303, N-1 predicted probability values can be obtained for each unlabeled sample suspected to be a group rental sample. Mean fusion can therefore be performed on the N-1 results predicted by the classification model (e.g., a LightGBM binary classification model) during training, and the probability that an unlabeled sample is a group rental sample is output, which may specifically adopt the following formula:

$$\bar{X} = \frac{1}{N-1}\sum_{i=1}^{N-1} X_i$$

where $X_1$ to $X_{N-1}$ are the N-1 predicted probability values of each unlabeled sample suspected to be a group rental sample, and $\bar{X}$ is the resulting prediction mean.

The group rental objects in the target object set are then judged based on the calculated prediction mean. Specifically, unlabeled samples whose prediction mean is greater than a preset value (for example, 0.9) are taken as group rental objects; alternatively, the unlabeled samples in the target object set are sorted by prediction mean in descending order and the top 5 are taken as group rental objects. The specific approach depends on the actual scenario.
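Mean fusion and the two selection rules can be sketched as follows. The function names and the sample identifiers are illustrative; the threshold 0.9 and the top-5 default are the example values from the text.

```python
def fuse_mean(predictions):
    """Average the N-1 predicted probabilities of one unlabeled sample."""
    return sum(predictions) / len(predictions)

def select_by_threshold(mean_by_object, threshold=0.9):
    """Objects whose prediction mean is greater than the preset value."""
    return [obj for obj, value in mean_by_object.items() if value > threshold]

def select_top_k(mean_by_object, k=5):
    """The k objects with the largest prediction mean, in descending order."""
    return sorted(mean_by_object, key=mean_by_object.get, reverse=True)[:k]

means = {
    "A": fuse_mean([0.95, 0.97]),
    "B": fuse_mean([0.20, 0.40]),
    "C": fuse_mean([0.92, 0.85]),
}
flagged = select_by_threshold(means)
top_two = select_top_k(means, k=2)
```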

With reference to the foregoing embodiment, the resource usage data of the target object set is obtained as unlabeled samples, and the resource usage data of objects verified as resource consumption abnormal objects is retrieved as positive samples, where the number of positive samples is smaller than the number of unlabeled samples. Feature extraction is then performed on the positive samples based on at least one feature item to obtain a target feature dimension, where the feature item is set based on resource usage features at a preset time granularity. The unlabeled samples and positive samples are input into a semi-supervised learning framework, and the classification model in the framework is iteratively trained based on the target feature dimension, where unlabeled samples meeting a preset condition during iterative training are labeled as supplementary samples that are used to update the positive samples. Finally, a plurality of identification feature values corresponding to the unlabeled samples output by the classification model during iterative training is acquired and fused to determine the resource consumption abnormal objects in the target object set.
In this way, resource consumption abnormal objects can be identified from a small number of positive samples. Extracting their features along several target feature dimensions keeps the classification model sensitive to those features, and positive samples among the unlabeled samples are continuously and heuristically mined during training and added to the next iteration. This effectively addresses the sample imbalance problem in the identification scenario and improves the accuracy of identifying resource consumption abnormal objects.

Next, a description is given of an identification process of a group rented house based on electricity consumption by combining with a module design, as shown in fig. 5, fig. 5 is a flowchart of another identification method of an abnormal resource consumption object provided in the embodiment of the present application, and the embodiment of the present application at least includes the following module execution steps:

501. Data acquisition module.

In this embodiment, the data acquisition module is used to acquire the daily-granularity and monthly-granularity electricity consumption data of residential community unit users provided by a power grid company, and to collect the small number of positive samples marked as group rental housing provided by the urban management law enforcement department. Specifically, the data includes monthly granularity data, namely each household's monthly total electricity, monthly flat-period electricity, and monthly valley-period electricity over the last two years; and daily granularity data, namely each household's daily total electricity, daily flat-period electricity, daily peak-period electricity, and daily valley-period electricity over the last two years.

502. Data preprocessing module.

In this embodiment, the data preprocessing module is used to handle abnormal values in the user electricity consumption data; in addition, the data matching the small number of positive samples marked as group rental housing is labeled 1, and the remaining unlabeled samples are uniformly labeled 0. Specifically, missing electricity values and negative electricity values may be uniformly replaced by 0; mean filling is applied to data whose electricity value exceeds 100 times the mean of the user's total electricity consumption; and for a time point at which two electricity values are stored because of a change of storage format, one of the two values is randomly selected as the electricity value of that time point.

503. Feature engineering module.

In the embodiment, the feature engineering module is used for extracting a series of fine-grained features from a group renting scene according to the difference of group renting users and normal users in the power utilization mode; specific feature dimensions are shown in the description of the embodiment shown in fig. 3, and are not described herein again.

504. Model training module.

In this embodiment, the model training module is configured to train a LightGBM binary classification model on the constructed data set within the PU Learning training framework. First, all positive samples are randomly combined with unlabeled samples to create a training set. Second, a classifier is built on this "bootstrap" sample, treating the positive samples as positive and the unlabeled samples as negative. Third, the classifier is applied to the out-of-bag (OOB) unlabeled samples that are not in the training set, their scores are recorded, and the samples with higher probability values are added to the positive samples. These three steps are repeated, finally yielding N-1 predicted probability values for each unlabeled sample suspected to be a group rental housing sample.

505. Result output module.

In this embodiment, the result output module is configured to fuse the results predicted by the LightGBM binary classification model during training and to output the probability that an unlabeled sample is a group rental sample. That is, mean fusion is performed on the N-1 results predicted by the LightGBM binary classification model during training, and the resulting mean is output as that probability.

In this embodiment, the PU Learning semi-supervised learning method is introduced into the group rental housing identification scenario. Based on the daily-granularity and monthly-granularity electricity consumption data of residential communities provided by a power grid company, a small number of positive samples and a large number of unlabeled samples are randomly combined to construct a training set, effective group rental housing identification features are extracted, and the LightGBM algorithm is used to train a binary classification model. Positive samples among the unlabeled samples are heuristically mined and added to the next iteration, which effectively addresses the sample imbalance problem in the group rental housing identification scenario and greatly improves the precision of the identification algorithm. The method can rapidly output highly suspicious group rental households, which helps law enforcement personnel carry out targeted on-site sampling and investigation and manage the group rental phenomenon effectively.

The above embodiment describes a group rental housing identification process based on electricity consumption, but in an actual scenario, multidimensional identification can be performed according to various resource usage data, and the scenario is described below. Referring to fig. 6, fig. 6 is a flowchart of another method for identifying an abnormal object of resource consumption according to an embodiment of the present application, where the embodiment of the present application at least includes the following steps:

601. Determine the dimension-specific data of the resource usage data corresponding to the target object set under different data dimensions.

In this embodiment, the dimension-specific data under different data dimensions is classified separately, so that data of different types does not interfere with each other, which ensures the accuracy of data processing.

602. Electricity usage data.

In this embodiment, the electricity consumption data is electricity quantity data of each user account, and may specifically be statistical data at a granularity of every day, every month, or other time.

603. Water usage data.

In this embodiment, the water consumption data is water volume data in each user account, and may specifically be statistical data at a granularity of every day, every month, or at other time.

604. Network consumption data.

In this embodiment, the network consumption data is the network resource consumption data under each broadband account; that is, statistics are collected at the network router, which avoids the data being split across the different tenants of one dwelling.

605. Acquire a first predicted value.

In this embodiment, the first predicted value is a predicted value obtained by performing the identification method in the embodiment shown in fig. 3 with the power consumption data as the resource usage data.

606. Acquire a second predicted value.

In this embodiment, the second predicted value is a predicted value obtained by performing the identification method of the embodiment shown in fig. 3 using the water consumption data as the resource usage data.

607. Acquire a third predicted value.

In this embodiment, the third predicted value is a predicted value obtained by performing the identification method of the embodiment shown in fig. 3 with the network consumption data as the resource usage data.

608. A weighting calculation is performed to determine a group tenant object.

In this embodiment, since electricity consumption, water consumption, and network consumption may each correlate with group rental housing to a different degree, different weights may be set when calculating the final value. For example, the weighted calculation may use electricity : water : network = 0.5 : 0.3 : 0.2, which helps ensure the accuracy of group rental object identification.
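The weighted calculation can be sketched as below, using the example weights 0.5 : 0.3 : 0.2 from the text. The function name and the input values are illustrative; the three inputs stand for the first, second, and third predicted values.

```python
def weighted_score(electricity, water, network, weights=(0.5, 0.3, 0.2)):
    """Combine the three per-dimension predicted values into one final value."""
    w_electricity, w_water, w_network = weights
    return w_electricity * electricity + w_water * water + w_network * network

final_value = weighted_score(electricity=0.9, water=0.8, network=0.5)
```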

The instant reminding function of the house renting software in the terminal is explained below. Referring to fig. 7, fig. 7 is a flowchart of another method for identifying an abnormal object of resource consumption according to an embodiment of the present application, where the embodiment of the present application at least includes the following steps:

701. Determine the target house source in response to a triggering operation on the house renting interface.

In this embodiment, the triggering operation on the renting interface is a clicking operation on the house source, which may be specifically in a process of viewing details or a process of telephone contact.

702. Retrieve the resource usage information of the corresponding area based on the target house source, so as to perform group rental housing identification.

In this embodiment, for house sources in an area where group rental housing has not been marked, the semi-supervised identification approach applies, so the identification process of the embodiment shown in fig. 3 needs to be performed on the resource usage information of the area corresponding to the target house source; details are not repeated here. For house sources in an area where group rental housing has already been marked, the identification result is obtained directly by traversing the marks.

703. Display a prompt element for the target house source in the house renting interface.

In this embodiment, as shown in fig. 8, fig. 8 is a scenario diagram of another method for identifying a resource consumption abnormal object according to the embodiment of the present application. The figure shows that after the user clicks a house source, the server is triggered to retrieve the resource usage data and perform group rental housing identification to obtain a prediction result. If the prediction result is group rental housing, a prompt element A1 for the target house source is displayed in the house renting interface, which warns the user that the house source may be group rental housing and should be inspected on site, thereby ensuring the reliability of the house renting software.

In order to better implement the above-mentioned aspects of the embodiments of the present application, the following also provides related apparatuses for implementing the above-mentioned aspects. Referring to fig. 9, fig. 9 is a schematic structural diagram of an apparatus for identifying an abnormal object of resource consumption according to an embodiment of the present application, where the apparatus 900 includes:

an obtaining unit 901, configured to obtain resource usage data of a target object set as unlabeled samples, and to call resource usage data verified as belonging to a resource consumption abnormal object as positive samples, where the number of the positive samples is smaller than the number of the unlabeled samples;

an extracting unit 902, configured to perform feature extraction on the positive sample based on at least one feature item to obtain a target feature dimension, where the feature item is set based on resource usage features at a preset time granularity;

a training unit 903, configured to input the unlabeled samples and the positive samples into a semi-supervised learning framework, so as to perform iterative training on a classification model in the semi-supervised learning framework based on the target feature dimension, where unlabeled samples meeting a preset condition in the iterative training process are labeled as complementary samples, and the complementary samples are used for updating the positive samples;

an identifying unit 904, configured to obtain a plurality of identifying feature values corresponding to the unlabeled samples output by the classification model in the iterative training process, and perform fusion based on the identifying feature values to determine a resource consumption abnormal object in the target object set.

Optionally, in some possible implementation manners of the present application, the obtaining unit 901 is specifically configured to determine candidate data corresponding to the target object set within a data statistics range;

the obtaining unit 901 is specifically configured to divide the candidate data based on the preset time granularity to obtain granularity data;

the obtaining unit 901 is specifically configured to count resource feature items in the granularity data to obtain resource usage data corresponding to the target object set;

the obtaining unit 901 is specifically configured to perform preprocessing on the resource usage data to obtain the unlabeled sample.
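
The division and counting steps described for the obtaining unit 901 can be sketched as follows. This is only a minimal illustration under stated assumptions: the record layout `(object_id, timestamp, value)`, the daily granularity, and the function name `aggregate_by_granularity` are hypothetical and are not taken from the application.

```python
from collections import defaultdict
from datetime import datetime

def aggregate_by_granularity(records, start, end, fmt="%Y-%m-%d"):
    """Sum readings per object and per time bucket.

    records: iterable of (object_id, datetime, value) tuples (assumed layout).
    start, end: data statistics range, with end exclusive.
    fmt: strftime pattern that defines the time granularity (daily by default).
    """
    usage = defaultdict(lambda: defaultdict(float))
    for object_id, ts, value in records:
        if start <= ts < end:                  # keep candidate data inside the range
            bucket = ts.strftime(fmt)          # divide by the preset time granularity
            usage[object_id][bucket] += value  # count the resource item per bucket
    return {obj: dict(buckets) for obj, buckets in usage.items()}

records = [
    ("house_a", datetime(2021, 1, 1, 8), 1.5),
    ("house_a", datetime(2021, 1, 1, 20), 2.5),
    ("house_a", datetime(2021, 1, 2, 9), 3.0),
    ("house_b", datetime(2021, 1, 1, 7), 0.5),
    ("house_b", datetime(2020, 12, 31, 23), 9.0),  # outside the range, ignored
]
daily = aggregate_by_granularity(records, datetime(2021, 1, 1), datetime(2021, 1, 3))
```

Changing `fmt` (for example to `"%Y-%m"`) would give a coarser granularity in this sketch; the application itself does not prescribe how the granularity is encoded.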

Optionally, in some possible implementation manners of the present application, the obtaining unit 901 is specifically configured to determine the minimum granularity among the preset time granularities;

the obtaining unit 901 is specifically configured to traverse the resource usage data at the minimum granularity to determine vacancy items (missing values) and negative value items;

the obtaining unit 901 is specifically configured to invoke a replacement value to replace the vacancy items and the negative value items, so as to preprocess the resource usage data to obtain the unlabeled sample.
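
As a rough illustration of this replacement step, the sketch below assumes that a vacancy item is represented as `None` and that the replacement value is a constant; both choices, and the name `replace_invalid`, are assumptions made for the example and are not specified by the application.

```python
def replace_invalid(series, replacement=0.0):
    """Replace vacancy (None) and negative readings with a replacement value.

    series: readings at the minimum time granularity (assumed to be a list).
    replacement: hypothetical constant; the application does not fix its value.
    """
    return [replacement if v is None or v < 0 else v for v in series]

cleaned = replace_invalid([1.2, None, -3.0, 0.0, 4.5])
```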

Optionally, in some possible implementation manners of the present application, the obtaining unit 901 is specifically configured to obtain the mean value of the resource usage data;

the obtaining unit 901 is specifically configured to determine outlier items in the resource usage data that exceed the mean value;

the obtaining unit 901 is specifically configured to replace the numerical value of each outlier item with the mean value, so as to preprocess the resource usage data to obtain the unlabeled sample.
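
A minimal sketch of this mean-based replacement is given below. The text only states that items exceeding the mean are replaced; the multiplier `factor` is a hypothetical parameter added here so that ordinary above-average readings are kept, and setting `factor=1.0` gives the literal reading of the text.

```python
from statistics import mean

def replace_outliers_with_mean(series, factor=3.0):
    """Replace readings above factor * mean with the mean itself.

    factor is an assumption of this sketch, not a value from the application.
    """
    avg = mean(series)
    threshold = avg * factor
    return [avg if v > threshold else v for v in series]

smoothed = replace_outliers_with_mean([1.0, 2.0, 3.0, 2.0, 42.0])
```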

Optionally, in some possible implementation manners of the present application, the obtaining unit 901 is specifically configured to determine a data corresponding relationship in the resource usage data;

the obtaining unit 901 is specifically configured to extract an abnormal item in the data correspondence;

the obtaining unit 901 is specifically configured to screen a coincidence value in the abnormal item, so as to preprocess the resource usage data to obtain the unlabeled sample.

Optionally, in some possible implementations of the present application, the extracting unit 902 is specifically configured to determine a numerical feature corresponding to the feature item;

the extracting unit 902 is specifically configured to perform feature extraction on the positive sample based on the numerical features to obtain numerical span information;

the extracting unit 902 is specifically configured to determine the target feature dimension according to the numerical span information.

Optionally, in some possible implementation manners of the present application, the extracting unit 902 is specifically configured to associate the numerical features within a preset time range to obtain a fluctuation feature;

the extracting unit 902 is specifically configured to perform feature extraction on the positive sample based on the fluctuation feature to obtain a feature fluctuation range;

the extracting unit 902 is specifically configured to determine the target feature dimension according to the feature fluctuation range.

Optionally, in some possible implementations of the present application, the extracting unit 902 is specifically configured to determine a feature time period corresponding to the numerical feature;

the extracting unit 902 is specifically configured to perform feature extraction on the positive sample based on the feature time interval to obtain time interval resource usage information;

the extracting unit 902 is specifically configured to determine the target feature dimension according to the time interval resource usage information.

Optionally, in some possible implementation manners of the present application, the extracting unit 902 is specifically configured to compare adjacent numerical features to obtain a periodic feature;

the extracting unit 902 is specifically configured to analyze the positive sample based on the period feature to obtain a feature period;

the extracting unit 902 is specifically configured to determine the target feature dimension according to the feature period.
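
The four kinds of feature described for the extracting unit 902 (numerical span, fluctuation, time-interval usage, and comparison of adjacent values) might be computed roughly as follows. This is a sketch under assumptions: the input is taken to be hourly readings covering whole days, the night window of 00:00 to 06:00 is hypothetical, and the adjacent-day change is only a crude proxy for the periodic feature, whose exact definition the text does not give.

```python
from statistics import mean, pstdev

def extract_features(hourly, night_hours=range(0, 6)):
    """Derive illustrative feature values from hourly readings of one object.

    hourly: list whose length is a multiple of 24 (assumed layout).
    night_hours: hypothetical time interval of interest.
    """
    days = [hourly[i:i + 24] for i in range(0, len(hourly), 24)]
    daily = [sum(day) for day in days]
    total = sum(daily)
    night = sum(day[h] for day in days for h in night_hours)
    changes = [abs(b - a) for a, b in zip(daily, daily[1:])]
    return {
        "span": max(daily) - min(daily),                  # numerical span
        "fluctuation": pstdev(daily),                     # fluctuation over the time range
        "night_share": night / total if total else 0.0,  # time-interval resource usage
        "adjacent_change": mean(changes) if changes else 0.0,  # adjacent comparison
    }

features = extract_features([1.0] * 24 + [2.0] * 24)
```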

Optionally, in some possible implementations of the present application, the training unit 903 is specifically configured to generate a training set based on the unlabeled sample and the positive sample;

the training unit 903 is specifically configured to input the training set into the semi-supervised learning framework, and randomly extract a part of samples from the unlabeled samples as negative samples;

the training unit 903 is specifically configured to train a preset model based on the positive sample and the negative sample to obtain the classification model;

the training unit 903 is specifically configured to identify an unextracted unlabeled sample according to the classification model to obtain an identification feature value corresponding to each sample in the unextracted unlabeled sample;

the training unit 903 is specifically configured to screen the identification feature value based on the preset condition, to extract a supplementary sample from the unlabeled sample, and to update the positive sample based on the supplementary sample;

the training unit 903 is specifically configured to repeat the process of random extraction, so as to perform the iterative training on the classification model in the semi-supervised learning framework based on the target feature dimension.
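
The iterative procedure performed by the training unit 903 can be sketched as the loop below. The sketch is illustrative only: the nearest-centroid scorer stands in for the preset model, and `neg_size`, `threshold` (the preset condition), `rounds`, and all function names are hypothetical choices rather than details from the application.

```python
import random

def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def train(positives, negatives):
    """Stand-in for the preset model: a nearest-centroid scorer (assumption)."""
    cp, cn = centroid(positives), centroid(negatives)
    def score(x):
        dp, dn = distance(x, cp), distance(x, cn)
        return dn / (dp + dn) if dp + dn else 0.5  # near 1.0 means "looks positive"
    return score

def iterative_pu_training(positives, unlabeled, rounds=5, neg_size=3,
                          threshold=0.9, seed=0):
    rng = random.Random(seed)
    positives = list(positives)
    pool = dict(enumerate(unlabeled))   # index -> vector, still unlabeled
    scores = {i: [] for i in pool}      # identification feature values per sample
    for _ in range(rounds):
        if len(pool) < 2:
            break
        k = min(neg_size, len(pool) - 1)
        neg_idx = set(rng.sample(sorted(pool), k))  # random draw of negatives
        model = train(positives, [pool[i] for i in neg_idx])
        for i in sorted(pool):
            if i in neg_idx:
                continue                # only unextracted samples are scored
            s = model(pool[i])
            scores[i].append(s)
            if s >= threshold:          # preset condition (hypothetical value)
                positives.append(pool.pop(i))  # supplementary sample joins positives
    return positives, scores

known = [(10.0, 10.0), (9.0, 10.0), (10.0, 9.0)]
unlabeled = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (0.0, 0.5),
             (9.5, 9.5), (10.0, 9.5)]
final_positives, scores = iterative_pu_training(known, unlabeled)
```

Each unlabeled sample accumulates one score per round in which it was not drawn as a negative, which is what makes a later fusion of several identification feature values possible.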

Optionally, in some possible implementation manners of the present application, the identifying unit 904 is specifically configured to determine special data of the resource usage data corresponding to the target object set in different data dimensions;

the identification unit 904 is specifically configured to obtain a plurality of predicted values corresponding to the special data respectively;

the identifying unit 904 is specifically configured to perform weighted calculation on the plurality of predicted values to obtain a target feature value;

the identifying unit 904 is specifically configured to determine, based on the target feature value, a resource consumption abnormal object in the target object set.
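
The weighted calculation performed by the identifying unit 904 can be illustrated with a weighted average. The weights, the decision cutoff of 0.5, and the function names below are assumptions made for this sketch; the application does not state how the weights or the decision rule are chosen.

```python
def fuse_predictions(predictions, weights):
    """Weighted average of several predicted values for one object."""
    if not predictions or len(predictions) != len(weights):
        raise ValueError("predictions and weights must be non-empty and equally long")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(p * w for p, w in zip(predictions, weights)) / total

def is_abnormal(predictions, weights, cutoff=0.5):
    """Flag the object when the fused target feature value reaches the cutoff."""
    return fuse_predictions(predictions, weights) >= cutoff

target = fuse_predictions([0.9, 0.6, 0.3], [2.0, 1.0, 1.0])
```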

An embodiment of the present application further provides a terminal device. Fig. 10 is a schematic structural diagram of another terminal device provided in the embodiment of the present application. For convenience of description, only the portion related to the embodiment of the present application is shown; for specific technical details that are not disclosed, please refer to the method portion of the embodiments of the present application. The terminal may be any terminal device, including a mobile phone, a tablet computer, a personal digital assistant (PDA), a point of sale (POS) terminal, a vehicle-mounted computer, and the like. The following takes a mobile phone as an example:

Fig. 10 is a block diagram illustrating a partial structure of a mobile phone related to the terminal provided in an embodiment of the present application. Referring to fig. 10, the mobile phone includes: a radio frequency (RF) circuit 1010, a memory 1020, an input unit 1030, a display unit 1040, a sensor 1050, an audio circuit 1060, a wireless fidelity (WiFi) module 1070, a processor 1080, and a power source 1090. Those skilled in the art will appreciate that the mobile phone structure shown in fig. 10 is not limiting: the mobile phone may include more or fewer components than those shown, may combine some components, or may arrange the components differently.

The following describes each component of the mobile phone in detail with reference to fig. 10:

The RF circuit 1010 may be used to receive and transmit signals during information transmission and reception or during a call. In particular, after receiving downlink information from a base station, the RF circuit 1010 passes it to the processor 1080 for processing; in addition, it transmits uplink data to the base station. In general, the RF circuit 1010 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 1010 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to global system for mobile communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Message Service (SMS), etc.

The memory 1020 can be used for storing software programs and modules, and the processor 1080 executes various functional applications and data processing of the mobile phone by operating the software programs and modules stored in the memory 1020. The memory 1020 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. Further, the memory 1020 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.

The input unit 1030 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the mobile phone. Specifically, the input unit 1030 may include a touch panel 1031 and other input devices 1032. The touch panel 1031, also referred to as a touch screen, may collect touch operations performed by a user on or near it (for example, operations performed on or near the touch panel 1031 with a finger, a stylus, or any other suitable object or accessory, as well as non-contact touch operations within a certain range of the touch panel 1031) and drive the corresponding connection devices according to a preset program. Optionally, the touch panel 1031 may include two parts: a touch detection device and a touch controller. The touch detection device detects the touch position of the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 1080, and can receive and execute commands sent by the processor 1080. In addition, the touch panel 1031 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. The input unit 1030 may include other input devices 1032 in addition to the touch panel 1031. In particular, the other input devices 1032 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, a joystick, and the like.

The display unit 1040 may be used to display information input by a user or information provided to the user and various menus of the cellular phone. The display unit 1040 may include a display panel 1041, and optionally, the display panel 1041 may be configured in the form of a Liquid Crystal Display (LCD), an organic light-emitting diode (OLED), or the like. Further, the touch panel 1031 can cover the display panel 1041, and when the touch panel 1031 detects a touch operation on or near the touch panel 1031, the touch operation is transmitted to the processor 1080 to determine the type of the touch event, and then the processor 1080 provides a corresponding visual output on the display panel 1041 according to the type of the touch event. Although in fig. 10, the touch panel 1031 and the display panel 1041 are two separate components to implement the input and output functions of the mobile phone, in some embodiments, the touch panel 1031 and the display panel 1041 may be integrated to implement the input and output functions of the mobile phone.

The handset may also include at least one sensor 1050, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor, wherein the ambient light sensor may adjust the brightness of the display panel 1041 according to the brightness of ambient light, and the proximity sensor may turn off the display panel 1041 and/or the backlight when the mobile phone moves to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing the posture of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile phone, further description is omitted here.

The audio circuit 1060, a speaker 1061, and a microphone 1062 may provide an audio interface between the user and the mobile phone. The audio circuit 1060 can convert received audio data into an electrical signal and transmit it to the speaker 1061, which converts it into a sound signal for output; conversely, the microphone 1062 converts a collected sound signal into an electrical signal, which the audio circuit 1060 receives and converts into audio data. The audio data is then output to the processor 1080 for processing and subsequently sent, for example, to another mobile phone via the RF circuit 1010, or output to the memory 1020 for further processing.

WiFi belongs to short-distance wireless transmission technology, and the mobile phone can help the user to send and receive e-mail, browse web pages, access streaming media, etc. through the WiFi module 1070, which provides wireless broadband internet access for the user. Although fig. 10 shows the WiFi module 1070, it is understood that it does not belong to the essential constitution of the handset, and may be omitted entirely as needed within the scope not changing the essence of the invention.

The processor 1080 is the control center of the mobile phone. It connects the various parts of the whole mobile phone through various interfaces and lines, and performs the functions of the mobile phone and processes data by running or executing the software programs and/or modules stored in the memory 1020 and calling the data stored in the memory 1020, thereby monitoring the mobile phone as a whole. Optionally, the processor 1080 may include one or more processing units; optionally, the processor 1080 may integrate an application processor, which primarily handles the operating system, user interfaces, application programs, and the like, and a modem processor, which primarily handles wireless communication. It is to be appreciated that the modem processor described above may also not be integrated into the processor 1080.

The handset also includes a power source 1090 (e.g., a battery) for powering the various components, which may optionally be logically coupled to the processor 1080 via a power management system to manage charging, discharging, and power consumption via the power management system.

Although not shown, the mobile phone may further include a camera, a bluetooth module, etc., which are not described herein.

In the embodiment of the present application, the processor 1080 included in the terminal further has a function of executing the steps of the above method for identifying a resource consumption abnormal object.

Referring to fig. 11, fig. 11 is a schematic structural diagram of a server provided in an embodiment of the present application. The server 1100 may vary considerably depending on its configuration or performance, and may include one or more central processing units (CPUs) 1122 (e.g., one or more processors), a memory 1132, and one or more storage media 1130 (e.g., one or more mass storage devices) storing an application program 1142 or data 1144. The memory 1132 and the storage media 1130 may be transient storage or persistent storage. The program stored on the storage medium 1130 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Further, the central processing unit 1122 may be configured to communicate with the storage medium 1130 and to execute, on the server 1100, the series of instruction operations in the storage medium 1130.

The server 1100 may also include one or more power supplies 1126, one or more wired or wireless network interfaces 1150, one or more input-output interfaces 1158, and/or one or more operating systems 1141, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, and so forth.

The steps performed by the identification apparatus in the above-described embodiments may be based on the server structure shown in fig. 11.

An embodiment of the present application further provides a computer-readable storage medium, in which a group tenant identification instruction is stored, and when the instruction is executed on a computer, the computer is caused to perform the steps performed by the group tenant identification apparatus in the method described in the foregoing embodiments shown in fig. 3 to 8.

Also provided in the embodiments of the present application is a computer program product including instructions for identifying a group tenant, which when run on a computer, causes the computer to perform the steps performed by the group tenant identifying apparatus in the method described in the embodiments of fig. 3 to 8.

The embodiment of the present application further provides a group tenant identification system, where the group tenant identification system may include the group tenant identification device in the embodiment described in fig. 9, the terminal device in the embodiment described in fig. 10, or the server described in fig. 11.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a logical functional division, and other divisions are possible in actual implementation; for instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in another form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a group rental housing identification apparatus, or a network device) to perform all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims

1. A method for identifying resource consumption abnormality is characterized by comprising the following steps:

acquiring resource usage data of a target object set as unlabeled samples, and calling resource usage data verified as belonging to a resource consumption abnormal object as positive samples, wherein the number of the positive samples is less than the number of the unlabeled samples;

performing feature extraction on the positive sample based on at least one feature item to obtain a target feature dimension, wherein the feature item is set based on the resource use feature of the resource consumption abnormal object under a preset time granularity;

and acquiring a plurality of identification characteristic values corresponding to the unlabeled samples output by the classification model in the iterative training process, and fusing based on the identification characteristic values to determine the resource consumption abnormal object in the target object set.

2. The method of claim 1, wherein obtaining resource usage data for a set of target objects as unlabeled samples comprises:

pre-processing the resource usage data to obtain the unlabeled sample.

3. The method of claim 2, wherein the pre-processing the resource usage data to obtain the unlabeled sample comprises:

determining the minimum granularity in the preset time granularities;

4. The method of claim 2, wherein the pre-processing the resource usage data to obtain the unlabeled sample comprises:

acquiring the average value number in the resource use data;

5. The method of claim 2, wherein the pre-processing the resource usage data to obtain the unlabeled sample comprises:

determining a data corresponding relation in the resource use data;

extracting abnormal items in the data corresponding relation;

6. The method of claim 1, wherein the feature extracting the positive sample based on at least one feature term to obtain a target feature dimension comprises:

7. The method of claim 6, further comprising:

8. The method of claim 6, further comprising:

9. The method of claim 6, further comprising:

and determining the target feature dimension according to the feature period.

10. The method of claim 1, wherein the inputting the unlabeled samples and the positive samples into a semi-supervised learning framework for iterative training of a classification model in the semi-supervised learning framework based on the target feature dimensions comprises:

11. The method according to any one of claims 1-10, further comprising:

determining resource consumption exception objects in the target object set based on the target feature values.

12. The method of claim 1, wherein the resource consumption exception object is a group tenant object, the set of target objects is a set of cell users, the resource usage data is power usage, and the positive sample is sourced from an executable third party platform that is used to supervise the group tenant object.

13. An apparatus for identifying an abnormal object of resource consumption, comprising:

14. A computer device, the computer device comprising a processor and a memory:

the memory is used for storing program codes; the processor is configured to execute the method for identifying a resource consumption exception object according to any one of claims 1 to 12 in accordance with instructions in the program code.

15. A computer-readable storage medium having stored therein instructions which, when run on a computer, cause the computer to execute the method for identifying a resource consumption abnormality object according to any one of the above claims 1 to 12.