CN109086377B - Equipment portrait generation method and device and computing equipment - Google Patents

Publication number
CN109086377B
Authority
CN
China
Prior art keywords
information
equipment
predicted
dimension
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810818733.4A
Other languages
Chinese (zh)
Other versions
CN109086377A (en)
Inventor
汪德嘉
叶芸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing tongfudun Artificial Intelligence Technology Co., Ltd
JIANGSU PAY EGIS TECHNOLOGY Co.,Ltd.
Original Assignee
Beijing Tongfudun Artificial Intelligence Technology Co Ltd
Jiangsu Pay Egis Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Tongfudun Artificial Intelligence Technology Co Ltd, Jiangsu Pay Egis Technology Co ltd filed Critical Beijing Tongfudun Artificial Intelligence Technology Co Ltd
Priority to CN201810818733.4A priority Critical patent/CN109086377B/en
Publication of CN109086377A publication Critical patent/CN109086377A/en
Application granted granted Critical
Publication of CN109086377B publication Critical patent/CN109086377B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method and an apparatus for generating a device portrait, and a computing device. The method comprises the following steps: extracting feature information of a device to be predicted; analyzing the feature information of the device to be predicted from multiple dimensions respectively to obtain feature analysis results for each dimension; obtaining label information for each dimension of the device to be predicted according to the feature analysis result of each dimension; and generating a device portrait of the device to be predicted according to the label information of each dimension. The method makes full use of the massive feature information of the device to be predicted, analyzes it from multiple dimensions, and profiles the device along multiple dimensions, so that the generated device portrait is more comprehensive and accurate.

Description

Equipment portrait generation method and device and computing equipment
Technical Field
The invention relates to the technical field of user portraits, and in particular to a method and an apparatus for generating a device portrait, and to a computing device.
Background
A user portrait describes the attributes of a user's characteristics along various dimensions; these characteristics are analyzed and aggregated to mine potentially valuable information and to abstract an overall profile of the user. Because of this potential value, user portraits are widely applied in many fields. Mobile devices such as smartphones, tablet computers, and notebook computers are now widespread, and each device generates massive amounts of data every day. Extracting the various feature data of a device and analyzing it along multiple dimensions indirectly characterizes the user of that device, and is therefore an effective way to generate a user portrait.
Disclosure of Invention
In view of the above, the present invention has been made to provide a method, apparatus and computing device for generating a device representation that overcomes or at least partially solves the above mentioned problems.
According to an aspect of the invention, there is provided a method of generating a device representation, the method comprising:
extracting characteristic information of equipment to be predicted;
analyzing the feature information of the equipment to be predicted from multiple dimensions respectively to obtain feature analysis results of the dimensions;
obtaining label information of each dimension of the equipment to be predicted according to the feature analysis result of each dimension;
and generating a device portrait of the device to be predicted according to the label information of each dimension of the device to be predicted.
Optionally, the feature information of the device to be predicted includes: self characteristic information of the equipment to be predicted;
analyzing the feature information of the device to be predicted from multiple dimensions, respectively, and obtaining feature analysis results of each dimension further comprises:
analyzing the own characteristic information of the equipment to be predicted from multiple dimensions respectively to obtain own characteristic analysis results of the dimensions;
obtaining label information of each dimension of the device to be predicted according to the feature analysis result of each dimension further comprises:
and obtaining first label information of each dimension of the equipment to be predicted according to the own characteristic analysis result of each dimension.
Optionally, the feature information of the device to be predicted further includes: associated feature information between devices;
analyzing the feature information of the device to be predicted from multiple dimensions respectively to obtain feature analysis results of each dimension, and obtaining label information of each dimension of the device to be predicted according to the feature analysis results of each dimension further comprises: according to the associated feature information between the devices, searching for similar devices of the devices to be predicted through the first machine learning model;
obtaining second label information of the equipment to be predicted according to the label information of the similar equipment;
generating an equipment portrait of the equipment to be predicted according to the label information of each dimension of the equipment to be predicted specifically as follows: and generating a device portrait of the device to be predicted according to the first label information and the second label information of each dimension of the device to be predicted.
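The derivation of second label information from similar devices can be sketched as follows. This is only an illustrative sketch under stated assumptions: the `is_similar` callable stands in for the first machine learning model, and the majority-share rule, the `min_share` parameter, and the field names `wifi` and `labels` are hypothetical and do not come from the invention.

```python
from collections import Counter

def second_labels(target, known_devices, is_similar, min_share=0.5):
    """Collect labels that are common among devices similar to the target.

    is_similar is a stand-in for the first machine learning model (assumption).
    min_share is a hypothetical threshold: a label is kept when at least this
    share of the similar devices carries it.
    """
    similar = [d for d in known_devices if is_similar(target, d)]
    if not similar:
        return []
    # Count each label once per similar device.
    votes = Counter(label for d in similar for label in set(d["labels"]))
    return sorted(label for label, n in votes.items() if n / len(similar) >= min_share)

# Made-up sample data; "same wireless network" is used as a toy similarity test.
known = [
    {"id": "d1", "wifi": "w1", "labels": ["student"]},
    {"id": "d2", "wifi": "w1", "labels": ["student", "game player"]},
    {"id": "d3", "wifi": "w2", "labels": ["office worker"]},
]
same_wifi = lambda a, b: a["wifi"] == b["wifi"]
print(second_labels({"id": "t", "wifi": "w1"}, known, same_wifi, min_share=0.6))
```

The labels returned in this way would then be combined with the first label information when the device portrait is generated.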
Optionally, the multiple dimensions specifically include:
a device geolocation dimension, an application installation situation dimension, a device behavior dimension, and/or a device security dimension;
analyzing the own feature information of the device to be predicted from multiple dimensions respectively, and obtaining own feature analysis results of the dimensions specifically comprises:
analyzing the information of the IP address used by the equipment to be predicted, the information of the wireless network connected with the equipment to be predicted and the information of the base station connected with the equipment to be predicted to obtain the self-characteristic analysis result of the geographical position dimension;
and/or analyzing the applications installed on the device to be predicted to obtain the own feature analysis result of the application installation situation dimension;
and/or analyzing the hardware parameter information of the device to be predicted to obtain the own feature analysis result of the security dimension;
and/or analyzing, through a second machine learning model, information on whether the device to be predicted has installed an application that does not meet a preset security condition, boot time information of the device to be predicted, information on whether the device to be predicted is jailbroken, and/or information on whether an emulator is installed on the device to be predicted, to obtain the own feature analysis result of the device behavior dimension.
Optionally, analyzing the hardware parameter information of the device to be predicted to obtain the own feature analysis result of the security dimension further includes:
analyzing, respectively, the IMEI information, MAC address information, SIM card state information, battery level information, remaining disk space information, total disk space information, resolution information, and/or information on whether the brand and model of the device to be predicted match, to obtain an own feature analysis result indicating whether the hardware parameters of the device to be predicted are forged.
Optionally, the associated characteristic information between the devices includes one or more of the following information:
the information of whether the IP addresses used by the equipment are the same, the information of whether the wireless networks connected with the equipment are the same, the information of whether the base stations connected with the equipment are the same, the information of whether the applications installed on the equipment meet the preset application similar conditions, the information of whether the behavior data of the equipment meet the preset behavior data similar conditions, the comparison result of software parameters among the equipment and the comparison result of hardware parameters among the equipment.
Optionally, the method further comprises:
labeling whether the behavior of each sample device is suspicious according to the own feature information of the device behavior dimension of that sample device, to obtain a device behavior labeling result;
and training the second machine learning model according to the own feature information of the device behavior dimension of the sample devices and the device behavior labeling result.
Optionally, the method further comprises:
labeling whether two sample devices are associated according to the associated feature information between every two sample devices, to obtain an associated device labeling result;
and training the first machine learning model according to the associated feature information between the two sample devices and the associated device labeling result.
According to another aspect of the present invention, there is provided an apparatus for generating a device representation, the apparatus comprising:
the characteristic extraction module is suitable for extracting the characteristic information of the equipment to be predicted;
the analysis module is suitable for analyzing the characteristic information of the equipment to be predicted from a plurality of dimensions respectively to obtain characteristic analysis results of the dimensions;
the label information determining module is suitable for obtaining label information of each dimension of the equipment to be predicted according to the feature analysis result of each dimension;
and the equipment portrait generation module is suitable for generating the equipment portrait of the equipment to be predicted according to the label information of each dimension of the equipment to be predicted.
Optionally, the feature information of the device to be predicted includes: self characteristic information of the equipment to be predicted;
the analysis module is further adapted to:
analyzing the own characteristic information of the equipment to be predicted from multiple dimensions respectively to obtain own characteristic analysis results of the dimensions;
the tag information determination module is further adapted to: and obtaining first label information of each dimension of the equipment to be predicted according to the own characteristic analysis result of each dimension.
Optionally, the feature information of the device to be predicted further includes: associated feature information between devices;
the analysis module is further adapted to: according to the associated feature information between the devices, searching for similar devices of the devices to be predicted through the first machine learning model;
the tag information determination module is further adapted to: obtaining second label information of the equipment to be predicted according to the label information of the similar equipment;
the device representation generation module is further adapted to: and generating a device portrait of the device to be predicted according to the first label information and the second label information of each dimension of the device to be predicted.
Optionally, the multiple dimensions specifically include:
a device geolocation dimension, an application installation situation dimension, a device behavior dimension, and/or a device security dimension;
the analysis module is further adapted to:
analyzing the information of the IP address used by the equipment to be predicted, the information of the wireless network connected with the equipment to be predicted and the information of the base station connected with the equipment to be predicted to obtain the self-characteristic analysis result of the geographical position dimension;
and/or analyzing the applications installed on the device to be predicted to obtain the own feature analysis result of the application installation situation dimension;
and/or analyzing the hardware parameter information of the device to be predicted to obtain the own feature analysis result of the security dimension;
and/or analyzing, through a second machine learning model, information on whether the device to be predicted has installed an application that does not meet a preset security condition, boot time information of the device to be predicted, information on whether the device to be predicted is jailbroken, and/or information on whether an emulator is installed on the device to be predicted, to obtain the own feature analysis result of the device behavior dimension.
Optionally, the analysis module is further adapted to:
analyzing, respectively, the IMEI information, MAC address information, SIM card state information, battery level information, remaining disk space information, total disk space information, resolution information, and/or information on whether the brand and model of the device to be predicted match, to obtain an own feature analysis result indicating whether the hardware parameters of the device to be predicted are forged.
Optionally, the associated characteristic information between the devices includes one or more of the following information:
the information of whether the IP addresses used by the equipment are the same, the information of whether the wireless networks connected with the equipment are the same, the information of whether the base stations connected with the equipment are the same, the information of whether the applications installed on the equipment meet the preset application similar conditions, the information of whether the behavior data of the equipment meet the preset behavior data similar conditions, the comparison result of software parameters among the equipment and the comparison result of hardware parameters among the equipment.
Optionally, the apparatus further comprises:
the model training module is suitable for marking whether the behavior of each sample device is suspicious according to the self-characteristic information of the device behavior dimension of each sample device to obtain a device behavior marking result;
and training the second machine learning model according to the own characteristic information of the equipment behavior dimension of the sample equipment and the equipment behavior marking result.
Optionally, the model training module is further adapted to:
labeling whether two sample devices are associated according to the associated feature information between every two sample devices, to obtain an associated device labeling result;
and training the first machine learning model according to the associated feature information between the two sample devices and the associated device labeling result.
The invention provides a device portrait generation method and apparatus and a computing device. The method comprises the following steps: extracting feature information of a device to be predicted; analyzing the feature information of the device to be predicted from multiple dimensions respectively to obtain feature analysis results for each dimension; obtaining label information for each dimension of the device to be predicted according to the feature analysis result of each dimension; and generating a device portrait of the device to be predicted according to the label information of each dimension. The method makes full use of the massive feature information of the device to be predicted, analyzes it from multiple dimensions, and profiles the device along multiple dimensions, so that the generated device portrait is more comprehensive and accurate.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 illustrates a flow diagram of a device representation generation method in accordance with one embodiment of the present invention;
FIG. 2 illustrates a flow diagram of a device representation generation method in accordance with another embodiment of the invention;
FIG. 3 illustrates a functional block diagram of a device representation generation apparatus in accordance with yet another embodiment of the invention;
FIG. 4 shows a schematic structural diagram of a computing device according to an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
FIG. 1 shows a flow diagram of a device representation generation method according to an embodiment of the invention, as shown in FIG. 1, the method comprising:
Step S101, extracting feature information of the device to be predicted.
In this embodiment, generating a device portrait of the device to be predicted may also be understood as generating a user portrait of the user who uses the device to be predicted.
The feature information of the device to be predicted covers many aspects, and each kind of feature information can be analyzed to extract one characteristic of the user of the device. For example, if analysis of the application installation information shows that several mobile games are installed on the device to be predicted, it can be determined that the user of the device is a game player. The method of this embodiment therefore uses the feature information of the device to be predicted to build its portrait.
The invention does not limit the specific content of the extracted characteristic information of the equipment to be predicted, and the technical personnel in the field can set the characteristic information according to the actual requirement.
Step S102, analyzing the feature information of the equipment to be predicted from multiple dimensions respectively to obtain feature analysis results of each dimension.
The extracted feature information of the device to be predicted is analyzed from multiple dimensions respectively, for example a device geographical position dimension, an application installation situation dimension, a device behavior dimension, a device security dimension, and/or an association dimension. It should be noted that the feature information to be analyzed often differs between dimensions, and the technical means used to analyze it may differ as well. For example, the feature analysis result of the geographical position dimension is obtained by analyzing feature information that can represent the geographical position of the device to be predicted; as another example, a machine learning model analyzes feature information that can represent whether the device to be predicted is similar to other devices, yielding a feature analysis result indicating whether devices similar to the device to be predicted exist.
In the present invention, the characteristic information of the device to be predicted is related to various aspects, and the dimension of analysis of the characteristic information of the device to be predicted is also related to various aspects, which is not particularly limited by the present invention.
Step S103, obtaining label information for each dimension of the device to be predicted according to the feature analysis result of each dimension.
The device to be predicted is labeled in each dimension according to the feature analysis result of that dimension. For example, if analysis from the geographical position dimension shows that the device to be predicted is located in a certain office building area from Monday to Friday every week, it can be inferred that the user of the device is an office worker, and the device is given the label "office worker".
Step S104, generating a device portrait of the device to be predicted according to the label information of each dimension of the device to be predicted.
The label information of all dimensions of the device to be predicted is integrated and taken together as the device portrait of the device to be predicted.
According to the device portrait generation method provided by this embodiment, the feature information of the device to be predicted is first extracted; the feature information is then analyzed from multiple dimensions to obtain feature analysis results for each dimension; label information for each dimension of the device to be predicted is then obtained from the feature analysis results; and finally a device portrait of the device to be predicted is generated from the label information of each dimension. The method makes full use of the massive feature information of the device to be predicted, analyzes it from multiple dimensions, and profiles the device along multiple dimensions, so that the generated device portrait is more comprehensive and accurate.
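The flow of steps S101 to S104 can be sketched as a small pipeline. This is a minimal sketch only: the two analyzers, the labeling rules, the threshold `min_games`, and all field names are hypothetical illustrations, since the invention does not limit the feature information or the analysis means.

```python
from collections import Counter

# All names, thresholds, and labels below are hypothetical illustrations.

def analyze_app_installation(features):
    """S102, application installation dimension: count installed apps per category."""
    return Counter(app["category"] for app in features.get("apps", []))

def analyze_geolocation(features):
    """S102, geographical position dimension: most frequently observed area, if any."""
    areas = features.get("weekday_areas", [])
    return Counter(areas).most_common(1)[0][0] if areas else None

def label_app_installation(result, min_games=3):
    """S103: several installed games suggest a game player (threshold is assumed)."""
    return ["game player"] if result.get("game", 0) >= min_games else []

def label_geolocation(result):
    """S103: weekdays spent mostly in an office building suggest an office worker."""
    return ["office worker"] if result == "office building" else []

# Each dimension pairs one analyzer with one labeling rule.
DIMENSIONS = {
    "app_installation": (analyze_app_installation, label_app_installation),
    "geolocation": (analyze_geolocation, label_geolocation),
}

def generate_device_portrait(features):
    """S102 to S104: analyze per dimension, label per dimension, integrate the labels."""
    portrait = {}
    for name, (analyze, label) in DIMENSIONS.items():
        result = analyze(features)       # S102: feature analysis result
        portrait[name] = label(result)   # S103: label information
    return portrait                      # S104: integrated device portrait

# S101: extracted feature information (made-up sample).
sample_features = {
    "apps": [
        {"name": "g1", "category": "game"},
        {"name": "g2", "category": "game"},
        {"name": "g3", "category": "game"},
        {"name": "s1", "category": "shopping"},
    ],
    "weekday_areas": ["office building", "office building", "residential"],
}
portrait = generate_device_portrait(sample_features)
print(portrait)
```

Keeping one analyzer and one labeling rule per dimension reflects the point made above that different dimensions may need different feature information and different technical means.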
FIG. 2 shows a flow diagram of a device portrait generation method according to another embodiment of the invention. As shown in FIG. 2, the method comprises:
Step S201, extracting the own feature information of the device to be predicted and the associated feature information between devices.
In this embodiment, generating a device portrait of the device to be predicted may also be understood as generating a user portrait of the user who uses the device to be predicted.
The own feature information of the device to be predicted comprises one or more of the following: information on the IP address used by the device to be predicted, information on the wireless network to which it is connected, information on the base station to which it is connected, the applications installed on it, its hardware parameter information, information on whether it has installed an application that does not meet the preset security condition, its boot time information, information on whether it is jailbroken, and/or information on whether an emulator is installed on it. Applications that do not meet the preset security condition include disguise-type applications, such as WiFi connection disguise applications, device model disguise applications, "camouflage master" applications, and the like.
The associated feature information between devices comprises one or more of the following: information on whether the IP addresses used by the devices are the same, information on whether the wireless networks to which the devices are connected are the same, information on whether the base stations to which the devices are connected are the same, information on whether the applications installed on the devices meet a preset application similarity condition, information on whether the behavior data of the devices meet a preset behavior data similarity condition, the comparison result of software parameters between the devices, and the comparison result of hardware parameters between the devices. Whether the applications installed on the devices meet the preset application similarity condition may specifically mean that the distribution of application types installed on the different devices is the same or similar. For example, if shopping applications account for 20% of all applications installed on device A and for 20.5% of all applications installed on device B, the applications installed on device A and device B are considered to meet the preset application similarity condition. The invention is of course not limited to this.
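The application similarity condition described above can be illustrated with a short sketch that compares the share of each application type on two devices. The tolerance `max_diff` and the rule that every type must agree within that tolerance are assumptions made for illustration; the invention leaves the concrete condition open.

```python
from collections import Counter

def category_shares(app_categories):
    """Share of each application type among all applications installed on one device."""
    total = len(app_categories)
    if total == 0:
        return {}
    return {cat: n / total for cat, n in Counter(app_categories).items()}

def apps_similar(cats_a, cats_b, max_diff=0.01):
    """Hypothetical similarity condition: every type's share differs by at most max_diff."""
    shares_a, shares_b = category_shares(cats_a), category_shares(cats_b)
    all_cats = set(shares_a) | set(shares_b)
    if not all_cats:
        return False  # no installed applications on either device: nothing to compare
    return all(
        abs(shares_a.get(cat, 0.0) - shares_b.get(cat, 0.0)) <= max_diff
        for cat in all_cats
    )

# Device A: 40 of 200 applications are shopping applications (20%).
device_a = ["shopping"] * 40 + ["other"] * 160
# Device B: 41 of 200 applications are shopping applications (20.5%).
device_b = ["shopping"] * 41 + ["other"] * 159
print(apps_similar(device_a, device_b))
```

With the assumed tolerance of one percentage point, the 20% and 20.5% shares from the example above count as similar.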
The behavior data of a device may refer to its online behavior data, such as payment behavior data and web page access behavior data. Whether the behavior data of the devices meet the preset behavior data similarity condition may specifically mean that the difference between the times at which different devices generate the same behavior data is smaller than a preset time threshold, or that the time distributions with which different devices access the same type of web page are similar. The invention is of course not limited to this.
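The time threshold variant of the behavior data similarity condition can be sketched as follows. The data layout (a mapping from behavior type to a timestamp in seconds) and the default threshold of 60 seconds are assumptions for illustration only.

```python
def behaviors_similar(events_a, events_b, max_gap_seconds=60):
    """Hypothetical check of the behavior data similarity condition.

    events_a and events_b map a behavior type (for example "payment") to the
    time, in seconds, at which the device generated that behavior data.
    The devices count as similar when they share at least one behavior type and
    every shared type was generated within max_gap_seconds on both devices.
    """
    shared = set(events_a) & set(events_b)
    if not shared:
        return False
    return all(abs(events_a[k] - events_b[k]) < max_gap_seconds for k in shared)

# Made-up sample data.
device_a = {"payment": 1000, "page_view": 2000}
device_b = {"payment": 1030, "page_view": 2010}
device_c = {"payment": 5000}
print(behaviors_similar(device_a, device_b), behaviors_similar(device_a, device_c))
```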
Step S202, analyzing the own characteristic information of the equipment to be predicted from multiple dimensions respectively to obtain own characteristic analysis results of the dimensions.
Wherein the plurality of dimensions specifically include: a device geolocation dimension, an application installation dimension, a device behavior dimension, and/or a device security dimension. The feature information to be analyzed often differs from dimension to dimension, and the technical means for analyzing the corresponding feature information from different dimensions may also differ.
Specifically, the geographical position of the device to be predicted is determined through information of an IP address used by the device to be predicted, information of a wireless network connected with the device to be predicted and/or information of a base station connected with the device to be predicted, and a self-characteristic analysis result of a geographical position dimension is obtained.
The applications installed on the device to be predicted are analyzed to determine their types and numbers and the industry fields to which they belong, yielding the own feature analysis result of the application installation situation dimension.
Whether the device to be predicted is secure is determined by analyzing its hardware parameter information, yielding the own feature analysis result of the security dimension. Specifically, the IMEI information, MAC address information, SIM card state information, battery level information, remaining disk space information, total disk space information, resolution information, and/or information on whether the brand and model of the device match are analyzed to determine whether the hardware parameters of the device to be predicted are forged, yielding an own feature analysis result indicating whether they are forged.
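A few plausibility checks on hardware parameters can be sketched as follows. The invention does not specify how forgery is detected, so every rule here is an assumption: the IMEI is checked for 15 digits and a valid Luhn check digit, the battery level must lie between 0 and 100, the remaining disk space must not exceed the total, and the brand and model are compared against a hypothetical lookup table `KNOWN_MODELS`.

```python
# Hypothetical brand-to-model table; a real system would need a maintained catalog.
KNOWN_MODELS = {"BrandA": {"A1", "A2"}, "BrandB": {"B1"}}

def luhn_valid(number: str) -> bool:
    """Luhn checksum, which the final digit of a 15-digit IMEI is meant to satisfy."""
    if not (number.isascii() and number.isdigit()):
        return False
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def forged_parameter_signs(hw):
    """Return the names of hardware parameters that look implausible (assumed rules)."""
    signs = []
    imei = hw.get("imei", "")
    if len(imei) != 15 or not luhn_valid(imei):
        signs.append("imei")
    if not 0 <= hw.get("battery_percent", -1) <= 100:
        signs.append("battery")
    if hw.get("free_disk", 0) > hw.get("total_disk", 0):
        signs.append("disk")
    if hw.get("model") not in KNOWN_MODELS.get(hw.get("brand"), set()):
        signs.append("brand_model")
    return signs

plausible = {"imei": "490154203237518", "battery_percent": 80,
             "free_disk": 10, "total_disk": 64, "brand": "BrandA", "model": "A1"}
dubious = {"imei": "123456789012345", "battery_percent": 150,
           "free_disk": 128, "total_disk": 64, "brand": "BrandA", "model": "B1"}
print(forged_parameter_signs(plausible), forged_parameter_signs(dubious))
```

An empty result only means that none of these assumed rules fired; it does not prove that the parameters are genuine.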
Whether the device to be predicted exhibits suspicious behavior is judged from information on whether it has installed an application that does not meet the preset security condition, its boot time information, information on whether it is jailbroken, and/or information on whether an emulator is installed on it, yielding the own feature analysis result of the device behavior dimension.
In addition, because the suspicious behavior of devices may change over time, this embodiment uses a second machine learning model to analyze whether the behavior of the device to be predicted is suspicious. The second machine learning model is trained on big data. Specifically, for each sample device, whether its behavior is suspicious is marked according to the own-feature information of its device behavior dimension, yielding a device behavior marking result; the second machine learning model is then trained according to the own-feature information of the device behavior dimension of the sample devices and the device behavior marking results. That is, for each sample device, the own-feature information of the device behavior dimension serves as one piece of training data, the sample device is marked as suspicious or not according to that piece of training data, and the second machine learning model is trained on the training data and the marking results, so that its parameters are continuously optimized and the accuracy of its predictions improves.
The second machine learning model may be a support vector machine, a decision tree, a neural network, a random forest, XGBoost, a gradient boosting tree, or a fusion of these models; the present invention does not limit this.
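As a hedged illustration of training the second machine learning model on marked sample devices, the sketch below fits a one-level decision tree (a decision stump), the smallest member of the decision-tree family listed above. The binary feature encoding, the sample data, and the function names are assumptions for illustration and are not taken from the patent.

```python
from typing import List, Sequence, Tuple

# Assumed feature order: [unsafe_app_installed, jailbroken, emulator_installed, recently_started]
SAMPLES = [
    [1, 1, 1, 1],
    [0, 0, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
]
# Device behavior marking result for each sample device: 1 = suspicious, 0 = not suspicious.
MARKS = [1, 1, 1, 0, 0, 0]


def train_stump(samples: List[List[int]], marks: List[int]) -> Tuple[int, int]:
    """Pick the (feature index, polarity) pair with the fewest training errors.

    The stump predicts "suspicious" when sample[feature index] == polarity.
    """
    best = (0, 1)
    best_errors = len(samples) + 1
    for j in range(len(samples[0])):
        for polarity in (1, 0):
            errors = sum(
                int((x[j] == polarity) != bool(y)) for x, y in zip(samples, marks)
            )
            if errors < best_errors:
                best, best_errors = (j, polarity), errors
    return best


def predict_suspicious(stump: Tuple[int, int], x: Sequence[int]) -> int:
    """Return 1 if the stump considers the device behavior suspicious, else 0."""
    j, polarity = stump
    return int(x[j] == polarity)


STUMP = train_stump(SAMPLES, MARKS)
```

Because suspicious behavior drifts over time, such a model would presumably be retrained as new marked samples arrive; the retraining schedule is not specified in the text.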
Step S203, obtaining first label information of each dimension of the equipment to be predicted according to the own feature analysis result of each dimension.
The self-feature analysis result of each dimension can reflect the identity, habit, behavior and other features of the user using the device to be predicted, so that the device to be predicted (the user using the device to be predicted) can be labeled according to the self-feature analysis result of each dimension.
For example, the distribution of the geographic locations of the device to be predicted can be determined from the own-feature analysis result of the geographic location dimension. If the analysis shows that, within one day, the geographic locations of the device are distributed over a certain school area and a certain high-end residential community, the first label information added to the device according to this result includes: student, high-end community user.
From the own-feature analysis result of the application installation dimension, the types and number of applications installed on the device to be predicted, and the industry fields to which they belong, can be determined. For example, if the analysis shows that several game applications are installed on the device, the first label information added to the device according to this result includes: game enthusiast.
From the own-feature analysis result of the device security dimension, it can be determined whether the hardware parameters of the device to be predicted are forged, and from the own-feature analysis result of the device behavior dimension, it can be determined whether the behavior of the device is suspicious. Whether the device exhibits suspicious behavior can therefore be judged from these two results together, and first label information indicating whether suspicious behavior exists is then added to the device.
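The labeling examples above can be sketched as simple rules. The area-to-label table, the threshold for "several game applications", the dimension keys, and the function name are all assumptions chosen for illustration; the patent does not fix them.

```python
from collections import Counter
from typing import Dict, Iterable, Set

# Hypothetical mapping from area type to label, mirroring the school / high-end community example.
AREA_LABELS = {"school": "student", "high_end_community": "high-end community user"}
GAME_APP_THRESHOLD = 3  # assumed cut-off for "several game applications"


def first_labels(
    area_types: Iterable[str],
    app_categories: Iterable[str],
    hardware_forged: bool,
    behavior_suspicious: bool,
) -> Dict[str, Set[str]]:
    """Map per-dimension analysis results to first label information."""
    labels: Dict[str, Set[str]] = {}
    # Geographic location dimension: one label per recognised area type.
    labels["geolocation"] = {AREA_LABELS[a] for a in area_types if a in AREA_LABELS}
    # Application installation dimension: count installed apps per category.
    counts = Counter(app_categories)
    labels["app_installation"] = (
        {"game enthusiast"} if counts["game"] >= GAME_APP_THRESHOLD else set()
    )
    # Security and behavior dimensions are judged together, as described above.
    suspicious = hardware_forged or behavior_suspicious
    labels["security_behavior"] = {
        "suspicious behavior" if suspicious else "no suspicious behavior"
    }
    return labels
```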
Step S204, searching for similar devices of the device to be predicted through the first machine learning model according to the associated feature information between devices.
In this embodiment, according to the associated feature information between the devices, similar devices associated with the device to be predicted are searched, and tag information determined according to the own feature information of the device to be predicted is supplemented by the tag information of the similar devices, so that the device to be predicted can be more comprehensively depicted.
The associated feature information between devices may be determined as follows: the IP addresses used, the wireless networks connected, the base stations connected, the installed applications, the behavior data, the software parameters, and/or the hardware parameters of the device to be predicted and of a sample device are compared respectively, to judge whether the two devices use the same IP address, are connected to the same wireless network, are connected to the same base station, have similar installed applications, have similar behavior data, have similar software parameters, and/or have similar hardware parameters. The comparison results constitute the associated feature information between the devices.
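The pairwise comparison can be sketched as follows. The field names (`ip`, `wifi_bssid`, `cell_id`, `apps`), the use of Jaccard similarity for installed applications, and the 0.5 threshold are assumptions standing in for the unspecified "preset application similarity condition".

```python
from typing import Dict, Set

APP_SIMILARITY_THRESHOLD = 0.5  # assumed value for the preset application similarity condition


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def association_features(dev_a: Dict, dev_b: Dict) -> Dict[str, bool]:
    """Compare two devices field by field and return the associated feature information."""
    return {
        "same_ip": dev_a["ip"] == dev_b["ip"],
        "same_wifi": dev_a["wifi_bssid"] == dev_b["wifi_bssid"],
        "same_base_station": dev_a["cell_id"] == dev_b["cell_id"],
        "similar_apps": jaccard(dev_a["apps"], dev_b["apps"]) >= APP_SIMILARITY_THRESHOLD,
    }
```

Behavior data, software parameters, and hardware parameters could be compared in the same way once a similarity condition is chosen for each.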
Searching for devices similar to the device to be predicted through the first machine learning model, according to the associated feature information between devices, gives a wide search range and a more accurate search result. The first machine learning model is trained on big data. Specifically, for every two sample devices, whether the two sample devices are associated is marked according to the associated feature information between them, yielding an associated-device marking result; the first machine learning model is then trained according to the associated feature information between the two sample devices and the associated-device marking result. That is, the associated feature information between every two sample devices serves as one piece of training data for the first machine learning model, the pair is marked as associated or not according to that piece of training data, and the model is trained on the training data and the marking results, so that its parameters are continuously optimized and the accuracy of its predictions improves.
The first machine learning model may be a support vector machine, a decision tree, a neural network, a random forest, XGBoost, a gradient boosting tree, or a fusion of these models; the present invention does not limit this.
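As a hedged sketch of training the first machine learning model on marked device pairs and then using it to search for similar devices, the code below uses a single perceptron, the simplest neural-network-style classifier. It is a stand-in chosen for brevity, not the model used by the patent; the feature order, training pairs, and names are assumptions.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed feature order for one device pair: (same_ip, same_wifi, same_base_station)
PAIRS = [(1, 1, 1), (1, 1, 0), (0, 1, 1), (0, 0, 0), (1, 0, 0), (0, 0, 1)]
# Associated-device marking result for each pair: 1 = associated, 0 = not associated.
MARKS = [1, 1, 1, 0, 0, 0]

Model = Tuple[List[float], float]


def train_perceptron(
    pairs: Sequence[Sequence[int]], marks: Sequence[int], max_epochs: int = 100
) -> Model:
    """Classic perceptron rule: adjust weights only on misclassified pairs."""
    weights = [0.0] * len(pairs[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(pairs, marks):
            target = 1 if y == 1 else -1
            score = sum(w * v for w, v in zip(weights, x)) + bias
            if target * score <= 0:  # wrong side of (or on) the boundary
                weights = [w + target * v for w, v in zip(weights, x)]
                bias += target
                mistakes += 1
        if mistakes == 0:  # every training pair is classified correctly
            break
    return weights, bias


def is_associated(model: Model, x: Sequence[int]) -> bool:
    weights, bias = model
    return sum(w * v for w, v in zip(weights, x)) + bias > 0


def find_similar_devices(model: Model, candidates: Dict[str, Sequence[int]]) -> List[str]:
    """Return ids of sample devices whose pair features with the target are predicted associated."""
    return [dev for dev, feats in candidates.items() if is_associated(model, feats)]


MODEL = train_perceptron(PAIRS, MARKS)
```

Here `candidates` maps each sample device id to the associated feature information computed between that sample device and the device to be predicted.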
Step S205, according to the label information of the similar equipment, obtaining second label information of the equipment to be predicted.
Specifically, all label information of the similar devices of the device to be predicted may be used as the second label information of the device to be predicted, or only the label information of the similar devices that differs from that of the device to be predicted may be used as the second label information. The invention is not limited in this regard.
It should be noted that the present invention does not limit the order between the process of determining the first label information of each dimension according to the own-feature information of the device to be predicted and the process of determining the second label information of the device to be predicted according to the associated feature information between devices.
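The two options for deriving the second label information correspond to a set union and a set difference. The sketch below assumes labels are plain strings held in sets; the function name and the `only_new` flag are hypothetical.

```python
from typing import Iterable, Set


def second_labels(
    own_labels: Set[str],
    similar_devices_labels: Iterable[Set[str]],
    only_new: bool = False,
) -> Set[str]:
    """Collect labels from similar devices.

    only_new=False: use all labels of the similar devices.
    only_new=True: keep only labels the device to be predicted does not already have.
    """
    merged: Set[str] = set().union(*similar_devices_labels)
    return merged - own_labels if only_new else merged
```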
Step S206, generating a device portrait of the device to be predicted according to the first label information and the second label information of each dimension of the device to be predicted.
The first label information and the second label information of each dimension of the device to be predicted are combined to generate the device portrait of the device to be predicted, that is, a user portrait of the user of the device.
From the above, two modes of determining the device portrait from the feature information of the device to be predicted can be identified, as follows:
Mode one: the own-feature information of the device to be predicted is analyzed from multiple dimensions, first label information is determined according to the own-feature analysis results of the multiple dimensions, and the device portrait is generated according to the first label information.
Mode two: the associated feature information of the device to be predicted is analyzed, second label information is determined according to the associated feature analysis result, and the device portrait is generated according to the second label information.
In practical applications, the device portrait of the device to be predicted may be generated according to mode one or mode two alone, or by combining mode one and mode two; the present invention does not limit this.
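A minimal sketch of portrait generation covering both modes and their combination is given below. Representing the portrait as a dictionary from dimension name to label set, and storing the second label information under a `similar_devices` key, are assumptions for illustration.

```python
from typing import Dict, Optional, Set


def generate_portrait(
    first: Optional[Dict[str, Set[str]]] = None,
    second: Optional[Set[str]] = None,
) -> Dict[str, Set[str]]:
    """Build a device portrait from first labels (mode one), second labels (mode two), or both."""
    # Copy the per-dimension first labels so the caller's data is not mutated.
    portrait = {dim: set(tags) for dim, tags in (first or {}).items()}
    if second:
        portrait["similar_devices"] = set(second)
    return portrait
```

Passing only `first` corresponds to mode one, passing only `second` to mode two, and passing both to the combined mode.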
In summary, the method of this embodiment addresses three aspects. First, the massive feature information of the device to be predicted is fully utilized and analyzed from multiple dimensions, so that the device is characterized in multiple dimensions. Second, whether the device to be predicted exhibits suspicious behavior is analyzed by combining a machine learning model with big data. Third, whether devices are associated is analyzed by combining a machine learning model with big data, and the label information of the device to be predicted is supplemented with the label information of similar devices, so that the generated device portrait is more comprehensive and accurate.
FIG. 3 shows a functional block diagram of a device portrait generation apparatus according to yet another embodiment of the invention. As shown in FIG. 3, the apparatus comprises:
a feature extraction module 31 adapted to extract feature information of a device to be predicted;
the analysis module 32 is adapted to analyze the feature information of the device to be predicted from multiple dimensions respectively to obtain feature analysis results of the dimensions;
the label information determining module 33 is adapted to obtain label information of each dimension of the device to be predicted according to the feature analysis result of each dimension;
and the device portrait generating module 34 is adapted to generate a device portrait of the device to be predicted according to the label information of each dimension of the device to be predicted.
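The four modules listed above form a simple pipeline, which can be sketched as follows. Modelling each module as a callable and chaining them in a `run` method is an assumption made for illustration; the patent describes the modules functionally and does not prescribe this structure.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Set


@dataclass
class PortraitApparatus:
    extract: Callable[[str], Dict[str, Any]]                       # feature extraction module 31
    analyze: Callable[[Dict[str, Any]], Dict[str, Any]]            # analysis module 32
    label: Callable[[Dict[str, Any]], Dict[str, Set[str]]]         # label information determining module 33
    portray: Callable[[Dict[str, Set[str]]], Dict[str, Set[str]]]  # device portrait generating module 34

    def run(self, device_id: str) -> Dict[str, Set[str]]:
        """Pass the device through modules 31 to 34 in order."""
        features = self.extract(device_id)
        results = self.analyze(features)
        labels = self.label(results)
        return self.portray(labels)
```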
Optionally, the feature information of the device to be predicted includes: self characteristic information of the equipment to be predicted;
the analysis module 32 is further adapted to:
analyzing the own characteristic information of the equipment to be predicted from multiple dimensions respectively to obtain own characteristic analysis results of the dimensions;
the tag information determination module 33 is further adapted to: and obtaining first label information of each dimension of the equipment to be predicted according to the own feature analysis result of each dimension.
Optionally, the feature information of the device to be predicted further includes: associated feature information between devices;
the analysis module 32 is further adapted to: according to the associated feature information between the devices, searching for similar devices of the devices to be predicted through the first machine learning model;
the tag information determination module 33 is further adapted to: obtaining second label information of the equipment to be predicted according to the label information of the similar equipment;
then device representation generation module 34 is further adapted to: and generating a device portrait of the device to be predicted according to the first label information and the second label information of each dimension of the device to be predicted.
Optionally, the multiple dimensions specifically include:
a device geolocation dimension, an application installation situation dimension, a device behavior dimension, and/or a device security dimension;
the analysis module 32 is further adapted to:
analyzing information on the IP address used by the device to be predicted, information on the wireless network to which the device is connected, and information on the base station to which the device is connected, to obtain the own-feature analysis result of the geographic location dimension;
and/or analyzing the applications installed on the device to be predicted to obtain the own-feature analysis result of the application installation situation dimension;
and/or analyzing the hardware parameter information of the device to be predicted to obtain the own-feature analysis result of the security dimension;
and/or analyzing, through a second machine learning model, information on whether an application that does not meet a preset security condition is installed on the device to be predicted, startup time information of the device, information on whether the device has been jailbroken, and/or information on whether an emulator is installed on the device, to obtain the own-feature analysis result of the device behavior dimension.
Optionally, the analysis module 32 is further adapted to:
and respectively analyzing IMEI information, MAC address information, state information of an SIM card, electric quantity information, residual disk space information, total disk space information, resolution information and/or information of whether the brand and the model of the equipment to be predicted are matched, and obtaining a self-characteristic analysis result of whether hardware parameters of the equipment to be predicted are forged or not.
Optionally, the associated characteristic information between the devices includes one or more of the following information:
the information of whether the IP addresses used by the equipment are the same, the information of whether the wireless networks connected with the equipment are the same, the information of whether the base stations connected with the equipment are the same, the information of whether the applications installed on the equipment meet the preset application similar conditions, the information of whether the behavior data of the equipment meet the preset behavior data similar conditions, the comparison result of software parameters among the equipment and the comparison result of hardware parameters among the equipment.
Optionally, the apparatus further comprises:
the model training module 35 is adapted to mark whether the behavior of each sample device is suspicious according to the own characteristic information of the device behavior dimension of the sample device to obtain a device behavior marking result;
and training the second machine learning model according to the own characteristic information of the equipment behavior dimension of the sample equipment and the equipment behavior marking result.
Optionally, the model training module 35 is further adapted to:
whether the two sample devices are associated or not is marked according to the associated characteristic information between every two sample devices, and an associated device marking result is obtained;
and training the first machine learning model according to the associated characteristic information between the two sample devices and the associated device marking result.
Embodiments of the present invention provide a non-volatile computer storage medium, where at least one executable instruction is stored in the computer storage medium, and the computer executable instruction may execute the method for generating a device portrait in any of the above method embodiments.
Fig. 4 is a schematic structural diagram of a computing device according to an embodiment of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the computing device.
As shown in FIG. 4, the computing device may include: a processor 402, a communication interface 404, a memory 406, and a communication bus 408.
Wherein:
the processor 402, communication interface 404, and memory 406 communicate with each other via a communication bus 408.
The communication interface 404 is used for communicating with network elements of other devices, such as clients or other servers.
The processor 402 is configured to execute the program 410, and may specifically execute relevant steps in the above-described method for generating a device representation.
In particular, program 410 may include program code comprising computer operating instructions.
The processor 402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The computing device includes one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 406 is used for storing the program 410. The memory 406 may comprise high-speed RAM and may also include non-volatile memory, such as at least one disk memory.
The program 410 may be specifically configured to cause the processor 402 to perform the device portrait generation method in any of the method embodiments described above. For the specific implementation of each step in the program 410, reference may be made to the corresponding steps and the corresponding descriptions of the units in the above embodiments of the device portrait generation method, which are not repeated here.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described devices and modules may refer to the corresponding process descriptions in the foregoing method embodiments, and are not described herein again.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components in accordance with embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering. These words may be interpreted as names.

Claims (16)

1. A method of device representation generation, the method comprising:
extracting characteristic information of equipment to be predicted; wherein the feature information of the device to be predicted comprises: associated feature information between devices;
according to the associated feature information between the devices, searching for similar devices of the device to be predicted through a first machine learning model;
obtaining second label information of the equipment to be predicted according to the label information of the similar equipment;
and generating a device portrait of the device to be predicted according to the second label information.
2. The method according to claim 1, wherein the feature information of the device to be predicted comprises: self characteristic information of the equipment to be predicted;
the method further comprises: analyzing the own characteristic information of the equipment to be predicted from multiple dimensions respectively to obtain own characteristic analysis results of the dimensions;
obtaining first label information of each dimension of the equipment to be predicted according to the own feature analysis result of each dimension;
the generating of the device portrait of the device to be predicted according to the second tag information is specifically as follows: and generating a device portrait of the device to be predicted according to the first label information and the second label information of each dimension of the device to be predicted.
3. The method according to claim 2, wherein the plurality of dimensions specifically comprises:
a device geolocation dimension, an application installation situation dimension, a device behavior dimension, and/or a device security dimension;
then, the analyzing the own feature information of the device to be predicted from the multiple dimensions respectively to obtain the own feature analysis result of each dimension specifically includes:
analyzing the information of the IP address used by the equipment to be predicted, the information of the wireless network connected with the equipment to be predicted and the information of the base station connected with the equipment to be predicted to obtain the own characteristic analysis result of the geographical position dimension;
and/or analyzing the application installed by the equipment to be predicted to obtain the own characteristic analysis result of the dimensionality of the application installation condition;
and/or analyzing the hardware parameter information of the equipment to be predicted to obtain a self-characteristic analysis result of the security dimension;
and/or analyzing, through a second machine learning model, information on whether an application that does not meet a preset security condition is installed on the device to be predicted, startup time information of the device to be predicted, whether the device to be predicted has been jailbroken, and/or whether an emulator is installed on the device to be predicted, to obtain the own feature analysis result of the device behavior dimension.
4. The method of claim 3, wherein analyzing the hardware parameter information of the device to be predicted to obtain the own feature analysis result of the security dimension further comprises:
and respectively analyzing IMEI information, MAC address information, state information of an SIM card, electric quantity information, residual disk space information, total disk space information, resolution information and/or information of whether the brand and the model are matched with each other of the equipment to be predicted to obtain a self-characteristic analysis result of whether the hardware parameter of the equipment to be predicted is forged or not.
5. The method of claim 1, wherein the association characteristic information between the devices comprises one or more of the following information:
the information of whether the IP addresses used by the equipment are the same, the information of whether the wireless networks connected with the equipment are the same, the information of whether the base stations connected with the equipment are the same, the information of whether the applications installed on the equipment meet the preset application similar conditions, the information of whether the behavior data of the equipment meet the preset behavior data similar conditions, the comparison result of software parameters among the equipment and the comparison result of hardware parameters among the equipment.
6. The method of claim 3, further comprising:
whether the behavior of the sample equipment is suspicious is marked according to the own characteristic information of the equipment behavior dimension of each sample equipment, and an equipment behavior marking result is obtained;
and training the second machine learning model according to the self-characteristic information of the equipment behavior dimension of the sample equipment and the equipment behavior marking result.
7. The method of claim 1, further comprising:
whether the two sample devices are associated or not is marked according to the associated characteristic information between every two sample devices, and an associated device marking result is obtained;
and training the first machine learning model according to the associated characteristic information between the two sample devices and the associated device marking result.
8. An apparatus for generating a device representation, the apparatus comprising:
the characteristic extraction module is suitable for extracting the characteristic information of the equipment to be predicted; wherein the feature information of the device to be predicted comprises: associated feature information between devices;
the analysis module is suitable for searching similar equipment of the equipment to be predicted through a first machine learning model according to the associated characteristic information among the equipment;
the label information determining module is suitable for obtaining second label information of the equipment to be predicted according to the label information of the similar equipment;
and the equipment portrait generation module is suitable for generating an equipment portrait of the equipment to be predicted according to the second label information.
9. The apparatus of claim 8, wherein the feature information of the device to be predicted comprises: self characteristic information of the equipment to be predicted;
the analysis module is further adapted to:
analyzing the own characteristic information of the equipment to be predicted from multiple dimensions respectively to obtain own characteristic analysis results of the dimensions;
the tag information determination module is further adapted to: obtaining first label information of each dimension of the equipment to be predicted according to the own feature analysis result of each dimension;
the device representation generation module is further adapted to: and generating a device portrait of the device to be predicted according to the first label information and the second label information of each dimension of the device to be predicted.
10. The apparatus of claim 9, wherein the plurality of dimensions specifically comprise:
a device geolocation dimension, an application installation situation dimension, a device behavior dimension, and/or a device security dimension;
the analysis module is further adapted to:
analyzing the information of the IP address used by the equipment to be predicted, the information of the wireless network connected with the equipment to be predicted and the information of the base station connected with the equipment to be predicted to obtain the own characteristic analysis result of the geographical position dimension;
and/or analyzing the application installed by the equipment to be predicted to obtain the own characteristic analysis result of the dimensionality of the application installation condition;
and/or analyzing the hardware parameter information of the equipment to be predicted to obtain a self-characteristic analysis result of the security dimension;
and/or analyzing, through a second machine learning model, information on whether an application that does not meet a preset security condition is installed on the device to be predicted, startup time information of the device to be predicted, whether the device to be predicted has been jailbroken, and/or whether an emulator is installed on the device to be predicted, to obtain the own feature analysis result of the device behavior dimension.
11. The apparatus of claim 10, wherein the analysis module is further adapted to:
and respectively analyzing IMEI information, MAC address information, state information of an SIM card, electric quantity information, residual disk space information, total disk space information, resolution information and/or information of whether the brand and the model are matched with each other of the equipment to be predicted to obtain a self-characteristic analysis result of whether the hardware parameter of the equipment to be predicted is forged or not.
12. The apparatus of claim 8, wherein the associated feature information between the devices comprises one or more of the following:
the information of whether the IP addresses used by the equipment are the same, the information of whether the wireless networks connected with the equipment are the same, the information of whether the base stations connected with the equipment are the same, the information of whether the applications installed on the equipment meet the preset application similar conditions, the information of whether the behavior data of the equipment meet the preset behavior data similar conditions, the comparison result of software parameters among the equipment and the comparison result of hardware parameters among the equipment.
13. The apparatus of claim 11, further comprising:
a model training module adapted to label, according to the self-characteristic information of the device behavior dimension of each sample device, whether the behavior of each sample device is suspicious, to obtain a device behavior labeling result;
and to train the second machine learning model according to the self-characteristic information of the device behavior dimension of the sample devices and the device behavior labeling result.
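Claim 13 describes a label-then-train loop for the second machine learning model. The claims do not name a model type or a labeling rule, so the sketch below substitutes a plain logistic regression fitted by batch gradient descent and a hypothetical rule ("two or more risk signals means suspicious"). The four binary features (unsafe application installed, abnormal boot time, jailbroken, emulator present) are an assumed encoding of the behavior-dimension information.

```python
import math
from itertools import product


def label_behavior(features: list) -> int:
    """Hypothetical labelling rule: two or more risk signals mark a device as suspicious."""
    return int(sum(features) >= 2)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples: list, labels: list, epochs: int = 500, lr: float = 0.5):
    """Fit weights and bias with batch gradient descent on the log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_w = [g + error * v for g, v in zip(grad_w, x)]
            grad_b += error
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias


def predict(model, x: list) -> float:
    """Estimated probability that the device behavior is suspicious."""
    weights, bias = model
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


# Sample devices: every combination of the four assumed binary signals
# [unsafe_app_installed, abnormal_boot_time, jailbroken, emulator_present].
samples = [list(bits) for bits in product((0, 1), repeat=4)]
labels = [label_behavior(s) for s in samples]
model = train_logistic(samples, labels)
```

At prediction time, `predict(model, features)` would play the role of the second model's analysis of a device to be predicted. In practice the labels would come from human review or known fraud cases rather than a fixed rule.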
14. The apparatus of claim 8, wherein the model training module is further adapted to:
label, according to the associated feature information between every two sample devices, whether the two sample devices are associated, to obtain an associated device labeling result;
and train the first machine learning model according to the associated feature information between every two sample devices and the associated device labeling result.
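Claim 14 trains the first machine learning model on pairs of sample devices rather than on single devices. As a hedged illustration, the sketch below enumerates every pair, labels it with a hypothetical rule (at least two of three network attributes match), and fits a small Bernoulli naive Bayes classifier with Laplace smoothing. The classifier, the rule, the three features and all names are assumptions; the claims do not specify any of them.

```python
import math
from itertools import combinations

FEATURE_KEYS = ("ip", "wifi", "base_station")  # assumed subset of the associated features


def build_pairs(devices: dict):
    """Enumerate every pair of sample devices and label it associated (1) or not (0)."""
    rows, labels = [], []
    for id_a, id_b in combinations(sorted(devices), 2):
        row = [int(devices[id_a][k] == devices[id_b][k]) for k in FEATURE_KEYS]
        rows.append(row)
        labels.append(int(sum(row) >= 2))  # hypothetical labelling rule
    return rows, labels


def train_naive_bayes(rows: list, labels: list) -> dict:
    """Bernoulli naive Bayes: per class, a log-prior and smoothed P(feature = 1)."""
    model = {}
    for cls in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == cls]
        prior = (len(members) + 1) / (len(rows) + 2)
        if members:
            probs = [(sum(col) + 1) / (len(members) + 2) for col in zip(*members)]
        else:
            probs = [0.5] * len(rows[0])
        model[cls] = (math.log(prior), probs)
    return model


def is_associated(model: dict, row: list) -> bool:
    """Pick the class with the higher posterior log-score."""
    def score(cls: int) -> float:
        log_prior, probs = model[cls]
        return log_prior + sum(math.log(p if v else 1 - p) for p, v in zip(probs, row))
    return score(1) > score(0)


# Four illustrative sample devices: A/B and C/D share network attributes.
devices = {
    "A": {"ip": "10.0.0.1", "wifi": "home-1", "base_station": "cell-1"},
    "B": {"ip": "10.0.0.1", "wifi": "home-1", "base_station": "cell-1"},
    "C": {"ip": "10.0.0.9", "wifi": "cafe-7", "base_station": "cell-4"},
    "D": {"ip": "10.0.0.9", "wifi": "cafe-7", "base_station": "cell-5"},
}
rows, labels = build_pairs(devices)
pair_model = train_naive_bayes(rows, labels)
```

Note that the number of pairs grows quadratically with the number of sample devices, so a real pipeline would likely restrict pairing to candidates that already share at least one attribute.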
15. A computing device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;
the memory is configured to store at least one executable instruction that causes the processor to perform operations corresponding to the device portrait generation method of any one of claims 1 to 7.
16. A computer storage medium having stored therein at least one executable instruction that causes a processor to perform operations corresponding to the device portrait generation method of any one of claims 1 to 7.
CN201810818733.4A 2018-07-24 2018-07-24 Equipment portrait generation method and device and computing equipment Active CN109086377B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810818733.4A CN109086377B (en) 2018-07-24 2018-07-24 Equipment portrait generation method and device and computing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810818733.4A CN109086377B (en) 2018-07-24 2018-07-24 Equipment portrait generation method and device and computing equipment

Publications (2)

Publication Number Publication Date
CN109086377A CN109086377A (en) 2018-12-25
CN109086377B true CN109086377B (en) 2021-02-02

Family

ID=64838310

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810818733.4A Active CN109086377B (en) 2018-07-24 2018-07-24 Equipment portrait generation method and device and computing equipment

Country Status (1)

Country Link
CN (1) CN109086377B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109857831A (en) * 2019-02-20 2019-06-07 云南电网有限责任公司信息中心 A kind of power equipment portrait label system building method based on big data technology
CN109919219B (en) * 2019-03-01 2021-02-26 北京邮电大学 Xgboost multi-view portrait construction method based on kernel computing ML-kNN
CN112118256B (en) * 2020-09-17 2023-03-24 浙江齐安信息科技有限公司 Industrial control equipment fingerprint normalization method and device, computer equipment and storage medium
WO2022096011A1 (en) * 2020-11-09 2022-05-12 华为云计算技术有限公司 Method and apparatus for accessing internet of things device
CN112100506B (en) * 2020-11-10 2021-03-16 中国电力科学研究院有限公司 Information pushing method, system, equipment and storage medium
CN112364008A (en) * 2020-11-20 2021-02-12 国网江苏省电力有限公司营销服务中心 Equipment portrait construction method for intelligent terminal of power internet of things
CN112465565B (en) * 2020-12-11 2023-09-26 加和(北京)信息科技有限公司 User portrait prediction method and device based on machine learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106933946A (en) * 2017-01-20 2017-07-07 深圳市三体科技有限公司 A kind of big data management method and system based on mobile terminal
CN107807997A (en) * 2017-11-08 2018-03-16 北京奇虎科技有限公司 User's portrait building method, device and computing device based on big data

Also Published As

Publication number Publication date
CN109086377A (en) 2018-12-25

Similar Documents

Publication Publication Date Title
CN109086377B (en) Equipment portrait generation method and device and computing equipment
CN111401416B (en) Abnormal website identification method and device and abnormal countermeasure identification method
TWI743773B (en) Method and device for identifying abnormal collection behavior based on privacy data protection
CN108427731B (en) Page code processing method and device, terminal equipment and medium
US20190188729A1 (en) System and method for detecting counterfeit product based on deep learning
CN111163072B (en) Method and device for determining characteristic value in machine learning model and electronic equipment
CN107798001B (en) Webpage processing method, device and equipment
WO2015081720A1 (en) Instant messaging (im) based information recommendation method, apparatus, and terminal
US8706572B1 (en) Generating product image maps
CN106202101B (en) Advertisement identification method and device
CN108961019B (en) User account detection method and device
CN105306495A (en) User identification method and device
CN112328802A (en) Data processing method and device and server
CN112148305A (en) Application detection method and device, computer equipment and readable storage medium
CN108512822B (en) Risk identification method and device for data processing event
CN112749988A (en) Electronic ticket content display method and device
US20160124580A1 (en) Method and system for providing content with a user interface
CN109582834B (en) Data risk prediction method and device
CN108268545B (en) Method and device for establishing hierarchical user label library
CN105162799A (en) Method for checking whether client is legal mobile terminal or not and server
CN108255888B (en) Data processing method and system
CN113742559A (en) Keyword detection method and device, electronic equipment and storage medium
CN109359462B (en) Virtual standby identification method, equipment, storage medium and device
CN113297358A (en) Data processing method, device, server and computer readable storage medium
CN104965853A (en) Method and system for recommending aggregation application, method and device for aggregating various recommendation resources

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201217

Address after: 4f, building C2, Suzhou 2.5 Industrial Park, 88 Dongchang Road, Suzhou Industrial Park, Jiangsu Province, 215000

Applicant after: JIANGSU PAY EGIS TECHNOLOGY Co.,Ltd.

Applicant after: Beijing tongfudun Artificial Intelligence Technology Co., Ltd

Address before: Room 3f-301, building C2, Suzhou 2.5 Industrial Park, 88 Dongchang Road, Suzhou Industrial Park, Jiangsu Province

Applicant before: JIANGSU PAY EGIS TECHNOLOGY Co.,Ltd.

GR01 Patent grant