CN114417944B - Recognition model training method and device, and user abnormal behavior recognition method and device - Google Patents

Publication number
CN114417944B
CN114417944B (application CN202011070544.7A)
Authority
CN
China
Prior art keywords
user
features
behavior
feature
data
Prior art date
Legal status
Active
Application number
CN202011070544.7A
Other languages
Chinese (zh)
Other versions
CN114417944A (en)
Inventor
樊鹏
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202011070544.7A priority Critical patent/CN114417944B/en
Publication of CN114417944A publication Critical patent/CN114417944A/en
Application granted granted Critical
Publication of CN114417944B publication Critical patent/CN114417944B/en

Classifications

    • G06F 18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/253 — Pattern recognition; fusion techniques applied to extracted features
    • G06N 20/20 — Machine learning; ensemble learning
    • G06N 3/045 — Neural networks; combinations of networks
    • G06N 3/08 — Neural networks; learning methods

Abstract

The disclosure relates to an artificial-intelligence-based recognition model training method and device, and to a user abnormal-behavior recognition method and device, and in particular to machine learning within artificial intelligence. The method for training the recognition model comprises the following steps: receiving at least two items of user data for at least two users, including data related to abnormal user behavior; extracting at least two user features from the at least two items of user data; performing at least one of weighting a first user feature of the at least two user features, or associating a second user feature of the at least two user features with a third user feature of the at least two user features; combining the at least two user features after the weighting and/or the association; and performing deep learning on the combined at least two user features.

Description

Recognition model training method and device, and user abnormal behavior recognition method and device
Technical Field
The present disclosure relates generally to the field of machine learning in artificial intelligence, and in particular, to an artificial intelligence-based recognition model training method and apparatus, and a user abnormal behavior recognition method and apparatus.
Background
In general, an electronic device may establish a connection with a cellular network or with a wireless local area network (LAN) to communicate. The wireless LAN to which the device connects may be public or private, and the security of a public LAN is generally lower than that of a private LAN. Accordingly, there are WiFi-manager applications (apps) that support billions of public LANs and can comprehensively assess their security, connection speed, network quality, and the like against high standards, helping ensure that users do not connect to zombie, risky, or fake WiFi hotspots.
To increase users' stickiness to the product, such applications often include user-incentive mechanisms. A user can complete point-earning tasks set in the application to accumulate points and redeem them for rewards. However, some users maliciously harvest points through abnormal behavior. For example, a user may use technical means to control many electronic devices simultaneously and complete the in-app point tasks automatically and in batches, thereby maliciously harvesting points.
To identify abnormal user behavior, two families of methods exist: identification based on predetermined rules and identification based on data mining. In rule-based identification, whether the current user behavior is abnormal is determined from data rules defined manually in advance. In data-mining-based identification, a non-deep-learning data-mining method is used: multidimensional features are constructed and a model is trained to determine whether the current user behavior is abnormal.
However, rule-based identification not only uses a very limited number of rules, it also cannot capture high-dimensional feature information arising from interactions between rules, and, most importantly, it cannot determine the optimal parameters of each rule. Although non-deep-learning data mining can address some of these shortcomings, analysis shows that in scenarios such as point earning, user behavior features are complex and the relevant feature information is difficult to surface through data characterization. Traditional feature construction is therefore both ineffective and very time-consuming, slowing model iteration. The root cause is that traditional feature-characterization methods and non-deep-learning models are ill-suited to these scenarios.
Disclosure of Invention
In view of this, embodiments of the present disclosure provide an artificial-intelligence-based recognition model training method and device, and a user abnormal-behavior recognition method and device.
According to a first aspect of the present disclosure, there is provided a method for training a recognition model for recognizing abnormal user behavior, the method comprising: receiving at least two items of user data for at least two users, the user data comprising data related to abnormal user behavior; extracting at least two user features from the at least two items of user data; performing at least one of weighting a first user feature of the at least two user features, or associating a second user feature of the at least two user features with a third user feature of the at least two user features; combining the at least two user features after the at least one of the weighting or the associating; and performing deep learning on the combined at least two user features.
According to a second aspect of the present disclosure, there is provided a method for identifying abnormal user behavior, comprising: acquiring one or more current user behaviors, the one or more current user behaviors being behaviors of a user when accessing an application; extracting one or more current user behavior features from the one or more current user behaviors; generating a user behavior feature vector based on the one or more current user behavior features; and inputting the user behavior feature vector into a user abnormal-behavior recognition model to recognize the abnormal user behavior.
According to a third aspect of the present disclosure, there is provided an apparatus for training a recognition model for recognizing abnormal behavior of a user, the apparatus comprising: a user data management module configured to receive at least two user data of at least two users, the at least two user data including data related to user abnormal behavior; a user feature processing module configured to extract at least two user features from at least two user data; and a model training module configured to perform at least one of weighting a first user feature of the at least two user features, associating a second user feature of the at least two user features with a third user feature of the at least two user features; combining at least two user features of at least one of the weighted or associated features; and deep learning the combined at least two user features.
In some embodiments, the at least two items of user data are user data of at least two users of the application, and the user feature processing module is specifically configured to extract the at least two user features from the at least two items of user data by: extracting one or more user attribute features from data related to user attributes of a first user of the application; extracting one or more user business-behavior features from data related to the first user's behavior in accessing the application; extracting one or more user non-business-behavior features from data related to the first user's behavior outside the application; and performing feature processing on the extracted user attribute features, user business-behavior features, and user non-business-behavior features to increase feature density. The extracted features include embedded features and non-embedded features.
In some embodiments, the apparatus for training the recognition model further comprises a sample generation module configured to generate, from the at least two user features, at least two samples for training the recognition model by: splicing the one or more user attribute features of the first user; splicing the one or more user business-behavior features of the first user; splicing the one or more user non-business-behavior features of the first user; and splicing the spliced user attribute features, the spliced user business-behavior features, and the spliced user non-business-behavior features to form sample data of the first user.
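The "splicing" described in this embodiment is feature concatenation. A minimal numpy sketch of forming one sample from the three feature groups; the group contents, encodings, and sizes are hypothetical, not specified by the patent:

```python
import numpy as np

def build_sample(attr_feats, business_feats, non_business_feats):
    """Splice each feature group, then splice the groups into one sample vector."""
    attr = np.concatenate(attr_feats)                  # user attribute features
    business = np.concatenate(business_feats)          # in-app (business) behavior features
    non_business = np.concatenate(non_business_feats)  # out-of-app behavior features
    return np.concatenate([attr, business, non_business])

# Hypothetical encoded features for one user.
sample = build_sample(
    [np.array([0.1, 0.2])],                   # e.g. encoded age bucket, region
    [np.array([3.0]), np.array([0.0, 1.0])],  # e.g. tasks/day, task-type one-hot
    [np.array([12.5])],                       # e.g. hours since last non-task action
)
print(sample.shape)  # (6,)
```

The same concatenation order must be used at training and inference time so that each position in the sample vector keeps a fixed meaning.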
In some embodiments, the model training module is specifically configured to weight a first user feature of the at least two user features by: compressing a user feature set within the at least two user features using a pooling operation to obtain a statistical vector for the user feature set; obtaining a weight vector for the user feature set by machine learning based on the statistical vector; and weighting at least the first user feature of the user feature set based on the weight vector, the weighting comprising multiplying the user feature set by the weight vector.
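The pool-then-reweight scheme resembles the SENET (squeeze-and-excitation) layer used in FiBiNET-style models. A minimal numpy sketch of the mechanism only: the shapes are made up, and the projection matrices are randomly initialized rather than learned:

```python
import numpy as np

rng = np.random.default_rng(0)

def senet_weight(feature_embeddings, w1, w2):
    """Weight a set of user feature embeddings, SENET-style.

    feature_embeddings: (num_fields, embed_dim) array, one row per user feature.
    """
    # Squeeze: pool each feature embedding down to one statistic per field.
    stats = feature_embeddings.mean(axis=1)            # (num_fields,)
    # Excitation: two small projections produce one weight per field.
    hidden = np.maximum(0.0, stats @ w1)               # ReLU
    weights = 1.0 / (1.0 + np.exp(-(hidden @ w2)))     # sigmoid, (num_fields,)
    # Re-weight: multiply each feature embedding by its importance weight.
    return feature_embeddings * weights[:, None]

num_fields, embed_dim, reduced = 4, 8, 2
w1 = rng.normal(size=(num_fields, reduced))  # learned in practice; random here
w2 = rng.normal(size=(reduced, num_fields))
emb = rng.normal(size=(num_fields, embed_dim))
weighted = senet_weight(emb, w1, w2)
print(weighted.shape)  # (4, 8)
```

Because the sigmoid keeps every weight in (0, 1), each feature embedding is scaled down in proportion to its learned importance rather than replaced.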
In some embodiments, the model training module is specifically configured to associate the second user feature with the third user feature by: associating the second user feature with the third user feature using a bilinear operation, the bilinear operation comprising a first linear operation and a second linear operation that differ from each other, the first linear operation being an inner-product calculation and the second linear operation being a Hadamard-product calculation. Associating the second user feature with the third user feature using the bilinear operation specifically comprises: calculating the inner product of the second user feature and a cross matrix by the first linear operation; and calculating the Hadamard product of that inner product and the third user feature by the second linear operation.
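The bilinear operation can be written directly: an inner product of one feature embedding with a cross matrix, followed by a Hadamard (element-wise) product with the other embedding. In the sketch below the embedding dimension is hypothetical and the cross matrix, which would be learned during training, is random for illustration:

```python
import numpy as np

def bilinear_interaction(v_i, v_j, W):
    """First linear operation: inner product of v_i with cross matrix W.
    Second linear operation: Hadamard product of the result with v_j."""
    return (v_i @ W) * v_j  # shape (embed_dim,)

rng = np.random.default_rng(1)
d = 8                                   # hypothetical embedding dimension
v_i, v_j = rng.normal(size=d), rng.normal(size=d)
W = rng.normal(size=(d, d))             # cross matrix (learned in practice)
p_ij = bilinear_interaction(v_i, v_j, W)
print(p_ij.shape)  # (8,)
```

Compared with a plain inner or Hadamard product, the cross matrix lets the model learn a different interaction strength for every pair of embedding dimensions.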
In some embodiments, the apparatus for training the recognition model further comprises: a model evaluation module configured to evaluate the trained recognition model online and/or offline based on one or more evaluation metrics. The model training module is specifically configured to adjust parameters of the recognition model based on the evaluation result, the adjustment specifically comprising: adjusting parameters of the recognition model based on the online and/or offline evaluation result until that result falls within a predetermined numerical range.
In some embodiments, the user data management module is specifically configured to receive at least two items of feedback from at least two users; and the model training module is specifically configured to adjust parameters of the recognition model based on the at least two items of feedback, which include user complaints and user inactivity.
In some embodiments, the recognition model includes a depth portion and a non-depth portion. Weighting a first user feature of the at least two user features is performed in the non-depth portion; the first user feature may be the same user feature as, or a different user feature from, the second user feature, and likewise for the third user feature, while the second and third user features are different user features from each other. Associating the second user feature with the third user feature is also performed in the non-depth portion. The recognition model includes a FiBiNET model.
According to a fourth aspect of the present disclosure, there is provided an apparatus for identifying user abnormal behavior, comprising: a user data management module configured to obtain one or more current user behaviors, the one or more current user behaviors being user behaviors of a user when accessing an application; a user feature processing module configured to extract one or more current user behavior features from one or more current user behaviors; a user feature vector generation module configured to generate a user behavior feature vector based on one or more current user behavior features; and a user behavior recognition module configured to input the user behavior feature vector to the user abnormal behavior recognition model to recognize the user abnormal behavior.
In some embodiments, the user feature vector generation module is specifically configured to generate the user behavior feature vector by: retrieving historical behavior characteristics and user attribute characteristics of a user; and splicing the current user behavior characteristics, the historical behavior characteristics and the user attribute characteristics to obtain a user behavior characteristic vector.
In some embodiments, the user behavior recognition module is specifically configured to recognize user abnormal behavior by: determining the probability that the current user behavior is abnormal; when the probability is greater than the threshold, determining the current user behavior as abnormal behavior; limiting measures are taken for users with abnormal behaviors.
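The recognition step in this embodiment reduces to a probability threshold. A minimal sketch; the 0.5 default below is illustrative, as the patent does not fix a threshold value:

```python
def classify_behavior(prob, threshold=0.5):
    """Flag the behavior as abnormal when the model's probability exceeds the threshold."""
    return "abnormal" if prob > threshold else "normal"

print(classify_behavior(0.91))  # abnormal
print(classify_behavior(0.12))  # normal
```

In practice the threshold trades off false positives (normal users restricted) against false negatives (point harvesters missed), so it would be tuned against the evaluation metrics mentioned above.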
According to a fifth aspect of the present disclosure, there is provided a computing device comprising a processor and a memory having instructions stored thereon which, when executed on the processor, cause the processor to perform the above-described method for training a recognition model or method for recognizing user abnormal behaviour.
According to a sixth aspect of the present disclosure, there is provided one or more computer-readable storage media having instructions stored thereon that, when executed on one or more processors, cause the one or more processors to perform the above-described method for training a recognition model or method for recognizing user abnormal behavior.
According to the technical solution of the present disclosure, during training of the recognition model, at least one of the following is performed: weighting a first user feature of the at least two user features, or associating a second user feature of the at least two user features with a third user feature of the at least two user features; the weighted and/or associated features are then combined, and deep learning is performed on the combined features. The model can thus better learn the importance of individual user features and the correlations between them, improving the trained model's recognition rate for abnormal user behavior (such as malicious point harvesting). Low-value bonus-hunting ("wool party") users can be filtered out more effectively and the activity of high-value users promoted, so that the application provider's computing and network resources serve normal users more effectively, reducing operating costs, improving operating efficiency, and improving product retention. In addition, since relatively more of the provider's computing and network resources become available to normal users, their user experience also improves.
These and other aspects of the disclosure will be apparent from and elucidated with reference to the embodiments described hereinafter.
Drawings
Further details, features, and advantages of the present disclosure are disclosed in the following description of exemplary embodiments, with reference to the following drawings, wherein:
FIG. 1 schematically illustrates an exemplary system for identifying user abnormal behavior in accordance with an embodiment of the present disclosure;
FIG. 2A schematically illustrates a flow chart of a method for training an identification model according to an embodiment of the present disclosure;
FIG. 2B schematically illustrates a flowchart of a method for identifying user abnormal behavior in accordance with an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart of a specific example of a method for identifying user abnormal behavior in accordance with an embodiment of the present disclosure;
FIG. 4 schematically illustrates an example interface of an application displayed on a display screen of an electronic device;
FIG. 5A schematically illustrates a flowchart of specific steps of a method for training an identification model according to an embodiment of the present disclosure;
FIG. 5B schematically illustrates a flowchart of specific steps of a method for identifying user abnormal behavior, in accordance with an embodiment of the present disclosure;
FIG. 6A schematically illustrates a schematic diagram of a specific operation of training a recognition model in accordance with an embodiment of the present disclosure;
FIG. 6B schematically illustrates a schematic diagram of bilinear operation according to an embodiment of the present disclosure;
FIG. 7 schematically illustrates a schematic diagram of a comparison of an identification method according to an embodiment of the present disclosure with an existing identification method;
FIG. 8A schematically illustrates a block diagram of an apparatus for training an identification model in accordance with an embodiment of the present disclosure;
FIG. 8B schematically illustrates a block diagram of an apparatus for identifying user abnormal behavior in accordance with an embodiment of the present disclosure; and
Fig. 9 schematically illustrates a block diagram of a computing device according to an embodiment of the disclosure.
Detailed Description
Before describing in detail embodiments of the present disclosure, some related concepts will be explained first.
Artificial intelligence (AI) is the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in ways similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, enabling them to perceive, reason, and make decisions.
Artificial intelligence is a comprehensive discipline spanning a wide range of fields, involving both hardware-level and software-level technologies. Basic AI technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big-data processing, operation/interaction systems, and mechatronics. AI software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Machine learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and more. It studies how computers can simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve their own performance. Machine learning is the core of artificial intelligence and the fundamental way to endow computers with intelligence; it is applied across all areas of AI. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
Deep learning (DL) is a research direction within machine learning, introduced to bring machine learning closer to its original goal: artificial intelligence. Deep learning learns the inherent regularities and representation hierarchies of sample data, and the information obtained during such learning helps interpret data such as text, images, and sound. Its ultimate goal is for machines to have human-like analytical learning capability and to recognize text, image, and sound data. Deep learning is a complex class of machine learning algorithms whose results in speech and image recognition far exceed those of earlier techniques. It has produced many results in search, data mining, machine learning, machine translation, natural language processing, multimedia learning, speech, recommendation, personalization, and related fields. Deep learning enables machines to imitate human activities such as seeing, hearing, and thinking, solves many complex pattern-recognition problems, and has greatly advanced artificial-intelligence technology.
User abnormal behavior, also referred to as abnormal user behavior, refers to unexpected user behavior that violates an application's usage rules, such as the malicious point-harvesting behavior described below. Such behavior generally increases the application provider's operating costs, reduces operating efficiency, and degrades the experience of normal users.
Embodiments of the present disclosure provide solutions related to techniques such as machine learning of artificial intelligence, and are specifically described by the following embodiments.
Fig. 1 schematically illustrates an exemplary system for identifying user abnormal behavior (i.e., an abnormal behavior identification system) 100 in which the various methods described herein may be implemented in accordance with embodiments of the present disclosure. As shown in fig. 1, the abnormal behavior recognition system 100 includes a recognition model training server 110, an abnormal behavior recognition server 111, a data storage server 112, one or more electronic devices 130, and optionally a network 120.
The recognition model training server 110, the abnormal behavior recognition server 111, and the data storage server 112 store and execute instructions that can perform the various methods described herein. Each may be a single server or a server cluster, and any two or all three of them may be the same server or the same server cluster. It should be understood that the servers referred to herein are typically server computers with large amounts of memory and processor resources, but other embodiments are also possible.
Examples of network 120 include a local area network (LAN), a wide area network (WAN), a personal area network (PAN), and/or a combination of communication networks such as the Internet. The recognition model training server 110, the abnormal behavior recognition server 111, and the data storage server 112 may each include at least one communication interface (not shown) capable of communicating over the network 120. Such a communication interface may be one or more of the following: any type of network interface (e.g., a network interface card (NIC)), a wired or wireless (such as IEEE 802.11 wireless LAN (WLAN)) interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth™ interface, a near field communication (NFC) interface, and so on.
Electronic device 130 may be any type of mobile computing device, including a mobile computer or mobile computing device (e.g., a personal digital assistant (PDA), a laptop computer, a notebook computer, a tablet computer such as an Apple iPad™, a netbook, etc.), a mobile telephone (e.g., a cellular telephone, a smartphone such as a Microsoft® phone, an Apple iPhone, or a phone running the Android™ operating system, etc.), a wearable computing device (e.g., a smart watch, a headset, or smart glasses such as Google Glass™), or another type of mobile device. Electronic device 130 may also be any type of stationary computing device, such as a desktop computer. Further, at least two of the electronic devices 130 may be the same or different types of computing devices.
The electronic device 130 may include a display screen 131 and an application 132 that may interact with a user via the display screen 131. Electronic device 130 may interact with, e.g., send data to or receive data from, one or more of the recognition model training server 110, the abnormal behavior recognition server 111, and the data storage server 112, e.g., via network 120. The application 132 may be a native application, a Web application, or an applet (lite app) as a lightweight application. In the case where the application 132 is a native application that needs to be installed, the application 132 may be installed in the electronic device 130. In the case where the application 132 is a Web application, the application 132 may be accessed through a browser. In the case where the application 132 is an applet, the application 132 may be opened directly on the electronic device 130, without installation, by searching for relevant information about the application 132 (e.g., its name), scanning a graphical code of the application 132 (e.g., a bar code or two-dimensional code), and so on.
It should be appreciated that while the recognition model training server 110, the abnormal behavior recognition server 111, the data storage server 112, and the electronic device 130 are shown and described herein as separate structures, they may be different components of the same computing device. For example, the recognition model training server 110, the abnormal behavior recognition server 111, the data storage server 112 may provide background computing functionality, while the electronic device 130 may provide foreground display functionality.
Fig. 2A schematically illustrates a flowchart of a method 200 for training a recognition model for recognizing user abnormal behavior, in accordance with an embodiment of the present disclosure. The method 200 for training the recognition model may be implemented on the recognition model training server 110 or on the electronic device 130. For example, in one embodiment, the method 200 for training the recognition model is implemented on the recognition model training server 110 and sent to the abnormal behavior recognition server 111 and/or the electronic device 130 after the recognition model training is completed in order to recognize the user's abnormal behavior. In another embodiment, method 200 for training the recognition model is implemented on electronic device 130 and recognizing the user's abnormal behavior is implemented locally on electronic device 130, at which point recognition model training server 110 and network 120 shown in FIG. 1 may not be required. In yet another embodiment, the method 200 for training the recognition model is implemented on one electronic device 130 and transmitted to another electronic device 130 after recognition model training is completed in order to identify the user's abnormal behavior. In further embodiments, the method 200 for training the recognition model may also be performed by the recognition model training server 110 and the electronic device 130 in combination. The implementation location of training of the recognition model is not exhaustive here.
At step 201, at least two user data for at least two users are received, the at least two user data comprising data related to user abnormal behavior. In some embodiments, the at least two user data are user data of at least two users of the application 132.
At step 202, at least two user features are extracted from at least two user data. The extracted features include embedded features and non-embedded features.
At step 203, at least one of the following is performed: weighting a first user feature of the at least two user features; or associating a second user feature of the at least two user features with a third user feature of the at least two user features.
At step 204, the at least two user features, on which at least one of the weighting or the associating has been performed, are combined.
At step 205, deep learning is performed on the combined at least two user features.
Fig. 2B schematically illustrates a flowchart of a method 210 for identifying user abnormal behavior in accordance with an embodiment of the present disclosure. The method 210 for identifying user abnormal behavior may be implemented on the abnormal behavior identification server 111 or on the electronic device 130. The possible implementation locations of abnormal behavior recognition are not exhaustively enumerated here.
At step 211, one or more current user behaviors are obtained. In some embodiments, the one or more current user behaviors are user behaviors of the application 132 when the user accesses the application 132.
At step 212, one or more current user behavior features are extracted from the one or more current user behaviors.
At step 213, a user behavior feature vector is generated based on one or more current user behavior features using feature stitching.
At step 214, the user behavior feature vector is input to the user abnormal behavior recognition model to recognize the user abnormal behavior.
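As a rough illustration of steps 211-214, the following Python sketch stitches per-behavior feature vectors into a single user behavior feature vector and scores it with a stand-in model. The `toy_model` function, its weights, and the 0.5 threshold are hypothetical placeholders for illustration only, not the trained recognition model of this disclosure.

```python
import numpy as np

def stitch_features(feature_list):
    """Concatenate per-behavior feature vectors into one user behavior feature vector."""
    return np.concatenate([np.asarray(f, dtype=float) for f in feature_list])

def recognize(model, behavior_features, threshold=0.5):
    """Score the stitched vector with a trained model; above-threshold means abnormal."""
    x = stitch_features(behavior_features)
    return model(x) >= threshold

# Hypothetical stand-in for a trained recognition model: a fixed logistic scorer.
def toy_model(x, b=-1.0):
    w = np.full_like(x, 0.1)
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

# Three current behaviors, each already turned into a small feature vector.
flagged = recognize(toy_model, [[3.0, 0.0], [5.0], [1.0, 1.0, 2.0]])
```

Here the stitched vector is [3, 0, 5, 1, 1, 2]; the toy scorer marks it as abnormal.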
Fig. 3 schematically illustrates a flow chart of a specific example of a method 300 for identifying user abnormal behavior in accordance with an embodiment of the disclosure. The following description, in connection with fig. 4, fig. 5A-5B, and fig. 6A-6B, takes as an example that the method steps 301-306 for training the recognition model are implemented on the recognition model training server 110 and the method steps 307-310 for recognizing user abnormal behavior are implemented on the abnormal behavior recognition server 111, and that the application 132 is a WiFi housekeeping application and the user abnormal behavior is malicious points collection by the user. It should be appreciated that all method steps 301-311 may also be implemented on the abnormal behavior recognition server 111 or on the electronic device 130.
FIG. 4 schematically illustrates an example interface 400 of an application 132 displayed on a display screen 131 of an electronic device 130; FIG. 5A schematically illustrates a flowchart of specific steps 510-530 of a method for training an identification model according to an embodiment of the present disclosure; FIG. 5B schematically illustrates a flowchart of specific steps 540-550 of a method for identifying user abnormal behavior, in accordance with an embodiment of the present disclosure; FIG. 6A schematically illustrates a schematic diagram of a specific operation of training a recognition model in accordance with an embodiment of the present disclosure; fig. 6B schematically illustrates a schematic diagram of bilinear operation according to an embodiment of the present disclosure.
It should be understood that the step numbers used herein are merely used to distinguish between different steps and are not intended to indicate that the steps need to be performed in the numbered order. Conversely, steps may be performed in an order different from the numbering order, certain steps may be performed in parallel, and certain steps may be performed repeatedly.
The method 300 may provide a configuration interface that may provide interactions for the purpose of configuring parameters. For example, entities or individuals, including but not limited to recognition model providers, abnormal behavior recognition providers, providers of applications, and the like, may configure various parameters through a configuration interface. In some embodiments, the configuration interface may be implemented as an interactive interface that includes options or input boxes.
At step 301, the recognition model training server 110 may receive at least two user data for at least two users. The at least two user data includes data related to user abnormal behavior. In some embodiments, the at least two user data are user data of at least two users of the application 132. To facilitate understanding of the technical solutions of the present disclosure, the abnormal behavior of the user will be described in detail herein with reference to the example interface 400 of the application 132 shown in fig. 4, taking the malicious point acquisition behavior of the user as an example.
As shown in fig. 4, in the upper left corner of the interface 400, user ID or nickname information 401, such as "WiFi little man" may be displayed. A button 402 may also be displayed on the interface 400, and a prompt "check-in collar score" may be displayed on the button 402 to prompt the user that the button may be clicked for daily check-ins. After the user has checked in, a "checked in" may be displayed on this button 402. In the upper right corner of the interface 400, a setup button may also be displayed that the user may click on to set the application 132, including setting personal information of the user, information of other applications (apps) associated, home WiFi information of the user, and so forth. At least two function buttons 403 may also be displayed on the interface 400, such as "game acceleration", "security physical examination", "game gift bag", "application acceleration", "my home WiFi", which buttons 403 may be clicked by the user to jump the interface to the corresponding function interface.
On interface 400, a "WiFi-specific benefit" tab 404 may also be displayed to prompt the user to collect points by performing various points tasks. For example, a user may collect 5 points by connecting to WiFi daily, 2 points by accelerating the currently connected WiFi, and 5 points by reading two news items. Specifically, as prompted by the "welfare upgrade" tab 405, the user may collect 2 points by viewing other recommended applications (apps), which may be done 2 times a day; the user may collect 2 points by watching a short video, which may be done 5 times a day. On the interface 400, information on the number of points currently owned by the user may also be displayed, for example, the user currently owns 70 points.
In some embodiments, the points, such as "WiFi coins," may be exchanged with the provider of the application 132 in exchange for virtual and/or physical rewards. For example, as prompted by the "good gift time redemption" tab 406, the user may exchange (point redemption) with the provider of the application 132 using the 1250 points owned to obtain monthly members for other applications (e.g., video applications) provided by the provider of the application 132.
In some embodiments, buttons such as "My," "Wifi," "found," etc., are displayed at the lower portion of the interface, and the user may click on these buttons to toggle the interface. The My button is highlighted to indicate that it is the currently selected button.
As described above, daily check-ins and the performance of various points tasks by users help increase the daily number of active users, user online time, and data traffic of the application 132; the user is in turn rewarded with virtual and/or physical rewards by way of redemption. However, some users operate multiple electronic devices simultaneously by technical means, performing the daily check-ins and completing the various points tasks in batches so as to rapidly collect points. Such user behavior may be referred to as "malicious points collection", which is unexpected behavior in using the application 132 that does not conform to the usage rules of the application (i.e., user abnormal behavior). It prevents the computing resources and network resources of the application's provider from being concentrated on normal users, increases the operation cost of the provider of the application 132, reduces operation efficiency, and degrades the experience of normal users. The embodiments of the disclosure provide a method and a device for identifying user abnormal behavior, together with a training method and a device for the related model, which apply a deep-learning modeling scheme to the bonus-hunting ("wool-pulling") user identification scenario of malicious points collection, and can quickly build the accurate, effective, and highly reusable automatic identification system for malicious points collection (automatic bonus-hunting user identification system) required by a client product.
Returning to step 301, in some embodiments, the recognition model training server 110 may obtain the at least two user data from the data storage server 112. In further embodiments, the recognition model training server 110 may also obtain the at least two user data directly from the electronic device 130. In some embodiments, user data may be accumulated in the data storage server 112 as shown in fig. 5A, 5B, and then the recognition model training server 110 may obtain the accumulated user data in step 301.
For example, as shown in FIG. 5A, in raw data accumulation 510, real-time user online request log data may be stored to the data storage server 112 (step 511), such as a Hadoop distributed file system (Hadoop Distributed File System, HDFS). In view of storage costs and subsequent computational efficiency, key log information may be extracted based on Hive SQL (Structured Query Language, SQL), and redundant data discarded (step 512).
As another example, as shown in fig. 5B, in offline section 550, the accumulated data may include user portrait data and user behavioral feedback on model predictions, such as the data stored in correlation table_1, correlation table_2, ..., correlation table_n. For example, rich user portrait data may be constructed based on user historical behavior data, including: user base attributes, device base attributes, network connection attributes, the user's historical points-collection preferences, behavior trace preferences, software usage preferences, and the like. The data accumulation may also include a store of the user's historical points-collection behavior data (step 551). As another example, the feedback that users have historically given on each "malicious points collection" decision may be stored on the HDFS. For example, feedback on the "malicious points collection" decisions output by the recognition model in different time windows (the last half year, the last three months, the last month, the last week, the last 3 days) and in different time periods (rest time, active time, weekends, workdays, etc.) may be periodically aggregated offline.
Returning to FIG. 3, at step 302, the recognition model training server 110 may extract at least two user features from at least two user data. In some embodiments, the recognition model training server 110 may extract at least two user features from at least two user data by: extracting one or more user attribute features from data of the at least two user data related to user attributes of a first user of the application; extracting one or more user business behavior features from data of at least two user data related to the behavior of the first user's access application; one or more user non-business behavior features are extracted from data of the at least two user data related to a behavior of the first user other than the access application.
In some embodiments, user features may be extracted from user data in the following manner.
One-Hot Encoding (One-Hot Encoding). Also known as one-bit-effective encoding, it encodes N states using an N-bit state register: each state has its own register bit, and only one bit is valid at any time. For example, female may be encoded as [0,1] and male as [1,0]. One-hot encoding is suitable for features carrying little information, such as the user's gender. However, the feature vector obtained by one-hot encoding has a relatively large number of data items (a high feature dimension) and a relatively low feature density, so feature processing may be required to increase the feature density, for example, processing the female feature as [0] and the male feature as [1].
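The one-hot encoding and the binary densification mentioned above can be sketched in a few lines of Python (the gender example follows the encodings given in the text):

```python
def one_hot(value, categories):
    """Encode N states with N bits; exactly one bit is 1 at any time."""
    return [1 if value == c else 0 for c in categories]

def densify_binary(value, categories):
    """For a two-state feature, collapse the one-hot pair to a single bit."""
    assert len(categories) == 2
    return [categories.index(value)]

female = one_hot("female", ["male", "female"])            # [0, 1], as in the text
male = one_hot("male", ["male", "female"])                # [1, 0]
dense_female = densify_binary("female", ["female", "male"])  # [0]
dense_male = densify_binary("male", ["female", "male"])      # [1]
```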
Frequency Encoding (Count Encoding). For example, for a user's WiFi POI (Point of Interest) feature, frequency encoding may be used to characterize the user's degree of interest in a given POI. For example, the user has visited the "Food - Chinese Cuisine - Cantonese Cuisine" POI 3 times.
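Count encoding is a simple occurrence tally; the POI names below are hypothetical stand-ins for the example in the text:

```python
from collections import Counter

def count_encode(events):
    """Frequency (count) encoding: map each category to its occurrence count."""
    return Counter(events)

# Hypothetical visit log; "food/chinese/cantonese" stands in for the
# "Food - Chinese Cuisine - Cantonese Cuisine" POI of the example.
visits = ["food/chinese/cantonese"] * 3 + ["transport/metro"]
poi_counts = count_encode(visits)  # the Cantonese-cuisine POI is encoded as 3
```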
Category embedding (Category Embedding). According to data analysis, many types of features are found to have strong sparsity. By applying category embedding, the high-dimensional sparse classification variable can be converted into the low-dimensional dense embedded variable through the neural network, so that model overfitting is avoided and model stability is improved.
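Category embedding replaces a high-cardinality one-hot vector with a lookup into a dense table. A minimal sketch of the lookup mechanics follows; in practice the table is learned end-to-end by the neural network, whereas here it is randomly initialized, and the vocabulary values are hypothetical:

```python
import numpy as np

class CategoryEmbedding:
    """Map a high-cardinality categorical feature to a dense low-dim vector.

    The table would normally be trained jointly with the model; random
    initialization here only illustrates the sparse-to-dense conversion."""
    def __init__(self, vocab, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.index = {v: i for i, v in enumerate(vocab)}
        self.table = rng.normal(size=(len(vocab) + 1, dim))  # last row: out-of-vocab

    def __call__(self, value):
        return self.table[self.index.get(value, len(self.table) - 1)]

emb = CategoryEmbedding(vocab=["android_4.4", "android_5.0", "ios_14"], dim=4)
vec = emb("android_4.4")  # dense 4-d vector instead of a sparse one-hot
```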
NaN Embedding (NaN Embedding). By converting missing values into embedded representations in a NaN-embedding manner, a larger positive gain in model effect is obtained in the PUSH content scene compared with methods such as "discard", "mean fill", and "missing flag".
Merging codes (Consolidation Encoding). At least two values under certain categorical variables may be merged into the same information. For example, the three versions "4.2", "4.4", and "5.0" among the at least two values of the android phone system-version feature can be generalized to "low-version android system". Compared with directly one-hot encoding the android system version, this feature-generalization processing can bring larger positive gains.
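Consolidation encoding amounts to a many-to-one mapping applied before encoding; the version strings follow the example in the text, while the category names are illustrative:

```python
def consolidate_android_version(version):
    """Merge several raw category values into one coarser category before encoding."""
    return "low_version_android" if version in {"4.2", "4.4", "5.0"} else "other"

labels = {v: consolidate_android_version(v) for v in ["4.2", "4.4", "5.0", "11"]}
# "4.2"/"4.4"/"5.0" all collapse to the single "low_version_android" category.
```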
Feature Scaling (Feature Scaling). According to the distribution condition of the numerical type features, a proper normalization method is selected to eliminate the dimension difference between the features, so that the model can be more stable. For example, gaussian normalization is selected for features that fit or approximate fit a normal distribution.
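For the Gaussian normalization mentioned above, a minimal sketch (standardizing to zero mean and unit variance using population statistics) is:

```python
import statistics

def gaussian_normalize(values):
    """Standardize an (approximately) normal feature to zero mean, unit variance."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)   # population standard deviation
    return [(v - mu) / sigma for v in values]

normed = gaussian_normalize([2, 4, 4, 4, 5, 5, 7, 9])  # mean 5, std 2
```

After scaling, features measured in different units share a comparable range, which tends to stabilize model training.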
WiFi track embedding (WiFi Trajectory Embedding). The WiFi trajectory data of the user may be embedded based on a minimum spanning tree convolutional neural network (MST-CNN) deep learning network, thereby capturing Pattern information of the user behavior.
App traffic embedding (App Traffic Embedding). Based on List Embedding, the user's app traffic is embedded and extracted as a behavior sequence, so that low-dimensional, dense user behavior features can be obtained.
Points-collection behavior embedding. Based on a Convolutional Neural Network (CNN) and a Long Short-Term Memory neural network (LSTM), the user's behavior features before and after points are collected are embedded and extracted, yielding low-dimensional, dense user behavior features containing time-sequence information.
Features extracted in the above manners may be represented as vectors, such as [a0, a1, ..., an]. Thus, a feature may also be referred to as a "feature vector". The extracted features can be classified: for example, the user's gender feature and low-version android phone feature are classified as user attribute features; visiting the "Food - Chinese Cuisine - Cantonese Cuisine" POI three times a week is classified as a user non-business behavior feature; and the user's WiFi trajectory, application traffic, and points-collection behavior are classified as user business behavior features. As described above, the extracted features may also be divided into embedded features and non-embedded features according to the feature extraction method employed.
For example, as shown in fig. 5A, in the raw data accumulation section 510, the recognition model training server 110 may perform feature extraction (step 513), and store the extracted features in the data storage server 112 (step 514). As another example, as shown in fig. 5B, in the offline section 550, the user behavior feature may be extracted from the user history score acquisition behavior data (step 552), and the device portrait feature, the user behavior portrait feature, and the user recent score acquisition feature of the user may be extracted from the user data stored in the correlation table_1, the correlation table_2, and the correlation table_n (step 555).
In some embodiments, the recognition model training server 110 may further perform feature processing on the extracted one or more user attribute features, one or more user business behavior features, and one or more user non-business behavior features to increase feature density; feature processing includes processing of embedded features and processing of non-embedded features.
For example, as shown in fig. 5A, in the data feature engineering section 520, an appropriate feature processing method may be selected according to the data characteristics of the original features stored in the data storage server 112 (e.g., HDFS). For example, for non-embedded features, the computation may be performed by conventional non-embedding feature engineering methods based on the Spark calculation engine (step 521), and the result stored in HDFS (step 522); for embedded features, the computation may be performed by embedded deep-learning feature engineering methods based on the TensorFlow calculation engine (step 523), and the results stored in the HDFS (step 524). In addition, textual-description-type features can be processed with a tokenizer into integer sequences/matrices (Int Sequence/Matrix) as embedded features.
For another example, as shown in FIG. 5B, in offline portion 550, feature conversion may be performed (step 553) to obtain a user behavior feature vector (step 554).
Returning to FIG. 3, at step 303, the recognition model training server 110 may generate at least two samples for training the recognition model from at least two user features. In some embodiments, at least two samples are generated from at least two user features by: splicing one or more user attribute characteristics of the first user; splicing one or more user business behavior characteristics of the first user; splicing one or more non-business behavior characteristics of the first user; and splicing the spliced user attribute characteristics, the spliced user business behavior characteristics and the spliced user non-business behavior characteristics to form sample data of the first user.
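The stitching in step 303 can be sketched as follows: each feature group is concatenated internally, and the three group vectors are then concatenated into one sample. The concrete feature values are hypothetical:

```python
import numpy as np

def build_sample(attr_feats, business_feats, non_business_feats):
    """Stitch each feature group, then stitch the groups into one sample vector."""
    groups = [np.concatenate([np.asarray(f, dtype=float) for f in g])
              for g in (attr_feats, business_feats, non_business_feats)]
    return np.concatenate(groups)

sample = build_sample(
    attr_feats=[[1.0], [0.0, 1.0]],   # e.g. gender bit, low-version-android one-hot
    business_feats=[[0.3, 0.7]],      # e.g. points-collection behavior embedding
    non_business_feats=[[3.0]],       # e.g. POI visit count
)
```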
For example, as shown in fig. 5A, in the model training and evaluation section 530, a concatenation of features may be performed for the extracted user features of the same user (step 531), thereby forming sample data for training the recognition model. Here, if the extraction of the features has not been performed before, the extraction of the features and the stitching of the features should be performed in combination in this step 531.
As another example, as shown in fig. 5B, in offline portion 550, stitching of features may be performed (step 556). For example, the device portrait, user behavior portrait, and recent points-collection features of the user extracted in step 555 can be stitched (step 556: portrait feature stitching) to obtain the user's device portrait, user behavior portrait, and recent points-collection portrait (step 557); the stitched multidimensional user portrait is then stitched with the user behavior feature vector obtained in step 554 (step 558: user portrait vector stitching), thereby forming sample data for training the recognition model.
Returning to FIG. 3, at step 304, the recognition model training server 110 may train the recognition model using at least two samples, the training specifically including at least one of: a first user characteristic of the at least two user characteristics is weighted and a second user characteristic of the at least two user characteristics is associated with a third user characteristic of the at least two user characteristics.
For example, as shown in fig. 5A, in the model training and evaluation section 530, model training may be performed with respect to the formed sample data (step 532). For another example, as shown in FIG. 5B, in offline portion 550, model training may be performed on the formed sample data (step 559). For example, features may be read out of HDFS to the local based on Hive SQL, and then model training based on TensorFlow.
In some embodiments, as shown in fig. 6A, model construction can be performed based on the FiBiNet model. The FiBiNet model adds two structures, a squeeze-excitation network (SENET) layer and a Bilinear Interaction layer: the SENET structure dynamically learns the importance of features, and a bilinear function learns the cross features between features. Thus, both the importance of features and the interactions between features can be taken into account, improving the recognition performance of the trained model. Furthermore, compared with computing cross features using the inner product or Hadamard product of feature vectors, learning them with a bilinear function captures the cross features between features more fully, further improving the recognition performance of the trained model. It should be appreciated that in some embodiments, adding only the SENET layer, or only the bilinear interaction layer, already improves the recognition performance of the trained model relative to existing methods.
In some embodiments, the recognition model includes a depth portion and a non-depth portion, weighting a first user feature of the at least two user features being performed in the non-depth portion, and associating a second user feature with a third user feature being performed in the non-depth portion. In some embodiments, the first user characteristic and the second user characteristic may be the same or different user characteristics; the first user characteristic and the third user characteristic may be the same or different user characteristics; and the second user characteristic and the third user characteristic may be different user characteristics. For example, as shown in fig. 6A, in the FiBiNet model, it can be divided into a non-depth portion 610 and a depth portion 620, and the non-depth portion 610 includes a SENET layer and a bilinear interaction layer.
In some embodiments, as shown in fig. 6A, the non-depth portion 610 includes a sparse input layer 611, an embedding layer 612, a bilinear interaction layer 613, and a combining layer 614. The embedding layer 612 may further include a SENET layer 615. In the sparse input layer 611, at least two samples may be input, containing at least two features; each feature (feature vector) may be implemented in the form of a field, such as field 1, field 2, …, field f. At the embedding layer 612, the features may be embedded, resulting in embedded features 631. Furthermore, at the embedding layer 612, the embedded features 631 may be further embedded by the SENET layer 615 using a Squeeze-Excitation (SE) method, for example weighting at least a first one of the features, thereby taking the importance of the features themselves into account. By this further SE embedding, SE-type embedded features 632 may be obtained. In the bilinear interaction layer 613, bilinear operations may be performed separately on the embedded features 631 and the SE-type embedded features 632 resulting from the embedding layer 612, so that interaction features between features can be learned. In the combining layer 614, the bilinear-operated embedded features and the bilinear-operated SE-type embedded features may be combined.
In some embodiments, as shown in fig. 6A, at least two hidden layers 621 (hidden layers) are included in the depth portion 620, such as layer 1, layer 2, …, layer L. The at least two hidden layers 621 are fully connected layers that enable deep learning of the features combined via the combining layer 614, so that a model for identifying whether the user's abnormal behavior feature is included in the at least two features inputted can be obtained. It should be appreciated that the depth portion may also be configured as various other depth neural networks (Deep Neural Networks, DNN), which will not be described in detail in this disclosure.
In some embodiments, a first user feature of the at least two user features is weighted by: compressing a user feature set in at least two user features by using a pooling operation to obtain a statistical vector for the user feature set; obtaining weight vectors for the set of user features by machine learning based on the statistical vectors; weighting at least a first user feature of the set of user features based on the weight vector, the weighting comprising: the user feature set is multiplied by a weight vector.
In some embodiments, in the SENET layer 615 as shown in fig. 6A, the importance of different features may be learned, important features weighted, and/or less informative features attenuated. In some embodiments, the embedded vector E = [e1, ..., ei, ..., ef] of the user feature set (also referred to as a feature group) (e.g., embedded features 631) may be used as input to generate the feature-group weight vector A = [a1, ..., ai, ..., af]; the embedded vector E is then multiplied by A to yield the SE-type embedded vector V = [v1, ..., vi, ..., vf] (e.g., SE-type embedded features 632).
For example, the following steps may be employed to perform the feature weighting operation described above.
Compression (Squeeze): the embedded vectors E in the feature group are summarized statistically. For example, a pooling operation may be used to compress the embedded vector E = [e1, ..., ei, ..., ef] into a statistical vector Z = [z1, ..., zi, ..., zf], where zi represents the global information of the i-th feature. zi may be calculated by average pooling or by maximum pooling; this disclosure does not enumerate the possibilities exhaustively.
Excitation: based on the embedded vector E = [e1, ..., ei, ..., ef] and the statistical vector Z = [z1, ..., zi, ..., zf] of the feature group, the importance weights of the feature group are learned (the weight vector A = [a1, ..., ai, ..., af]). For example, the weight vector A may be learned from the at least two samples using a two-layer neural network in which the first layer is a dimension-reduction layer and the second layer is a dimension-restoration layer. Other ways of learning the weight vector A may also be used and are not described here.
Re-weighting (Re-Weight): the embedded vector E is re-weighted with the weight vector A = [a1, ..., ai, ..., af] resulting from the excitation operation. For example, the embedded vector E may be weighted by multiplying it by the weight vector A.
By the above method, information about the importance of the feature (Feature Importance) can be obtained by re-embedding the embedded feature with the SENET layer.
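The Squeeze/Excitation/Re-Weight steps above can be sketched in numpy as follows. This is a minimal illustration only: the two-layer excitation network is randomly initialized rather than learned, and the ReLU activations and average pooling are assumed choices.

```python
import numpy as np

def senet_reweight(E, W1, W2):
    """Squeeze-Excitation re-weighting of f embedded feature fields.

    E:  (f, k) matrix, one k-dim embedding per feature field.
    W1: (f, f//r) dimension-reduction weights; W2: (f//r, f) restoration weights."""
    relu = lambda x: np.maximum(x, 0.0)
    Z = E.mean(axis=1)              # squeeze: average-pool each embedding -> (f,)
    A = relu(relu(Z @ W1) @ W2)     # excitation: two-layer net -> weight vector (f,)
    return E * A[:, None]           # re-weight: scale each field's embedding by ai

rng = np.random.default_rng(1)
f, k, r = 4, 3, 2                   # 4 fields, embedding dim 3, reduction ratio 2
E = rng.normal(size=(f, k))
V = senet_reweight(E, rng.normal(size=(f, f // r)), rng.normal(size=(f // r, f)))
```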
In some embodiments, the second user feature is associated with the third user feature by using a bilinear operation, the bilinear operation comprising a first linear operation and a second linear operation different from each other, the first linear operation being an inner-product calculation and the second linear operation being a Hadamard-product calculation. Associating the second user feature with the third user feature using the bilinear operation specifically comprises: calculating the inner product of the second user feature and a cross matrix by the first linear operation; and calculating the Hadamard product of that inner product and the third user feature by the second linear operation. By setting the second user feature and the third user feature to be different user features, spurious correlations of a feature with itself can be avoided, preventing model overfitting.
In some embodiments, fig. 6B illustrates the inner-product calculation, the Hadamard-product calculation, and the bilinear operation. In the inner-product calculation (e.g., FM, FFM, etc.), for an input 4×1 vector and 1×4 vector, a 1×1 result is obtained. In the Hadamard-product calculation (e.g., FM, FFM, etc.), for an input 4×1 vector and another 4×1 vector, a 4×1 result is obtained. The specific details of the inner-product and Hadamard-product calculations of two vectors are not described here. In some embodiments, by employing the bilinear operation, which combines the inner-product and Hadamard-product calculations and introduces an additional parameter matrix W (such as the 4×4 matrix shown in fig. 6B) to learn the feature crossings, feature crossings can be modeled better on sparse data.
For example, as shown in fig. 6B, in bilinear operation, the input 4×1 vector (the second user feature) and the parameter matrix W may be first subjected to inner product calculation, and then the obtained inner product and the input another 4×1 vector (the third user feature) may be subjected to hadamard product calculation, so as to obtain the output 4×1 vector. Regarding the setting of the parameter matrix W, the same parameter matrix W may be adopted for all the combination modes of all the input vectors; the parameter matrix W1 may also be employed for one or more combinations and the parameter matrix W2 may be employed for other combinations, which disclosure is not intended to be exhaustive.
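The bilinear operation described above can be sketched as follows. The concrete vectors and the identity parameter matrix are illustrative only; with W = I the operation degenerates to a plain Hadamard product, which makes the result easy to check.

```python
import numpy as np

def bilinear_interaction(vi, vj, W):
    """FiBiNet-style bilinear cross feature: inner product of vi with the
    parameter matrix W, followed by a Hadamard product with vj."""
    return (vi @ W) * vj            # (k,)@(k,k) -> (k,), then elementwise with (k,)

k = 4
vi = np.array([1.0, 2.0, 0.0, 1.0])  # second user feature (embedding)
vj = np.array([0.5, 1.0, 1.0, 2.0])  # third user feature (embedding)
W = np.eye(k)                        # learnable k x k cross matrix (here: identity)
p = bilinear_interaction(vi, vj, W)  # 4x1 output cross-feature vector
```

In training, W (or several matrices W1, W2, ... shared across different field combinations, as the text notes) would be learned jointly with the rest of the model.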
In the above manner, by using the bilinear interaction layer to learn feature interactions for the embedded features and the SE-type embedded features respectively, the cross features between features can be obtained, so that user abnormal behavior can be recognized.
After the above-described setting of the non-depth portion 610, the recognition model may be trained with the input at least two samples through the at least two hidden layers 621 contained in the depth portion. For example, the at least two input user samples may be used to set the weight vector A, the parameter matrix W, and the dimension-reduction ratio (Reduction Ratio) of the dimension-reduction layer in an iterative manner, so that the recognition model achieves the best recognition effect on user abnormal behavior. For example, a better recognition effect can be obtained when the reduction ratio is set to 6 to 8. It should be appreciated that a variety of training methods for Deep Neural Networks (DNN) may be employed to train the recognition model, which will not be described in detail in this disclosure.
Through this step 304, when training the recognition model using at least two samples, the recognition model training server 110 weights a first user feature of the at least two user features, and/or correlates a second user feature of the at least two user features with a third user feature of the at least two user features, so that the recognition rate of the resulting recognition model can be improved by taking into account, during model training, both the importance of the input features themselves and the cross features between features. In some embodiments, a FiBiNet-based recognition model can be employed for scenes that contain a large number of categorical features (Categorical Feature) whose cardinalities (Feature Cardinality) are themselves high. In other embodiments, for scenes with more dense features and more numerical features containing many outliers, recognition models based on other deep-learning network structures may be used.
Returning to FIG. 3, at step 305, the recognition model training server 110 may perform online and/or offline evaluation of the trained recognition model based on one or more evaluation metrics to determine whether the recognition model meets expectations. In some embodiments, the recognition model training server 110 may adjust parameters of the recognition model based on the evaluation result; the adjusting specifically includes: adjusting parameters of the recognition model based on the online evaluation result and/or the offline evaluation result until the online evaluation result and/or the offline evaluation result falls within a preset numerical range. In some embodiments, as shown in FIG. 3, if the recognition model does not meet expectations, the method 300 returns to step 304 to retrain the recognition model until it does.
For example, as shown in FIG. 5A, in model training and evaluation portion 530, model online evaluation (step 534) and/or offline evaluation (step 533) may be performed on the recognition model trained in step 532. If the model effect does not meet expectations (step 535), the method may return to step 532 to adjust the parameters of the recognition model based on the evaluation results, so that the recognition model achieves a better recognition effect. As another example, as shown in FIG. 5B, in offline portion 550, model online evaluation (step 561) and/or offline evaluation (step 560) may be performed on the recognition model trained in step 559. It should be understood that the evaluation may be performed either offline or online, and the disclosure is not limited in this regard.
For example, in steps 533 and 560, the recognition model may be evaluated offline using a series of mathematical metrics, such as the area under the curve (AUC) or Gini normalization (Gini Normalization). In the case of AUC, the larger the AUC value, the more likely the trained recognition model is to rank positive samples (user abnormal behavior) before negative samples (user normal behavior), yielding a better recognition result. In the case of Gini normalization, the larger the value, the better the recognition effect of the trained recognition model.
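An offline-evaluation helper along these lines can be sketched as follows, assuming binary labels where 1 marks abnormal (positive) behavior. The AUC is computed with the Mann-Whitney rank statistic (ties are not handled in this sketch), and the normalized Gini shown uses the common identity Gini = 2 * AUC - 1, which is an assumption since the patent does not give its exact Gini formula:

```python
# Offline evaluation sketch: rank-based AUC and a normalized Gini derived
# from it. Labels: 1 = abnormal behavior (positive), 0 = normal (negative).
import numpy as np

def auc(labels: np.ndarray, scores: np.ndarray) -> float:
    """Probability that a random positive sample outranks a random negative one."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)   # 1-based ranks by score
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

labels = np.array([1, 0, 1, 0, 1, 0])
scores = np.array([0.9, 0.2, 0.8, 0.4, 0.7, 0.1])  # model ranks all positives first
a = auc(labels, scores)
gini_normalized = 2 * a - 1
```

A perfect ranking, as in the toy data above, gives AUC = 1.0 and normalized Gini = 1.0; a random ranking gives roughly 0.5 and 0.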
For example, in steps 534 and 561, the recognition model may be evaluated online. For example, an A/B test may be performed: the users are divided into group A and group B, user behavior recognition is performed for group A users using the trained model and for group B users using other models, and the metrics of the two groups are compared. Metrics that may be used for the evaluation include, but are not limited to, the user complaint rate and the multi-day average user retention.
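One common way to assign users deterministically to group A or group B for such a test is a stable hash of the user identifier. The patent does not specify the split method, so the following is only an illustrative assumption:

```python
# Deterministic A/B bucketing sketch: hash the user id so that the same
# user always lands in the same group across sessions and servers.
import hashlib

def ab_group(user_id: str) -> str:
    """Bucket a user into group 'A' or 'B' based on a stable hash."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

groups = {uid: ab_group(uid) for uid in ("user-1", "user-2", "user-3")}
```

A deterministic split avoids a user flipping between the trained model and the baseline mid-experiment, which would contaminate per-group metrics such as complaint rate and retention.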
Returning to FIG. 3, if it is determined at step 305 that the recognition model meets expectations, the method 300 proceeds to step 306, where the recognition model training server 110 may freeze (solidify) the trained recognition model. For example, as shown in FIG. 5B, in offline portion 550, the parameters of the trained model may be set to fixed values based on the results of the online evaluation (step 561) and/or the offline evaluation (step 560). The recognition model training server 110 may then end training of the recognition model, store the frozen model in the data storage server 112, or issue it to the abnormal behavior recognition server 111 or the electronic device 130.
Returning to FIG. 3, at step 307, the abnormal behavior recognition server 111 may obtain one or more current user behaviors, which are user behaviors of the user when accessing the application. For example, as shown in fig. 5B, in online portion 540, abnormal behavior recognition server 111 may receive one or more user behaviors, such as a point pickup behavior that the user has just performed, from electronic device 130 each time the user operates application 132.
At step 308, the abnormal behavior recognition server 111 may extract one or more current user behavior features from the one or more current user behaviors. For example, features may be extracted from the one or more current point pickup behaviors and processed in a manner similar to step 302.
At step 309, the abnormal behavior recognition server 111 may generate a user behavior feature vector based on the one or more current user behavior features. In some embodiments, the abnormal behavior recognition server 111 may generate the user behavior feature vector by: retrieving the user's historical behavior features and user attribute features; and splicing the current user behavior features, the historical behavior features, and the user attribute features to obtain the user behavior feature vector.
For example, as shown in FIG. 5B, in online portion 540, whenever a user operates application 132 to request a service, abnormal behavior recognition server 111 may perform a feature query at data storage server 112 (e.g., a cache server) to retrieve the user's historical behavior features and user attribute features (step 542). The extracted one or more current user behavior features may then be stitched to generate a user real-time feature vector (step 543), and the user real-time feature vector is stitched with the retrieved historical behavior features and user attribute features of the user (step 544), thereby forming an input for the recognition model.
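The stitching in steps 543 and 544 can be sketched as a simple concatenation. This is a minimal NumPy illustration; the feature names, lengths, and values are assumptions:

```python
# Feature stitching sketch: concatenate the real-time features extracted
# from the current behavior with the retrieved historical behavior features
# and user attribute features to form the recognition model's input vector.
import numpy as np

current_features = np.array([1.0, 0.0, 3.5])          # from the current behavior
historical_features = np.array([0.2, 0.8, 5.0, 1.0])  # retrieved from the cache server
attribute_features = np.array([1.0, 0.0])             # e.g. one-hot user attributes

realtime_vec = current_features                        # step 543: real-time vector
behavior_vec = np.concatenate(                         # step 544: stitch all parts
    [realtime_vec, historical_features, attribute_features]
)
```

The order of concatenation must match the feature order used when the recognition model was trained, or the model's inputs will be misaligned.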
Returning to FIG. 3, at step 310, the abnormal behavior recognition server 111 may input the user behavior feature vector into the user abnormal behavior recognition model to recognize user abnormal behavior. In some embodiments, the abnormal behavior recognition server 111 may recognize the user's abnormal behavior by: determining the probability that the current user behavior is abnormal behavior; when the probability is greater than a threshold, determining the current user behavior to be abnormal behavior; and taking limiting measures against users with abnormal behaviors.
For example, as shown in FIG. 5B, in the online portion 540, the abnormal behavior recognition server 111 may obtain the recognition model from the recognition model training server 110, the data storage server 112, or the electronic device 130, and input the spliced user behavior feature vector into the recognition model to recognize whether the user's current point pickup behavior is malicious point pickup (step 545), such as by predicting the probability that the current user behavior is abnormal. Different automatic operations can then be performed for malicious point pickup behaviors in different probability ranges predicted by the model. For example, if the user's current point pickup behavior has a probability greater than 90% of being malicious point pickup behavior, the current point pickup may be automatically cancelled and the user may be prohibited from operating the application 132 for the next 3 days (i.e., the user account or user device may be banned).
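A decision rule of this shape can be sketched as follows. The top tier matches the example above (probability greater than 90% leads to cancellation and a 3-day ban); the intermediate tier and the action names are purely illustrative assumptions, since the patent only describes "different automatic operations for different probability ranges":

```python
# Probability-tiered decision sketch for step 545: map the model's predicted
# probability of malicious point pickup to an automatic operation.
def decide_action(prob_malicious: float) -> str:
    """Return the automatic operation for a predicted malice probability."""
    if prob_malicious > 0.9:
        return "cancel_pickup_and_ban_3_days"   # tier from the example above
    if prob_malicious > 0.5:                    # assumed intermediate tier
        return "flag_for_manual_review"
    return "allow"

actions = [decide_action(p) for p in (0.95, 0.7, 0.1)]
```

In practice the thresholds would be tuned against the online metrics (complaint rate, retention) discussed in the evaluation steps.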
At step 311, the recognition model training server 110 may receive at least two feedback from at least two users, and adjust parameters of the recognition model based on the at least two feedback, the at least two feedback including user complaints and user inactivity.
For example, as shown in FIG. 5B, in online portion 540, the abnormal behavior recognition server 111 may receive feedback on the malicious pickup judgment for a user from the electronic device 130, which may include positive feedback (e.g., customer service complaints, phone complaints, etc.) or negative feedback (e.g., the user uninstalling the application 132 or becoming inactive). For example, after the user's feedback is received, it may be accumulated as the user business behavior features used in step 302, and when a certain amount of feedback has accumulated, it may be sent to the recognition model training server 110 to retrain the recognition model. As another example, the feedback may be provided to the provider of the application 132, which may use manual inspection to determine whether the prediction of the recognition model should be upheld.
With the identification method of the embodiments of the present disclosure, by improving the recognition rate of user abnormal behavior (such as recognizing malicious point pickup behavior), low-quality bonus-hunting ("wool party") users can be filtered out more effectively and the activity of high-quality users can be promoted. The computing and network resources of the application provider can thus be devoted more effectively to normal users, the provider's operating cost can be reduced, operating efficiency and product retention can be improved, and the user experience of normal users can be enhanced.
The above examples of identifying malicious point pickup behavior are merely to aid understanding of the present approach; it should be understood that embodiments of the present disclosure may also be applied to, but are not limited to, scenarios such as malicious advertisement clicks. That is, embodiments of the present disclosure have good reusability: first replace the user type of the sample (e.g., with "advertisement malicious click user"), then have the server accumulate the corresponding log data, and finally produce a result using the same feature splicing, feature processing, and model training method. In principle, only the type of the input sample needs to be adjusted to identify other types of malicious user labels, such as malicious advertisement click users.
Fig. 7 schematically illustrates a comparison of an identification method according to an embodiment of the present disclosure with existing identification methods. As shown in fig. 7, compared with the existing recognition method based on predetermined rules and the recognition method based on data mining, the FiBiNet-based recognition method of the embodiment of the present disclosure can greatly improve multiple online and offline metrics of the recognition model, such as offline AUC, online AUC, user complaint rate, and multi-day average user retention.
Specifically, in terms of offline AUC, the FiBiNet scheme improves by 26.48% on average compared with the other technical schemes; in terms of online AUC, it improves by 24.89% on average; in terms of user complaint rate, it decreases by 55.82% on average; and in terms of multi-day average user retention, it improves by 11.60% on average.
Fig. 8A schematically illustrates a block diagram of an apparatus 800 for training a recognition model in which the various methods described herein for training a recognition model may be implemented in accordance with an embodiment of the present disclosure. As shown in fig. 8A, an apparatus 800 for training a recognition model may include a user data management module 801, a user feature processing module 802, and a model training module 804. Furthermore, the apparatus 800 for training the recognition model may optionally further comprise a sample generation module 803 and a model evaluation module 805.
The user data management module 801 is configured to receive at least two user data of at least two users, the at least two user data including data related to user abnormal behavior. In some embodiments, the at least two user data are user data of at least two users of the application 132.
The user feature processing module 802 is configured to extract at least two user features from at least two user data. The extracted features include embedded features and non-embedded features.
The optional sample generation module 803 is configured to generate at least two samples for training the recognition model from at least two user features in a feature stitching manner.
The model training module 804 is configured to: perform at least one of weighting a first user feature of the at least two user features or associating a second user feature of the at least two user features with a third user feature of the at least two user features; combine the at least two user features after at least one of the weighting or the associating; and perform deep learning on the combined at least two user features.
The optional model evaluation module 805 is configured to evaluate the trained recognition model online and/or offline based on one or more evaluation metrics; the model training module is further configured to adjust parameters of the recognition model based on the results of the online evaluation and/or the results of the offline evaluation until the results of the online evaluation and/or the results of the offline evaluation are within a preset numerical range.
Fig. 8B schematically illustrates a block diagram of an apparatus 810 for identifying user abnormal behavior in accordance with an embodiment of the present disclosure, in which apparatus 810 various methods for identifying user abnormal behavior described herein may be implemented. As shown in fig. 8B, the means 810 for identifying user abnormal behavior may include a user data management module 811, a user feature processing module 812, a user feature vector generation module 813, and a user behavior identification module 814. Furthermore, in some embodiments, the means 810 for identifying user abnormal behavior may optionally further comprise a sample generation module 803, a model training module 804, and a model evaluation module 805 as shown in fig. 8A, and the user data management module 811 and the user feature processing module 812 may be further configured to also perform the operations of the user data management module 801 and the user feature processing module 802, respectively, such that the means 810 for identifying user abnormal behavior may train the identification model, and may use the trained identification model to identify user abnormal behavior.
The user data management module 811 is configured to obtain one or more current user actions. In some embodiments, the one or more current user behaviors are user behaviors of the application 132 when the user accesses the application 132.
The user feature processing module 812 is configured to extract one or more current user behavior features from one or more current user behaviors.
The user feature vector generation module 813 is configured to: and generating a user behavior feature vector based on one or more current user behavior features by adopting a feature stitching mode.
The user behavior recognition module 814 is configured to: the user behavior feature vector is input to a user abnormal behavior recognition model to recognize the user abnormal behavior.
Although specific functions are discussed above with reference to specific modules, it should be noted that the functions of the various modules discussed herein may be separated into at least two modules and/or at least some of the functions of at least two modules may be combined into a single module. Additionally, a particular module performing an action discussed herein includes the particular module itself performing the action, or alternatively the particular module invoking or otherwise accessing another component or module performing the action (or performing the action in conjunction with the particular module). Thus, a particular module that performs an action may include the particular module itself that performs the action and/or another module that the particular module that performs the action invokes or otherwise accesses.
The various modules described above with respect to FIGS. 8A and 8B may be implemented in hardware or in hardware combined with software and/or firmware. For example, the modules may be implemented as computer program code/instructions configured to be executed in one or more processors and stored in a computer-readable storage medium. Alternatively, these modules may be implemented as hardware logic/circuitry. For example, in some embodiments, one or more of user data management module 801, user feature processing module 802, sample generation module 803, model training module 804, and model evaluation module 805 may be implemented together in a system on a chip (SoC). In some embodiments, one or more of user data management module 811, user feature processing module 812, user feature vector generation module 813, and user behavior identification module 814 may be implemented together in a SoC. The SoC may include an integrated circuit chip (which includes one or more components of a processor (e.g., a Central Processing Unit (CPU), microcontroller, microprocessor, Digital Signal Processor (DSP), etc.), memory, one or more communication interfaces, and/or other circuitry), and may optionally execute received program code and/or include embedded firmware to perform functions. The features of the techniques described herein are carrier-independent, meaning that the techniques may be implemented on a variety of computing platforms having a variety of processors.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer programs. For example, embodiments of the present disclosure provide a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing at least one step of the method embodiments of the present disclosure.
Fig. 9 schematically illustrates a block diagram of a computing device 900 according to an embodiment of the disclosure. The computing device 900 is representative of one or more of the recognition model training server 110, the abnormal behavior recognition server 111, the data storage server 112, and the electronic device 130 included in the abnormal behavior recognition system 100 of fig. 1.
Computing device 900 may be a variety of different types of devices such as a server computer, a device associated with a client (e.g., a client device), a system-on-chip, and/or any other suitable computing device or computing system.
Computing device 900 may include at least one processor 902, memory 904, at least two communication interfaces 906, a display device 908, other input/output (I/O) devices 910, and one or more mass storage devices 912, capable of communicating with each other, such as by a system bus 914 or other suitable means.
The processor 902 may be a single processing unit or at least two processing units, all of which may include a single or at least two computing units or at least two cores. The processor 902 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. The processor 902 may be configured to, among other capabilities, obtain and execute computer-readable instructions stored in the memory 904, mass storage 912, or other computer-readable medium, such as program code of the operating system 916, program code of the application programs 918, program code of other programs 920, etc., to implement the method 200 for training an identification model or method 210 for identifying user abnormal behavior provided by embodiments of the present disclosure.
Memory 904 and mass storage device 912 are examples of computer storage media for storing instructions that are executed by processor 902 to implement the various functions as previously described. For example, the memory 904 may generally include both volatile memory and nonvolatile memory (e.g., RAM, ROM, etc.). In addition, mass storage device 912 may generally include hard disk drives, solid state drives, removable media (including external and removable drives), memory cards, flash memory, floppy disks, optical disks (e.g., CDs, DVDs), storage arrays, network storage, storage area networks, and the like. Memory 904 and mass storage device 912 may both be referred to herein collectively as memory or computer storage medium, and may be non-transitory media capable of storing computer-readable, processor-executable program instructions as computer program code that may be executed by processor 902 as a particular machine configured to implement the operations and functions described in the examples herein.
At least two program modules may be stored on mass storage device 912. These programs include an operating system 916, one or more application programs 918, other programs 920, and program data 922, and they may be loaded into the memory 904 for execution. Examples of such application programs or program modules may include, for example, computer program logic (e.g., computer program code or instructions) for implementing the following components/functions: user data management module 801, user feature processing module 802, sample generation module 803, model training module 804, model evaluation module 805, user data management module 811, user feature processing module 812, user feature vector generation module 813, and user behavior identification module 814 and/or further embodiments described herein. In some embodiments, these program modules may be distributed across different physical locations, such as the recognition model training server 110, the abnormal behavior recognition server 111, the data storage server 112, and the electronic device 130 shown in FIG. 1, to implement corresponding functions.
Although illustrated in fig. 9 as being stored in memory 904 of computing device 900, modules 916, 918, 920, and 922, or portions thereof, may be implemented using any form of computer readable media accessible by computing device 900. As used herein, a "computer-readable medium" may include one or more types of computer-readable media, which may include, for example, computer storage media and/or communication media.
Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD), or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium which can be used to store information for access by a computing device.
In contrast, communication media may embody computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism. Computer storage media as defined herein do not include communication media.
Computing device 900 may also include one or more communication interfaces 906 for exchanging data with other devices, such as over a network, direct connection, etc. Communication interface 906 may facilitate communication over a variety of networks and protocol types, including wired networks (e.g., LAN, cable, etc.) and wireless networks (e.g., WLAN, cellular, satellite, etc.), the Internet, and so forth. The communication interface 906 may also provide for communication with external storage devices (not shown) such as in a storage array, network storage, storage area network, or the like.
In some examples, a display device 908, such as a display, may be included for displaying information and images. Other I/O devices 910 may be devices that receive various inputs from a user and provide various outputs to the user, and may include touch input devices, gesture input devices, cameras, keyboards, remote controls, mice, printers, audio input/output devices, and so on.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the system, apparatus, and modules described above may refer to the corresponding process in the foregoing method embodiments, which is not repeated herein. In this application, the collection and processing of relevant data should strictly comply with the requirements of applicable national laws and regulations, obtain the informed consent or separate consent of the personal information subject, or have another necessary legal basis, and subsequent data use and processing should be carried out within the scope authorized by laws, regulations, and the personal information subject.
Variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed subject matter, from a study of the drawings, the disclosure, and the appended claims. In the claims, "A and/or B" means A, B, or A and B; the word "comprising" does not exclude other elements or steps; and the indefinite article "a" or "an" does not exclude a plurality. The terms "first," "second," "third," and "fourth" are used merely to distinguish one element or step from another and do not denote a sequence of elements or steps. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Claims (11)

1. A method for identifying user abnormal behavior, comprising:
acquiring one or more current user behaviors, wherein the one or more current user behaviors are user behaviors of a user when accessing an application;
extracting one or more current user behavior features from the one or more current user behaviors;
retrieving historical behavior characteristics and user attribute characteristics of the user;
splicing the extracted one or more current user behavior characteristics to generate a user real-time characteristic vector;
splicing the user real-time feature vector with the retrieved historical behavior feature and the user attribute feature of the user to obtain the user behavior feature vector;
inputting the user behavior feature vector into a user abnormal behavior recognition model to recognize the user abnormal behavior;
the abnormal behavior recognition model of the user is obtained through the following steps:
receiving at least two user data of at least two users of the application, the at least two user data comprising data relating to abnormal behavior of the users during use of the application;
extracting at least two user features from the at least two user data;
Weighting a first user feature of the at least two user features and associating a second user feature of the at least two user features with a third user feature of the at least two user features;
combining the weighted and associated at least two user features; and
deep learning the combined at least two user features to obtain a model for identifying whether the input at least two features contain abnormal behavior features of the user;
wherein extracting at least two user features from the at least two user data comprises:
extracting one or more user attribute features from data of the at least two user data related to user attributes of a first user of the application;
extracting one or more user business behavior features from data of the at least two user data related to a behavior of the first user accessing the application;
extracting one or more user non-business behavior features from data of the at least two user data related to a behavior of the first user other than accessing the application; and
Performing feature processing on the one or more extracted user attribute features, the one or more user business behavior features and the one or more user non-business behavior features to increase feature density;
the extracted features include embedded features and non-embedded features.
2. The method of claim 1, wherein the method further comprises:
generating at least two samples for training the recognition model from the at least two user features, specifically comprising:
splicing the one or more user attribute features of the first user;
splicing the one or more user business behavior characteristics of the first user;
splicing the one or more user non-business behavior characteristics of the first user;
and splicing the spliced user attribute characteristics, the spliced user business behavior characteristics and the spliced user non-business behavior characteristics to form sample data of the first user.
3. The method of claim 1, wherein weighting a first user characteristic of the at least two user characteristics comprises:
compressing a set of user features of the at least two user features using a pooling operation to obtain a statistical vector for the set of user features;
Obtaining weight vectors for the set of user features by machine learning based on the statistical vectors;
weighting at least the first user feature of the set of user features based on the weight vector, the weighting comprising: multiplying the set of user features with the weight vector.
4. The method of claim 1, wherein associating the second user characteristic with the third user characteristic specifically comprises:
associating the second user feature with the third user feature using a bilinear operation, the bilinear operation comprising a first linear operation and a second linear operation that are different from each other, the first linear operation being an inner product calculation, the second linear operation being a Hadamard product calculation,
using the bilinear operation to associate the second user feature with the third user feature specifically includes:
calculating an inner product of the second user feature and a cross matrix by the first linear operation; and
calculating the Hadamard product of the inner product and the third user feature by the second linear operation.
5. The method according to any one of claims 1-4, characterized in that the method comprises in particular:
Performing online and/or offline evaluation of the trained recognition model based on one or more evaluation metrics; and
adjusting parameters of the recognition model based on the evaluation result,
the adjustment specifically comprises: and adjusting parameters of the identification model based on the online evaluation result and/or the offline evaluation result until the online evaluation result and/or the offline evaluation result is in a preset numerical range.
6. The method according to any one of claims 1-4, characterized in that the method comprises in particular:
receiving at least two feedback from at least two users; and
adjusting parameters of the recognition model based on the at least two feedback,
the at least two feedback includes user complaints and user inactivity.
7. The method of any one of claims 1 to 4, wherein
the recognition model includes a depth portion and a non-depth portion, weighting a first user feature of the at least two user features being performed in the non-depth portion, associating the second user feature with the third user feature being performed in the non-depth portion;
The first user characteristic and the second user characteristic are the same or different user characteristics;
the first user characteristic and the third user characteristic are the same or different user characteristics;
the second user characteristic is a different user characteristic than the third user characteristic;
the recognition model includes a FibiNet model.
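In a FiBiNET-style model, the non-depth (shallow) portion weights the embedded feature fields with a SENET-like squeeze-and-excitation step before the bilinear interaction. A minimal NumPy sketch, with the shapes and ReLU excitation chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_FIELDS, DIM, HIDDEN = 3, 4, 2
E = rng.normal(size=(NUM_FIELDS, DIM))       # embedded feature fields
W1 = rng.normal(size=(NUM_FIELDS, HIDDEN))   # excitation layer 1
W2 = rng.normal(size=(HIDDEN, NUM_FIELDS))   # excitation layer 2

def senet_weighting(E, W1, W2):
    """Weight feature fields in the shallow (non-depth) portion.
    Squeeze: pool each field's embedding to a scalar.
    Excite: two small linear layers yield one weight per field.
    Re-weight: scale each field's embedding by its weight."""
    z = E.mean(axis=1)                  # squeeze: (NUM_FIELDS,)
    a = np.maximum(z @ W1, 0.0)         # excite, hidden layer (ReLU)
    w = np.maximum(a @ W2, 0.0)         # one weight per field
    return E * w[:, None], w

E_weighted, w = senet_weighting(E, W1, W2)
print(E_weighted.shape)   # (3, 4)
```

The re-weighted embeddings then feed the bilinear interaction, while the depth portion consumes the concatenated embeddings through a standard multilayer perceptron.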
8. The method of claim 1, wherein identifying user abnormal behavior specifically comprises:
determining a probability that the current user behavior is an abnormal behavior;
determining the current user behavior to be an abnormal behavior when the probability is greater than a threshold; and
taking restrictive measures against users exhibiting abnormal behavior.
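The decision rule of claim 8 amounts to a simple probability comparison; the threshold value and the restriction callback below are illustrative, not drawn from the patent:

```python
def is_abnormal(probability, threshold=0.8):
    """Flag the behavior as abnormal when the model's probability
    exceeds the threshold (threshold value is illustrative)."""
    return probability > threshold

def handle_behavior(probability, restrict_user):
    """Apply a restrictive measure (hypothetical callback) when the
    current behavior is judged abnormal."""
    if is_abnormal(probability):
        restrict_user()   # e.g. rate-limit or suspend the account

print(is_abnormal(0.93))   # True
print(is_abnormal(0.41))   # False
```

The threshold trades off false positives (restricting legitimate users) against false negatives (missing abuse), which is why claims 5 and 6 tie it back to evaluation and user feedback.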
9. An apparatus for identifying abnormal behavior of a user, comprising:
a user data management module configured to obtain one or more current user behaviors, the one or more current user behaviors being user behaviors of a user when accessing an application;
a user feature processing module configured to extract one or more current user behavior features from the one or more current user behaviors;
a user feature vector generation module configured to: retrieve historical behavior features and user attribute features of the user; concatenate the one or more extracted current user behavior features to generate a real-time user feature vector; and concatenate the real-time user feature vector with the retrieved historical behavior features and user attribute features of the user to obtain a user behavior feature vector; and
a user behavior recognition module configured to input the user behavior feature vector into a user abnormal behavior recognition model to identify user abnormal behavior;
wherein the user abnormal behavior recognition model is trained by a device for training the recognition model, and the device for training the recognition model comprises:
a user data management module configured to receive at least two user data of at least two users of the application, the at least two user data including data relating to user abnormal behavior during use of the application;
a user feature processing module configured to extract at least two user features from the at least two user data;
a model training module configured to: perform at least one of weighting a first user feature of the at least two user features or associating a second user feature of the at least two user features with a third user feature of the at least two user features; combine the at least two user features on which at least one of the weighting or the associating has been performed; and perform deep learning on the combined at least two user features to obtain a model for identifying whether at least two input features contain user abnormal behavior features;
wherein the at least two user data are user data of at least two users of the application, and the user feature processing module is further configured to:
extracting one or more user attribute features from data of the at least two user data related to user attributes of a first user of the application;
extracting one or more user business behavior features from data of the at least two user data related to a behavior of the first user accessing the application;
extracting one or more user non-business behavior features from data of the at least two user data related to a behavior of the first user other than accessing the application; and
performing feature processing on the one or more extracted user attribute features, the one or more user business behavior features, and the one or more user non-business behavior features to increase feature density,
wherein the extracted features include embedded features and non-embedded features.
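The feature pipeline of claim 9 — embedding sparse categorical features to increase density, concatenating the current behavior features into a real-time vector, then appending historical and attribute features — can be sketched as follows; the table size, dimensions, and helper names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM = 100, 4
embedding_table = rng.normal(size=(VOCAB, DIM))   # learned embeddings

def embed_features(categorical_ids, numeric_values):
    """Increase feature density: sparse categorical IDs become dense
    embeddings (embedded features); numeric values pass through
    unchanged (non-embedded features)."""
    embedded = [embedding_table[i] for i in categorical_ids]
    return np.concatenate(embedded + [np.asarray(numeric_values, float)])

def build_user_feature_vector(current, historical, attributes):
    """Concatenate current behavior features into a real-time vector,
    then append historical behavior and user attribute features."""
    real_time = np.concatenate([np.atleast_1d(f) for f in current])
    return np.concatenate([real_time, historical, attributes])

behavior = embed_features([3, 42], [0.5])   # one current behavior: 2 IDs + 1 number
vec = build_user_feature_vector(
    current=[behavior],
    historical=np.zeros(3),                 # placeholder history features
    attributes=np.ones(2),                  # placeholder attribute features
)
print(vec.shape)   # (14,)
```

The resulting fixed-length vector is what the recognition module would feed to the trained model at serving time.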
10. A computing device, comprising:
a processor; and
a memory having instructions stored thereon, which when executed on the processor cause the processor to perform the method of any of claims 1-8.
11. One or more computer-readable storage media having instructions stored thereon, which when executed on one or more processors cause the one or more processors to perform the method of any of claims 1-8.
CN202011070544.7A 2020-10-09 2020-10-09 Recognition model training method and device, and user abnormal behavior recognition method and device Active CN114417944B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011070544.7A CN114417944B (en) 2020-10-09 2020-10-09 Recognition model training method and device, and user abnormal behavior recognition method and device


Publications (2)

Publication Number Publication Date
CN114417944A CN114417944A (en) 2022-04-29
CN114417944B true CN114417944B (en) 2024-04-09


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109714324A (en) * 2018-12-18 2019-05-03 中电福富信息科技有限公司 User network abnormal behaviour based on machine learning algorithm finds method and system
CN110263189A (en) * 2019-06-24 2019-09-20 腾讯科技(深圳)有限公司 Recommended method, device, storage medium and the computer equipment of media content
CN110363617A (en) * 2019-06-03 2019-10-22 北京三快在线科技有限公司 A kind of recommended method, device, electronic equipment and readable storage medium storing program for executing
CN110557447A (en) * 2019-08-26 2019-12-10 腾讯科技(武汉)有限公司 user behavior identification method and device, storage medium and server
CN111046294A (en) * 2019-12-27 2020-04-21 支付宝(杭州)信息技术有限公司 Click rate prediction method, recommendation method, model, device and equipment
CN111382403A (en) * 2020-03-17 2020-07-07 同盾控股有限公司 Training method, device, equipment and storage medium of user behavior recognition model

Non-Patent Citations (1)

Title
FiBiNET: Combining Feature Importance and Bilinear Feature Interaction for Click-Through Rate Prediction; Tongwen Huang et al.; arXiv:1905.09433v1; pp. 1-8 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant