CN114882587A

CN114882587A - Method, apparatus, electronic device, and medium for generating adversarial samples

Info

Publication number: CN114882587A
Application number: CN202210450101.3A
Authority: CN
Inventors: 刁云峰; 谭资昌; 郭国栋
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2022-04-26
Filing date: 2022-04-26
Publication date: 2022-08-09

Abstract

The disclosure provides a method, an apparatus, an electronic device, and a medium for generating an adversarial sample, and relates to the technical field of artificial intelligence, in particular to behavior recognition and adversarial defense training. The implementation scheme is as follows: obtaining a first human skeleton formed by connecting a plurality of joint nodes, together with corresponding skeleton data, wherein the skeleton data comprise a first rectangular coordinate representation of the first human skeleton in a preset rectangular coordinate system; determining a rectangular coordinate value of a root node among the plurality of joint nodes based on the first rectangular coordinate representation; constructing a spherical coordinate system based on the first human skeleton; determining, in the spherical coordinate system, a spherical coordinate value for each of the remaining joint nodes other than the root node; and applying a perturbation value to the angles of the bones in the first human skeleton, and determining an adversarial sample based on the applied perturbation value and the spherical coordinate value of each of the remaining joint nodes in the spherical coordinate system.

Description

Method, apparatus, electronic device, and medium for generating adversarial samples

Technical Field

The present disclosure relates to the field of artificial intelligence, in particular to behavior recognition and adversarial defense training, and more specifically to a method, an apparatus, an electronic device, a computer-readable storage medium, and a computer program product for generating an adversarial sample.

Background

Artificial intelligence is the discipline that studies how to make computers simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and it involves both hardware-level and software-level technologies. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like. Artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, knowledge graph technology, and the like.

Classification is a basic task of machine learning, and deep-learning-based methods have achieved state-of-the-art performance in many tasks, such as object recognition and segmentation. However, deep learning models are highly vulnerable to carefully crafted input perturbations, i.e., adversarial attacks. This vulnerability has attracted a great deal of attention because such perturbations are imperceptible to humans yet extremely disruptive to current artificial intelligence systems. Designing defense methods that are more effective against adversarial attacks has therefore become an emerging and important research area in recent years.

The approaches described in this section are not necessarily approaches that have been previously conceived or pursued. Unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, unless otherwise indicated, the problems mentioned in this section should not be considered as having been acknowledged in any prior art.

Disclosure of Invention

The present disclosure provides a method, an apparatus, an electronic device, a computer-readable storage medium, and a computer program product for generating an adversarial sample.

According to an aspect of the present disclosure, there is provided a method of generating an adversarial sample, comprising: obtaining a first human skeleton formed by connecting a plurality of joint nodes, together with skeleton data corresponding to the first human skeleton, wherein the skeleton data comprise a first rectangular coordinate representation of the first human skeleton in a preset rectangular coordinate system; determining a rectangular coordinate value of a root node among the plurality of joint nodes based on the first rectangular coordinate representation; constructing a spherical coordinate system based on the first human skeleton; determining, in the spherical coordinate system, a spherical coordinate value for each of the remaining joint nodes other than the root node; and applying a perturbation value to an angle of a bone in the first human skeleton, and determining an adversarial sample based on the applied perturbation value, the spherical coordinate value of each of the remaining joint nodes in the spherical coordinate system, and the rectangular coordinate value of the root node.

According to another aspect of the present disclosure, there is provided an adversarial defense training method for a skeleton behavior recognition model, comprising: acquiring, from a data set, sample data representing a human skeleton, manifold adversarial sample data corresponding to the sample data, and non-manifold adversarial sample data located outside the manifold corresponding to the data set, wherein the manifold adversarial sample data is obtained by the foregoing method of generating an adversarial sample; labeling the sample data, the manifold adversarial sample data, and the non-manifold adversarial sample data with the true classification value of the sample data, wherein the true classification value represents the behavior class corresponding to the human skeleton; inputting the sample data into the skeleton behavior recognition model and acquiring a first predicted classification value of the sample data; inputting the manifold adversarial sample data into the skeleton behavior recognition model and acquiring a second predicted classification value of the manifold adversarial sample data; inputting the non-manifold adversarial sample data into the skeleton behavior recognition model and acquiring a third predicted classification value of the non-manifold adversarial sample data; calculating a loss value based on the true classification value, the first predicted classification value, the second predicted classification value, and the third predicted classification value; and adjusting parameters of the skeleton behavior recognition model based on the loss value.
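As an illustration of the loss calculation step, the following is a minimal sketch that assumes the loss is a weighted sum of three softmax cross-entropy terms, one per predicted classification value. The disclosure only states that the loss is based on the true and the three predicted classification values; the function names `cross_entropy` and `defense_loss` and the weights `w_manifold` and `w_non_manifold` are hypothetical.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy for a batch of logits and integer labels."""
    z = logits - logits.max(axis=1, keepdims=True)            # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def defense_loss(logits_clean, logits_manifold, logits_non_manifold, labels,
                 w_manifold=1.0, w_non_manifold=1.0):
    """Assumed combination: clean term plus weighted adversarial terms.
    All three inputs share the same true labels."""
    return (cross_entropy(logits_clean, labels)
            + w_manifold * cross_entropy(logits_manifold, labels)
            + w_non_manifold * cross_entropy(logits_non_manifold, labels))

# Toy batch: 2 samples, 3 behavior classes, all logits zero (uniform predictions).
labels = np.array([0, 2])
zeros = np.zeros((2, 3))
loss = defense_loss(zeros, zeros, zeros, labels)
```

With uniform predictions each term equals ln 3, so the toy loss is 3 ln 3. In a real training loop the parameters of the recognition model would then be adjusted to reduce this value, for example by gradient descent.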

According to another aspect of the present disclosure, there is provided an apparatus for generating an adversarial sample, comprising: an acquisition module configured to obtain a first human skeleton formed by connecting a plurality of joint nodes, together with skeleton data corresponding to the first human skeleton, wherein the skeleton data comprise a first rectangular coordinate representation of the first human skeleton in a preset rectangular coordinate system; a first determination module configured to determine a rectangular coordinate value of a root node among the plurality of joint nodes based on the first rectangular coordinate representation; a construction module configured to construct a spherical coordinate system based on the first human skeleton; a second determination module configured to determine, in the spherical coordinate system, a spherical coordinate value for each of the remaining joint nodes other than the root node; and a third determination module configured to apply a perturbation value to an angle of a bone in the first human skeleton, and to determine an adversarial sample based on the applied perturbation value, the spherical coordinate value of each of the remaining joint nodes in the spherical coordinate system, and the rectangular coordinate value of the root node.

According to another aspect of the present disclosure, there is provided an adversarial defense training apparatus for a skeleton behavior recognition model, comprising: a first acquisition module configured to acquire, from a data set, sample data representing a human skeleton, manifold adversarial sample data corresponding to the sample data, and non-manifold adversarial sample data located outside the manifold corresponding to the data set, wherein the manifold adversarial sample data is obtained by the foregoing method of generating an adversarial sample; a labeling module configured to label the sample data, the manifold adversarial sample data, and the non-manifold adversarial sample data with the true classification value of the sample data, wherein the true classification value represents the behavior class corresponding to the human skeleton; a second acquisition module configured to input the sample data into the skeleton behavior recognition model and acquire a first predicted classification value of the sample data; a third acquisition module configured to input the manifold adversarial sample data into the skeleton behavior recognition model and acquire a second predicted classification value of the manifold adversarial sample data; a fourth acquisition module configured to input the non-manifold adversarial sample data into the skeleton behavior recognition model and acquire a third predicted classification value of the non-manifold adversarial sample data; a calculation module configured to calculate a loss value based on the true classification value, the first predicted classification value, the second predicted classification value, and the third predicted classification value; and an adjustment module configured to adjust parameters of the skeleton behavior recognition model based on the loss value.

According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the aforementioned method.

According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the aforementioned method.

According to another aspect of the disclosure, a computer program product is provided, comprising a computer program, wherein the computer program realizes the aforementioned method when executed by a processor.

According to one or more embodiments of the present disclosure, a method of generating an adversarial sample is provided in which a spherical coordinate system is established based on the human skeleton so that the constraints are decoupled into individual angular coordinates. Perturbations can thus be added only in the angular directions, while bone length is kept constant by keeping the radius of the spherical coordinates constant. The constraints of human motion are thereby satisfied, and an adversarial sample that simulates an adversarial skeleton action is generated for attacking a behavior recognition model.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments and, together with the description, serve to explain exemplary implementations of the embodiments. The illustrated embodiments are for purposes of illustration only and do not limit the scope of the claims. Throughout the drawings, identical reference numbers designate similar, but not necessarily identical, elements.

FIG. 1 illustrates a schematic diagram of an exemplary system in which various methods described herein may be implemented, according to an embodiment of the present disclosure;

FIG. 2 shows a flow diagram of a method of generating an adversarial sample according to an embodiment of the present disclosure;

FIG. 3 shows a schematic view of a human skeleton according to an embodiment of the present disclosure;

FIG. 4 shows a flow diagram of a method of determining an adversarial sample according to an embodiment of the present disclosure;

FIG. 5 shows a flow diagram of an adversarial defense training method for a skeleton behavior recognition model according to an embodiment of the present disclosure;

FIG. 6 shows a block diagram of an apparatus for generating an adversarial sample according to an embodiment of the present disclosure;

FIG. 7 shows a block diagram of a third determination module according to an embodiment of the present disclosure;

FIG. 8 shows a block diagram of an adversarial defense training apparatus for a skeleton behavior recognition model according to an embodiment of the present disclosure; and

FIG. 9 illustrates a block diagram of an exemplary electronic device that can be used to implement embodiments of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

In the present disclosure, unless otherwise specified, the use of the terms "first", "second", and the like to describe various elements is not intended to limit the positional relationship, the temporal relationship, or the importance relationship of the elements, and such terms are used only to distinguish one element from another. In some examples, a first element and a second element may refer to the same instance of the element, and in some cases, based on the context, they may also refer to different instances.

The terminology used in the description of the various described examples in this disclosure is for the purpose of describing particular examples only and is not intended to be limiting. Unless the context clearly indicates otherwise, if the number of elements is not specifically limited, the elements may be one or more. Furthermore, the term "and/or" as used in this disclosure is intended to encompass any and all possible combinations of the listed items.

In the related art, attack methods against behavior recognition models cannot strictly constrain bone length and cannot properly incorporate joint-angle constraints into the attack, so the generated adversarial samples do not conform to human kinematics.

To solve the above problems, the present disclosure provides a method of generating an adversarial sample for adversarial defense training of a behavior recognition model. A spherical coordinate system is established based on the human skeleton so that the constraints are decoupled into individual angular coordinates. Perturbations can thus be added only in the angular directions, while bone length is kept constant by keeping the radius of the spherical coordinates constant. The constraints of human motion are thereby satisfied, and an adversarial sample that simulates an adversarial skeleton action is generated for attacking the behavior recognition model.

Embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.

FIG. 1 illustrates a schematic diagram of an exemplary system 100 in which various methods and apparatus described herein may be implemented in accordance with embodiments of the present disclosure. Referring to FIG. 1, the system 100 includes one or more client devices 101, 102, 103, 104, 105, and 106, a server 120, and one or more communication networks 110 coupling the one or more client devices to the server 120.

Client devices 101, 102, 103, 104, 105, and 106 may be configured to execute one or more applications.

In an embodiment of the disclosure, the server 120 may run one or more services or software applications that enable any of the foregoing methods to be performed.

In some embodiments, the server 120 may also provide other services or software applications that may include non-virtual environments and virtual environments. In certain embodiments, these services may be provided as web-based services or cloud services, for example, provided to users of client devices 101, 102, 103, 104, 105, and/or 106 under a software as a service (SaaS) model.

In the configuration shown in FIG. 1, server 120 may include one or more components that implement the functions performed by server 120. These components may include software components, hardware components, or a combination thereof, which may be executed by one or more processors. A user operating a client device 101, 102, 103, 104, 105, and/or 106 may, in turn, utilize one or more client applications to interact with the server 120 to take advantage of the services provided by these components. It should be understood that a variety of different system configurations are possible, which may differ from system 100. Accordingly, FIG. 1 is one example of a system for implementing the various methods described herein and is not intended to be limiting.

A user may perform any of the foregoing methods using client devices 101, 102, 103, 104, 105, and/or 106. The client device may provide an interface that enables a user of the client device to interact with the client device. The client device may also output information to the user via the interface. Although FIG. 1 depicts only six client devices, those skilled in the art will appreciate that any number of client devices may be supported by the present disclosure.

Client devices 101, 102, 103, 104, 105, and/or 106 may include various types of computer devices, such as portable handheld devices, general purpose computers (such as personal computers and laptops), workstation computers, wearable devices, smart screen devices, self-service terminal devices, service robots, gaming systems, thin clients, various messaging devices, sensors or other sensing devices, and so forth. These computer devices may run various types and versions of software applications and operating systems, such as MICROSOFT Windows, APPLE iOS, UNIX-like operating systems, Linux, or Linux-like operating systems (e.g., GOOGLE Chrome OS); or include various mobile operating systems such as MICROSOFT Windows Mobile OS, iOS, Windows Phone, Android. Portable handheld devices may include cellular telephones, smart phones, tablets, Personal Digital Assistants (PDAs), and the like. Wearable devices may include head-mounted displays (such as smart glasses) and other devices. The gaming system may include a variety of handheld gaming devices, internet-enabled gaming devices, and the like. The client device is capable of executing a variety of different applications, such as various Internet-related applications, communication applications (e.g., email applications), Short Message Service (SMS) applications, and may use a variety of communication protocols.

Network 110 may be any type of network known to those skilled in the art that may support data communications using any of a variety of available protocols, including but not limited to TCP/IP, SNA, IPX, etc. By way of example only, one or more networks 110 may be a Local Area Network (LAN), an ethernet-based network, a token ring, a Wide Area Network (WAN), the internet, a virtual network, a Virtual Private Network (VPN), an intranet, an extranet, a Public Switched Telephone Network (PSTN), an infrared network, a wireless network (e.g., bluetooth, WIFI), and/or any combination of these and/or other networks.

The server 120 may include one or more general purpose computers, special purpose server computers (e.g., PC (personal computer) servers, UNIX servers, mid-end servers), blade servers, mainframe computers, server clusters, or any other suitable arrangement and/or combination. The server 120 may include one or more virtual machines running a virtual operating system, or other computing architecture involving virtualization (e.g., one or more flexible pools of logical storage that may be virtualized to maintain virtual storage for the server). In various embodiments, the server 120 may run one or more services or software applications that provide the functionality described below.

The computing units in server 120 may run one or more operating systems including any of the operating systems described above, as well as any commercially available server operating systems. The server 120 may also run any of a variety of additional server applications and/or middle tier applications, including HTTP servers, FTP servers, CGI servers, JAVA servers, database servers, and the like.

In some implementations, the server 120 may include one or more applications to analyze and consolidate data feeds and/or event updates received from users of the client devices 101, 102, 103, 104, 105, and 106. Server 120 may also include one or more applications to display data feeds and/or real-time events via one or more display devices of client devices 101, 102, 103, 104, 105, and 106.

In some embodiments, the server 120 may be a server of a distributed system, or a server incorporating a blockchain. The server 120 may also be a cloud server, or a smart cloud computing server or a smart cloud host with artificial intelligence technology. A cloud server is a host product in a cloud computing service system that addresses the high management difficulty and weak service scalability of traditional physical hosts and Virtual Private Server (VPS) services.

The system 100 may also include one or more databases 130. In some embodiments, these databases may be used to store data and other information. For example, one or more of the databases 130 may be used to store information such as audio files and video files. The database 130 may reside in various locations. For example, the database used by the server 120 may be local to the server 120, or may be remote from the server 120 and may communicate with the server 120 via a network-based or dedicated connection. The database 130 may be of different types. In certain embodiments, the database used by the server 120 may be, for example, a relational database. One or more of these databases may store, update, and retrieve data in response to commands.

In some embodiments, one or more of the databases 130 may also be used by applications to store application data. The databases used by the application may be different types of databases, such as key-value stores, object stores, or regular stores supported by a file system.

The system 100 of fig. 1 may be configured and operated in various ways to enable application of the various methods and apparatus described in accordance with the present disclosure.

FIG. 2 shows a flow diagram of a method of generating an adversarial sample according to an embodiment of the present disclosure. As shown in FIG. 2, a method 200 of generating an adversarial sample includes: step S201, obtaining a first human skeleton formed by connecting a plurality of joint nodes, together with skeleton data corresponding to the first human skeleton, wherein the skeleton data comprise a first rectangular coordinate representation of the first human skeleton in a preset rectangular coordinate system; step S202, determining a rectangular coordinate value of a root node among the plurality of joint nodes based on the first rectangular coordinate representation; step S203, constructing a spherical coordinate system based on the first human skeleton; step S204, determining, in the spherical coordinate system, a spherical coordinate value for each of the remaining joint nodes other than the root node; and step S205, applying a perturbation value to an angle of a bone in the first human skeleton, and determining an adversarial sample based on the applied perturbation value, the spherical coordinate value of each of the remaining joint nodes in the spherical coordinate system, and the rectangular coordinate value of the root node.

A method of generating an adversarial sample for adversarial defense training of a behavior recognition model is thus provided. A spherical coordinate system is established based on the human skeleton so that the constraints are decoupled into individual angular coordinates. Perturbations can then be added to the human skeleton data only in the angular directions, and the length of each bone in the human skeleton is kept unchanged by keeping the radius of the spherical coordinates unchanged. The constraints of human motion are thereby satisfied, and an adversarial sample that simulates an adversarial skeleton action is generated for attacking the behavior recognition model. Performing adversarial defense training on the behavior recognition model with the resulting adversarial samples helps improve the recognition accuracy and adversarial robustness of the model.
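That a fixed radius preserves bone length can be checked directly. The short derivation below assumes a convention the text does not state: θ is the polar angle measured from the z-axis, φ is the azimuth in the x-y plane, r is the radius, and p and c (hypothetical symbols) denote a joint node and the adjacent joint node whose spherical coordinates are measured from it.

```latex
\begin{aligned}
\mathbf{c} - \mathbf{p} &= r\,(\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta),\\
\lVert \mathbf{c} - \mathbf{p} \rVert^{2}
  &= r^{2}\left(\sin^{2}\theta\cos^{2}\phi + \sin^{2}\theta\sin^{2}\phi + \cos^{2}\theta\right)\\
  &= r^{2}\left(\sin^{2}\theta + \cos^{2}\theta\right) = r^{2}.
\end{aligned}
```

The length of the bone therefore equals r for every choice of θ and φ, so any perturbation applied to the angles alone leaves it unchanged.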

According to some embodiments, the root node is the joint node at the base of the spine in the human skeleton. The human skeleton is then obtained by connecting the joint nodes along the trunk direction and the head direction of the human body. The joint node at the base of the spine is usually located near the center of gravity of the human body, and using it as the root node and reference point when recognizing human behaviors and actions yields a better recognition result.

FIG. 3 shows a schematic view of a human skeleton according to an embodiment of the present disclosure. As shown in FIG. 3, the human skeleton has 25 joint nodes, numbered 1 to 25, and each line segment connecting two joint nodes represents the bone between those two joints. Joint node 1 is located at the base of the spine and is the root node of the human skeleton.

For convenience of description and understanding, the technical solution of the present disclosure is described below using the human skeleton in FIG. 3 as an example. It should be noted that the technical solution of the present disclosure is not limited to this specific human skeleton model with 25 joint nodes and 24 bones, and is also applicable to human skeleton models with other numbers of joints and bones and with other layouts.

According to some embodiments, step S203 comprises: constructing, based on the remaining joint nodes in the first human skeleton, spherical coordinate systems in one-to-one correspondence with the remaining joint nodes, which together form the spherical coordinate system, wherein the spherical coordinate system corresponding to each of the remaining joint nodes takes as its coordinate origin a first joint node adjacent to that joint node, the first joint node being closer to the root node than any other joint node adjacent to that joint node.

It will be appreciated that the distance from a joint node to the root node may be measured by the number of bones on the path between them: the more bones on the path, the farther the joint node is from the root node. Therefore, for each of the remaining joint nodes, the adjacent joint node that lies on its path to the root node is used as the coordinate origin of its spherical coordinate system. For example, in FIG. 3, the spherical coordinate system of joint node 2 takes root node 1 as its origin, and the spherical coordinate system of joint node 21 takes joint node 2 as its origin. With this construction, the spherical coordinate value of each remaining joint node represents the distance from that joint node to the preceding joint node (i.e., the length of the bone between the two joints) and the angles between the two joint nodes, so the angles between joint nodes can be changed by changing the angle values in the spherical coordinate value of each remaining joint node.
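The construction above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the disclosed implementation: the five-joint skeleton, the `PARENTS` array, and the function name `cartesian_to_spherical` are hypothetical, and the angle convention (θ as the polar angle from the z-axis, φ as the azimuth) is an assumption. Bones are assumed to have non-zero length.

```python
import numpy as np

# Hypothetical 5-joint skeleton used only for illustration; index 0 is the root.
# PARENTS[i] is the adjacent joint that lies closer to the root (-1 for the root).
PARENTS = np.array([-1, 0, 1, 2, 2])

def cartesian_to_spherical(joints, parents=PARENTS):
    """Return an (N, 3) array of (r, theta, phi) per joint, each measured in a
    frame whose origin is that joint's parent. The root row is left as zeros."""
    joints = np.asarray(joints, dtype=float)
    sph = np.zeros_like(joints)
    for i, p in enumerate(parents):
        if p < 0:                      # the root keeps its rectangular coordinates
            continue
        dx, dy, dz = joints[i] - joints[p]
        r = np.sqrt(dx * dx + dy * dy + dz * dz)   # bone length
        theta = np.arccos(dz / r)                   # polar angle in [0, pi]
        phi = np.arctan2(dy, dx)                    # azimuth in (-pi, pi]
        sph[i] = (r, theta, phi)
    return sph

skeleton = np.array([
    [0.0, 0.0, 0.0],   # root
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 2.0],
    [1.0, 0.0, 2.0],   # branches off joint 2 along +x
    [0.0, 1.0, 2.0],   # branches off joint 2 along +y
])
spherical = cartesian_to_spherical(skeleton)
```

Each non-root row of `spherical` holds a bone length and two angles, so perturbing only the last two columns changes the pose without changing any bone length.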

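Taking FIG. 3 as an example, based on the constructed spherical coordinate systems, the spherical coordinate value of joint node $i$ can be expressed as $(r_i, \theta_i, \phi_i)$, where $i$ is an integer with $25 \geq i > 1$, $r_i$ is the radius (the length of the bone ending at joint node $i$), and $\theta_i$ and $\phi_i$ are the first and second angle variables. When the first rectangular coordinate of root node 1 is expressed as $(x_1, y_1, z_1)$, the first human skeleton shown in FIG. 3 may be represented as $(x_1, y_1, z_1, r_2, \theta_2, \phi_2, \ldots, r_{25}, \theta_{25}, \phi_{25})$.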

FIG. 4 shows a flow diagram of a method of determining an adversarial sample according to an embodiment of the present disclosure. As shown in FIG. 4, step S205 of determining the adversarial sample includes: step S401, iteratively updating the spherical coordinate value of each of the remaining joint nodes a preset number of times, wherein in each iteration a first perturbation value is added to a first angle value corresponding to a first angle variable θ in the current spherical coordinate value of each of the remaining joint nodes, a second perturbation value is added to a second angle value corresponding to a second angle variable φ, and the spherical coordinate representation of a second human skeleton corresponding to the iteration is determined based on the rectangular coordinate value of the root node; step S402, for each iteration, calculating a perceptual loss value and a classification loss value corresponding to the second human skeleton based on the first rectangular coordinate representation and the spherical coordinate representation of the second human skeleton corresponding to the iteration; and step S403, determining as an adversarial sample a second human skeleton whose perceptual loss value is smaller than a first threshold and whose classification loss value is larger than a second threshold.

In this way, the angle values in the spherical coordinate value of each of the remaining joint nodes are updated iteratively to obtain the second human skeleton. The classification loss value ensures that the constructed adversarial sample can deceive the behavior recognition model, and the perceptual loss value ensures that the constructed adversarial sample is visually indistinguishable from the original sample. High-quality adversarial samples with more precisely targeted attack capability are thereby constructed, which helps improve the accuracy and adversarial robustness of the model.
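The iterate, score, and select flow of steps S401 to S403 can be sketched as follows. Everything beyond that flow is an assumption made for illustration: the four-joint chain `PARENTS`, random angular perturbations (the text does not say how the perturbation values are chosen), a random linear classifier `weights` standing in for the behavior recognition model, mean squared joint displacement as the perceptual loss, and the clean sample's classification loss as the second threshold. The angle constraint mapping is omitted here.

```python
import numpy as np

PARENTS = [-1, 0, 1, 2]  # hypothetical 4-joint chain; index 0 is the root

def to_xyz(root, sph):
    """Rectangular joint positions from the root and per-joint (r, theta, phi)."""
    xyz = np.zeros((len(PARENTS), 3))
    xyz[0] = root
    for i in range(1, len(PARENTS)):
        r, th, ph = sph[i]
        xyz[i] = xyz[PARENTS[i]] + r * np.array(
            [np.sin(th) * np.cos(ph), np.sin(th) * np.sin(ph), np.cos(th)])
    return xyz

def cross_entropy(logits, label):
    z = logits - logits.max()                     # stabilised log-softmax
    return float(np.log(np.exp(z).sum()) - z[label])

def search(root, sph, label, weights, n_iter=30, step=0.02, perc_max=0.05, seed=0):
    rng = np.random.default_rng(seed)
    ref = to_xyz(root, sph)                       # first rectangular representation
    cls_min = cross_entropy(weights @ ref.ravel(), label)   # assumed second threshold
    current, kept = sph.copy(), []
    for _ in range(n_iter):
        # S401: perturb only the angle columns; the radius column is untouched.
        current[1:, 1] += rng.normal(scale=step, size=len(PARENTS) - 1)  # theta
        current[1:, 2] += rng.normal(scale=step, size=len(PARENTS) - 1)  # phi
        # S402: score the candidate skeleton in rectangular coordinates.
        xyz = to_xyz(root, current)
        perc = float(np.mean(np.sum((xyz - ref) ** 2, axis=1)))
        cls = cross_entropy(weights @ xyz.ravel(), label)
        # S403: keep candidates that stay close yet raise the classification loss.
        if perc < perc_max and cls > cls_min:
            kept.append({"sph": current.copy(), "perceptual": perc,
                         "classification": cls})
    return kept, cls_min

root = np.array([0.0, 0.0, 1.0])
sph = np.array([[0.0, 0.0, 0.0], [0.5, 0.1, 0.0], [0.4, 0.3, 0.5], [0.3, 1.2, -0.4]])
weights = np.random.default_rng(1).normal(size=(3, 12))   # 3 classes, 4 joints x 3
kept, cls_min = search(root, sph, label=0, weights=weights)
```

Every entry in `kept` satisfies both thresholds and has the same radii as the input, so bone lengths are preserved by construction. A practical attack would likely choose the perturbations from the gradient of the classification loss rather than at random.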

According to some embodiments, step S401 comprises: in each iteration, for each of the remaining joint nodes, adding the first interference value to the first angle value in the current spherical coordinate value of the joint node to obtain an updated first angle value; adding the second interference value to the second angle value in the current spherical coordinate value of the joint node to obtain an updated second angle value; determining an angle constraint range based on the kinematic characteristics of the human body; performing mapping calculation on the updated first angle value and the updated second angle value respectively based on the angle constraint range to obtain an updated spherical coordinate value corresponding to the joint node; and determining the spherical coordinate representation of the second human skeleton corresponding to the iteration based on the updated spherical coordinate value corresponding to each of the remaining joint nodes and the rectangular coordinate value of the root node.

It is understood that, according to the kinematics of the human body, the joint angles of the human body should remain within a certain range, i.e., the range of angles is limited. When the first angle value and the second angle value in the spherical coordinate values of a joint node are updated by adding interference, the updated first angle value and/or the updated second angle value may exceed the angle constraint range, which does not conform to the kinematic characteristics of the human body. The mapping calculation based on the angle constraint range therefore brings the updated angle values back within that range.
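The angle update with constraint mapping might be sketched as follows. The disclosure does not specify the form of the mapping calculation; clipping to a per-angle range is assumed here purely for illustration, and the function name, argument names, and the example ranges are hypothetical.

```python
import numpy as np

def update_angles(theta, phi, d_theta, d_phi, theta_range, phi_range):
    """Add interference to both angles, then map them into the allowed range.

    ASSUMPTION: the "mapping calculation" is modeled as simple clipping;
    the disclosure leaves the actual mapping unspecified.
    """
    theta_new = np.clip(theta + d_theta, theta_range[0], theta_range[1])
    phi_new = np.clip(phi + d_phi, phi_range[0], phi_range[1])
    return theta_new, phi_new

# Hypothetical example: two joints, polar angle limited to [0, pi].
theta = np.array([0.1, 3.0])
phi = np.array([0.0, 0.0])
theta_new, phi_new = update_angles(theta, phi, 0.3, -0.2,
                                   (0.0, np.pi), (-np.pi, np.pi))
```

Because only the two angles are modified, the radius component of each spherical coordinate value, and hence each bone length, is left untouched.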

Illustratively, before the t-th iteration (t being a positive integer), the spherical coordinates of the human skeleton may be expressed as

After the t-th iteration is completed, the spherical coordinates of the second human skeleton corresponding to the iteration can be expressed as

According to some embodiments, step S402 comprises: for each iteration, determining a second rectangular coordinate representation of a second human body skeleton in the preset rectangular coordinate system based on the spherical coordinate representation of the second human body skeleton corresponding to the iteration; and calculating a perception loss value and a classification loss value corresponding to the second human body skeleton based on the first rectangular coordinate representation and the second rectangular coordinate representation of the second human body skeleton.

It will be appreciated that the spherical coordinate representation and the rectangular coordinate representation may be interconverted, i.e. the spherical coordinate representation of the second human skeleton

Can be converted into a corresponding second rectangular coordinate representation

Wherein i isAn integer number. And calculating a perception loss value and a classification loss value corresponding to the second human body skeleton based on the first rectangular coordinate representation and the second rectangular coordinate representation of the second human body skeleton.

According to some embodiments, in each iteration, the value of the perceptual loss of the second human skeleton corresponding to the iteration is based on a perceptual loss function

Calculated, wherein n is an integer,

x represents the first rectangular coordinate representation and x' represents the second rectangular coordinate representation of the second human skeleton.

Many existing image and video processing approaches achieve imperceptibility by computing minimal changes to the image or video frames. However, such approaches are not suitable for motion because they do not take kinematic characteristics into account. Where motion imperceptibility is concerned, the naturalness of the perceived motion is important. The perception loss function above can effectively optimize the kinematic characteristics of the countermeasure skeletal motion sample relative to the original skeletal motion sample. When n is 2, the influence of the position, velocity, and acceleration of the joints on the recognition effect is considered, and the weight assignment α₀ = 0.6, α₂ = 0.4 yields the best effect. For skeletal motion data, even small perturbations can be easily observed.
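The idea of penalizing differences in position, velocity, and acceleration can be illustrated with the sketch below. The exact perception loss function of the disclosure is not reproduced here; this sketch assumes a weighted sum of mean squared differences between the k-th order temporal finite differences of the two motions, with weights α₀ = 0.6 and α₂ = 0.4 taken from the text and α₁ = 0 assumed because no value is given. The function name and the (frames, joints, 3) array layout are hypothetical.

```python
import numpy as np

def perceptual_loss(x, x_adv, alphas):
    """Weighted difference of k-th order temporal derivatives, k = 0..n.

    ASSUMPTION: derivatives are approximated by finite differences along the
    frame axis and compared with a mean squared Euclidean distance.
    x, x_adv: arrays of shape (frames, joints, 3).
    """
    loss = 0.0
    dx, dx_adv = x, x_adv
    for k, alpha in enumerate(alphas):
        if k > 0:
            # k = 1 approximates velocity, k = 2 approximates acceleration.
            dx = np.diff(dx, axis=0)
            dx_adv = np.diff(dx_adv, axis=0)
        loss += alpha * np.mean(np.sum((dx - dx_adv) ** 2, axis=-1))
    return loss

x = np.zeros((5, 2, 3))
same = perceptual_loss(x, x, (0.6, 0.0, 0.4))
shifted = perceptual_loss(x, x + 1.0, (0.6, 0.0, 0.4))
```

A constant offset changes positions but not velocities or accelerations, so in this sketch only the α₀ term contributes to `shifted`.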

According to some embodiments, the method further comprises: in each iteration, for each of the remaining joint nodes, calculating a first partial derivative of the classification loss function with respect to the first angle variable θ of the joint node based on the current spherical coordinate value of the joint node, and determining the first interference value based on the first partial derivative; and calculating a second partial derivative of the classification loss function with respect to the second angle variable φ of the joint node based on the current spherical coordinate value of the joint node, and determining the second interference value based on the second partial derivative.

When the classification loss function is denoted as L_c, for joint node

with its corresponding coordinate origin denoted as (x_oi, y_oi, z_oi), the first partial derivative and the second partial derivative corresponding to the joint node may be respectively expressed as:

The two calculation formulas can be derived from the transformation formulas between rectangular coordinates and spherical coordinates together with the chain rule, and the derivation process is not described in detail in this disclosure. Thus, the first interference value may be determined based on the first partial derivative, and the second interference value may be determined based on the second partial derivative. In one example, a sign function may be applied to the first partial derivative and the result multiplied by a preset interference step to obtain the first interference value; accordingly, a sign function may be applied to the second partial derivative and the result multiplied by the preset interference step to obtain the second interference value. Interference is thereby applied to the joint nodes, and a second human skeleton is obtained.
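A chain-rule computation of this kind can be sketched as follows. This is not the formula of the disclosure: it assumes the convention x = x_o + r·sinθ·cosφ, y = y_o + r·sinθ·sinφ, z = z_o + r·cosθ, it takes the gradient of the classification loss with respect to the joint's own rectangular coordinates as given, and it ignores contributions that pass through descendant joints. The function and argument names are hypothetical.

```python
import numpy as np

def angle_interference(grad_xyz, r, theta, phi, step):
    """Sign-based interference values for one joint node.

    grad_xyz: (dL/dx, dL/dy, dL/dz) of the classification loss at this joint.
    ASSUMPTION: polar angle theta from the z-axis, azimuth phi in the x-y plane.
    """
    gx, gy, gz = grad_xyz
    # Chain rule: dL/dtheta = sum over coordinates of dL/dc * dc/dtheta.
    dL_dtheta = r * (gx * np.cos(theta) * np.cos(phi)
                     + gy * np.cos(theta) * np.sin(phi)
                     - gz * np.sin(theta))
    # z does not depend on phi, so only x and y contribute.
    dL_dphi = r * np.sin(theta) * (-gx * np.sin(phi) + gy * np.cos(phi))
    # Sign of each partial derivative times the preset interference step.
    return step * np.sign(dL_dtheta), step * np.sign(dL_dphi)

d_theta, d_phi = angle_interference((0.0, 1.0, 1.0), r=1.0,
                                    theta=np.pi / 2, phi=0.0, step=0.01)
```

Adding these values to the angles moves each angle in the direction that increases the classification loss, which is the direction an attack would take.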

According to another aspect of the present disclosure, a confrontational defense training method of a skeleton behavior recognition model is provided. As shown in fig. 5, the confrontational defense training method 500 of the skeleton behavior recognition model includes: step S501, obtaining, from a data set, sample data representing a human skeleton, manifold countermeasure sample data corresponding to the sample data, and non-manifold countermeasure sample data located outside a manifold corresponding to the data set, wherein the manifold countermeasure sample data is obtained based on the foregoing method for generating a countermeasure sample; step S502, labeling the sample data, the manifold countermeasure sample data, and the non-manifold countermeasure sample data with the real classification value of the sample data, wherein the real classification value represents the behavior class corresponding to the human skeleton; step S503, inputting the sample data into a skeleton behavior recognition model, and obtaining a first predicted classification value of the sample data; step S504, inputting the manifold countermeasure sample data into the skeleton behavior recognition model, and obtaining a second predicted classification value of the manifold countermeasure sample data; step S505, inputting the non-manifold countermeasure sample data into the skeleton behavior recognition model, and obtaining a third predicted classification value of the non-manifold countermeasure sample data; step S506, calculating a loss value based on the real classification value, the first predicted classification value, the second predicted classification value, and the third predicted classification value; and step S507, adjusting parameters of the skeleton behavior recognition model based on the loss value.

In the related art, it is common to assume that the countermeasure disturbance is distributed in a simple norm sphere space, and characterize the distribution of the countermeasure samples by using a priori defined distributions, such as gaussian distributions or sampling from a specific attack mode, so as to generate the countermeasure samples for the countermeasure training. However, the inventors have found that this assumption will over-simplify the characterization of the distribution of the challenge sample, whereas the distribution of the true challenge sample can be arbitrarily complex. When a perturbation is sampled from a prior using such an assumption, a model trained using conservative prior knowledge (e.g., a gaussian distribution with a small variance) is generally not robust against attack, while using more aggressive prior knowledge (e.g., a gaussian distribution with a large variance) generally compromises the accuracy of the trained model.

According to the confrontational defense training method of the skeleton behavior recognition model, manifold countermeasure sample data, whose distribution is the same as or similar to that of the sample data of the human skeleton, and non-manifold countermeasure sample data, which lies outside the manifold corresponding to the data set, are introduced and used together with the sample data of the human skeleton to train the skeleton behavior recognition model. The manifold countermeasure sample data improves both the accuracy and the countermeasure robustness of the model, while the non-manifold countermeasure sample data is more aggressive and helps the skeleton behavior recognition model to resist stronger attacks.

According to some embodiments, the loss value comprises a first sub-loss value, a second sub-loss value, and a third sub-loss value, and step S506 further comprises: calculating the first sub-loss value based on the real classification value and the first predicted classification value; calculating the second sub-loss value based on the real classification value and the second predicted classification value; calculating the third sub-loss value based on the real classification value and the third predicted classification value; and calculating the loss value based on the first sub-loss value and the corresponding first weight, the second sub-loss value and the corresponding second weight, and the third sub-loss value and the corresponding third weight.

It can be understood that the three types of sample data, namely the sample data, the manifold countermeasure sample data, and the non-manifold countermeasure sample data, can be used for training with the corresponding standard classification loss function, manifold countermeasure loss function, and non-manifold countermeasure loss function respectively, after which the loss value is calculated and the parameters are adjusted. The finally trained model has higher accuracy and robustness and can resist stronger attacks.
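The weighted combination of the three sub-loss values can be sketched as follows. Cross-entropy is assumed for all three sub-losses only for illustration, since the disclosure does not define the individual loss functions; the function names, the probability-vector inputs, and the example weights are hypothetical.

```python
import numpy as np

def cross_entropy(probs, label):
    """Negative log-probability assigned to the real class (assumed sub-loss)."""
    return -np.log(probs[label])

def total_loss(p_clean, p_manifold, p_off_manifold, label, weights):
    """Weighted sum of the first, second, and third sub-loss values."""
    w1, w2, w3 = weights
    first = cross_entropy(p_clean, label)          # sample data
    second = cross_entropy(p_manifold, label)      # manifold countermeasure data
    third = cross_entropy(p_off_manifold, label)   # non-manifold countermeasure data
    return w1 * first + w2 * second + w3 * third

p = np.array([0.5, 0.5])
loss = total_loss(p, p, p, label=0, weights=(1.0, 0.5, 0.5))
```

All three predictions are scored against the same real classification value, which matches the labeling in step S502.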

According to another aspect of the present disclosure, an apparatus for generating a countermeasure sample is provided. As shown in fig. 6, the apparatus 600 for generating a countermeasure sample includes: an obtaining module 601 configured to obtain a first human skeleton formed by connecting a plurality of joint nodes and skeleton data corresponding to the first human skeleton, where the skeleton data includes a first rectangular coordinate representation of the first human skeleton in a preset rectangular coordinate system; a first determining module 602 configured to determine a rectangular coordinate value of a root node of the plurality of joint nodes based on the first rectangular coordinate representation; a construction module 603 configured to construct a spherical coordinate system based on the first human skeleton; a second determining module 604 configured to determine a spherical coordinate value, in the spherical coordinate system, corresponding to each of the remaining joint nodes of the plurality of joint nodes other than the root node; and a third determining module 605 configured to apply an interference value to an angle of a bone in the first human skeleton, and determine a countermeasure sample based on the applied interference value, the spherical coordinate value corresponding to each of the remaining joint nodes in the spherical coordinate system, and the rectangular coordinate value of the root node.

Therefore, an apparatus for generating countermeasure samples for countermeasure training of a behavior recognition model is provided. The construction module 603 constructs a spherical coordinate system based on the human skeleton so as to decouple the constraints into individual angle coordinates, so that the third determining module 605 can add interference to the human skeleton data only in the angular direction and keep the bone lengths in the human skeleton unchanged by keeping the radius of the spherical coordinates unchanged. The constraints of human motion characteristics are thereby satisfied, and countermeasure samples that simulate adversarial skeletal actions are generated for attacking the behavior recognition model. Performing countermeasure defense training on the behavior recognition model based on the resulting countermeasure samples helps to improve the recognition accuracy and countermeasure robustness of the behavior recognition model.

The operations of modules 601-605 of the apparatus 600 for generating a countermeasure sample are similar to the operations of steps S201-S205 described above, and are not repeated herein.

According to some embodiments, the root node is the joint node at the bottommost part of the spine in the human skeleton. Thus, the human skeleton is obtained by connecting the joint nodes from this joint node along the trunk direction and the head direction of the human body. The joint node at the bottommost part of the spine is usually located near the center of gravity of the human body, and recognizing the behaviors and actions of the human body with this joint node as the root node and reference point yields a better recognition effect.

According to some embodiments, the construction module 603 is further configured to: construct, based on the remaining joint nodes in the first human skeleton, spherical coordinate systems in one-to-one correspondence with the remaining joint nodes to form the spherical coordinate system, wherein the spherical coordinate system corresponding to each of the remaining joint nodes takes a first joint node adjacent to the joint node as its coordinate origin, and the first joint node is closer to the root node than the other joint nodes adjacent to the joint node.

It will be appreciated that the distance to the root node may be determined by the number of bones on the path from the joint node to the root node: the more bones the path contains, the farther the joint node is from the root node. Therefore, for each of the remaining joint nodes, the joint node immediately preceding it on the path from that joint node to the root node is used as the coordinate origin, and a spherical coordinate system corresponding to that joint node is established. Thus, based on the spherical coordinate system constructed by the construction module 603, the spherical coordinates of each of the remaining joint nodes represent the distance from that joint node to its preceding joint node (i.e., the bone length between the two joints) and the angles between the two joint nodes, and the third determining module 605 may therefore change the angles between joint nodes by changing the angle values in the spherical coordinate value of each of the remaining joint nodes.
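The construction of the per-joint spherical coordinates can be sketched as follows. The sketch assumes that θ is the polar angle measured from the z-axis and φ is the azimuth in the x-y plane, which the disclosure does not specify, and that all bone lengths are non-zero; the function name and the `parents` array encoding the origin joint of each joint node are hypothetical.

```python
import numpy as np

def rectangular_to_spherical(xyz, parents):
    """Spherical coordinates of each joint relative to its origin joint.

    parents[i] is the adjacent joint closer to the root node (-1 for the root).
    Returns r (bone lengths), theta, and phi; root entries are left at zero.
    """
    n = len(parents)
    r = np.zeros(n)
    theta = np.zeros(n)
    phi = np.zeros(n)
    for i, p in enumerate(parents):
        if p < 0:
            continue  # the root node keeps its rectangular coordinates
        d = xyz[i] - xyz[p]
        r[i] = np.linalg.norm(d)          # bone length between the two joints
        theta[i] = np.arccos(d[2] / r[i])  # polar angle from the z-axis
        phi[i] = np.arctan2(d[1], d[0])    # azimuth in the x-y plane
    return r, theta, phi

# Hypothetical three-joint chain: root -> joint 1 -> joint 2.
xyz = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [2.0, 0.0, 1.0]])
r, theta, phi = rectangular_to_spherical(xyz, parents=[-1, 0, 1])
```

Since `r` holds the bone lengths, perturbing only `theta` and `phi` leaves every bone length unchanged.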

Fig. 7 shows a block diagram of a third determining module according to an embodiment of the present disclosure. As shown in fig. 7, the third determining module 605 includes: an iteration unit 701 configured to iteratively update the spherical coordinate value of each of the remaining joint nodes a preset number of times, where in each iteration, a first interference value is added to a first angle value corresponding to a first angle variable θ and a second interference value is added to a second angle value corresponding to a second angle variable φ in the current spherical coordinate value of each of the remaining joint nodes, and a spherical coordinate representation of a second human skeleton corresponding to the iteration is determined based on the rectangular coordinate value of the root node; a calculating unit 702 configured to calculate, for each iteration, a perception loss value and a classification loss value corresponding to the second human skeleton based on the rectangular coordinate value of the root node and the spherical coordinate representation of the second human skeleton corresponding to the iteration; and a determining unit 703 configured to determine a second human skeleton whose perception loss value is smaller than a first threshold and whose classification loss value is larger than a second threshold as a countermeasure sample.

Therefore, the spherical coordinate value of each joint node is updated by the iteration unit 701 iterating on the angle values in the spherical coordinate value of each of the remaining joint nodes, and a second human skeleton is obtained. The determining unit 703 uses the classification loss value to ensure that the constructed countermeasure sample can deceive the behavior recognition model, and uses the perception loss value to ensure that the constructed countermeasure sample is visually indistinguishable from the original sample, so as to construct a high-quality countermeasure sample with more precise attack capability, which helps to improve model accuracy and countermeasure robustness.

According to some embodiments, the iteration unit 701 comprises: a first adding subunit configured to, in each iteration, for each of the remaining joint nodes, add the first interference value to the first angle value in the current spherical coordinate value of the joint node to obtain an updated first angle value; a second adding subunit configured to add the second interference value to the second angle value in the current spherical coordinate value of the joint node to obtain an updated second angle value; a second determining subunit configured to determine an angle constraint range based on the kinematic characteristics of the human body; a second calculating subunit configured to perform mapping calculation on the updated first angle value and the updated second angle value respectively based on the angle constraint range to obtain an updated spherical coordinate value corresponding to the joint node; and a third determining subunit configured to determine the spherical coordinate representation of the second human skeleton corresponding to the iteration based on the updated spherical coordinate value corresponding to each of the remaining joint nodes and the rectangular coordinate value of the root node.

It is understood that, according to the kinematics of the human body, the joint angles of the human body should remain within a certain range, i.e., the range of angles is limited. When the iteration unit 701 updates the first angle value and the second angle value in the spherical coordinate values of a joint node by adding interference, the updated first angle value and/or the updated second angle value may exceed the angle constraint range, which does not conform to the kinematic characteristics of the human body.

According to some embodiments, the calculation unit 702 comprises: a first determining subunit configured to determine, for each iteration, a second rectangular coordinate representation of a second human body skeleton in the preset rectangular coordinate system based on the rectangular coordinate value of the root node and a spherical coordinate representation of the second human body skeleton corresponding to the iteration; and a first calculating subunit, configured to calculate a perception loss value and a classification loss value corresponding to the second human skeleton based on the first rectangular coordinate representation and a second rectangular coordinate representation of the second human skeleton.

Wherein i is an integer. The first calculating subunit calculates a perception loss value and a classification loss value corresponding to the second human skeleton based on the first rectangular coordinate representation and the second rectangular coordinate representation of the second human skeleton.

According to some embodiments, the calculation unit 702 is further configured to, in each iteration, based on a perceptual loss function

Calculating a perception loss value of a second human skeleton corresponding to the iteration, wherein n is an integer,

Many existing image and video processing approaches achieve imperceptibility by computing minimal changes to the image or video frames. However, such approaches are not suitable for motion because they do not take kinematic characteristics into account. Where motion imperceptibility is concerned, the naturalness of the perceived motion is important. The perception loss function above can effectively optimize the kinematic characteristics of the countermeasure skeletal motion sample relative to the original skeletal motion sample. When n is 2, the influence of the position, velocity, and acceleration of the joints on the recognition effect is considered, and the weight assignment α₀ = 0.6, α₂ = 0.4 yields the best effect. For skeletal motion data, even small perturbations can be easily observed.

According to some embodiments, the apparatus for generating a countermeasure sample further comprises: a fourth determining module configured to, in each iteration, for each of the remaining joint nodes, calculate a first partial derivative of the classification loss function with respect to the first angle variable θ of the joint node based on the current spherical coordinate value of the joint node, and determine the first interference value based on the first partial derivative; and a fifth determining module configured to calculate a second partial derivative of the classification loss function with respect to the second angle variable φ of the joint node based on the current spherical coordinate value of the joint node, and determine the second interference value based on the second partial derivative.

When the classification loss function is denoted as L_c, for joint node

with its corresponding coordinate origin denoted as (x_oi, y_oi, z_oi), the first partial derivative and the second partial derivative corresponding to the joint node may be respectively expressed as:

The two calculation formulas can be derived from the transformation formulas between rectangular coordinates and spherical coordinates together with the chain rule, and the derivation process is not described in detail in this disclosure. Thus, the fourth determining module may determine the first interference value based on the first partial derivative, and the fifth determining module may determine the second interference value based on the second partial derivative. In one example, the fourth determining module may apply a sign function to the first partial derivative and multiply the result by a preset interference step to obtain the first interference value; accordingly, the fifth determining module may apply a sign function to the second partial derivative and multiply the result by the preset interference step to obtain the second interference value. Interference is thereby applied to the joint nodes, and a second human skeleton is obtained.

According to another aspect of the present disclosure, a confrontational defense training apparatus of a skeleton behavior recognition model is provided. As shown in fig. 8, the confrontational defense training apparatus 800 of the skeleton behavior recognition model includes: a first obtaining module 801 configured to obtain, from a data set, sample data representing a human skeleton, manifold countermeasure sample data corresponding to the sample data, and non-manifold countermeasure sample data located outside a manifold corresponding to the data set, where the manifold countermeasure sample data is obtained based on the foregoing method for generating a countermeasure sample; a labeling module 802 configured to label the sample data, the manifold countermeasure sample data, and the non-manifold countermeasure sample data with the real classification value of the sample data, wherein the real classification value represents the behavior class corresponding to the human skeleton; a second obtaining module 803 configured to input the sample data into a skeleton behavior recognition model and obtain a first predicted classification value of the sample data; a third obtaining module 804 configured to input the manifold countermeasure sample data into the skeleton behavior recognition model and obtain a second predicted classification value of the manifold countermeasure sample data; a fourth obtaining module 805 configured to input the non-manifold countermeasure sample data into the skeleton behavior recognition model and obtain a third predicted classification value of the non-manifold countermeasure sample data; a calculation module 806 configured to calculate a loss value based on the real classification value, the first predicted classification value, the second predicted classification value, and the third predicted classification value; and an adjusting module 807 configured to adjust parameters of the skeleton behavior recognition model based on the loss value.

The confrontational defense training apparatus of the skeleton behavior recognition model introduces manifold countermeasure sample data, whose distribution is the same as or similar to that of the sample data of the human skeleton, and non-manifold countermeasure sample data, which lies outside the manifold corresponding to the data set, and uses them together with the sample data of the human skeleton to train the skeleton behavior recognition model. The manifold countermeasure sample data improves both the accuracy and the countermeasure robustness of the model, while the non-manifold countermeasure sample data is more aggressive and helps the skeleton behavior recognition model to resist stronger attacks.

According to some embodiments, the loss value comprises a first sub-loss value, a second sub-loss value, and a third sub-loss value, and the calculation module 806 comprises: a first calculation unit configured to calculate the first sub-loss value based on the real classification value and the first predicted classification value; a second calculation unit configured to calculate the second sub-loss value based on the real classification value and the second predicted classification value; a third calculation unit configured to calculate the third sub-loss value based on the real classification value and the third predicted classification value; and a fourth calculation unit configured to calculate the loss value based on the first sub-loss value and the corresponding first weight, the second sub-loss value and the corresponding second weight, and the third sub-loss value and the corresponding third weight.

It is understood that, for the three types of sample data, namely the sample data, the manifold countermeasure sample data, and the non-manifold countermeasure sample data, the calculation module 806 may use the corresponding standard classification loss function, manifold countermeasure loss function, and non-manifold countermeasure loss function respectively for training, after which the loss value is calculated and the parameters are adjusted. The finally trained model has higher accuracy and robustness and can resist stronger attacks.

As shown in fig. 9, the electronic device 900 includes a computing unit 901, which can perform various appropriate actions and processes in accordance with a computer program stored in a Read Only Memory (ROM) 902 or a computer program loaded from a storage unit 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data required for the operation of the electronic device 900 can also be stored. The computing unit 901, the ROM 902, and the RAM 903 are connected to each other via a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.

A number of components in the electronic device 900 are connected to the I/O interface 905, including: an input unit 906, an output unit 907, a storage unit 908, and a communication unit 909. The input unit 906 may be any type of device capable of inputting information to the electronic device 900. The input unit 906 may receive input numeric or character information and generate key signal inputs related to user settings and/or function control of the electronic device, and may include, but is not limited to, a mouse, a keyboard, a touch screen, a track pad, a track ball, a joystick, a microphone, and/or a remote control. The output unit 907 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, a video/audio output terminal, a vibrator, and/or a printer. The storage unit 908 may include, but is not limited to, a magnetic disk and an optical disk. The communication unit 909 allows the electronic device 900 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks, and may include, but is not limited to, a modem, a network card, an infrared communication device, a wireless communication transceiver, and/or a chipset, such as Bluetooth™ devices, 802.11 devices, WiFi devices, WiMax devices, cellular communication devices, and/or the like.

The computing unit 901 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 901 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 901 performs the respective methods and processes described above, such as the method of generating a countermeasure sample and the confrontational defense training method of the skeleton behavior recognition model. For example, in some embodiments, any of the foregoing methods may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 908. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 900 via the ROM 902 and/or the communication unit 909. When the computer program is loaded into the RAM 903 and executed by the computing unit 901, one or more steps of any of the methods described above may be performed. Alternatively, in other embodiments, the computing unit 901 may be configured to perform any of the methods described above in any other suitable manner (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.

It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be performed in parallel, sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.

Although embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it is to be understood that the above-described methods, systems and apparatus are merely exemplary embodiments or examples, and that the scope of the present invention is not limited by these embodiments or examples, but only by the claims as issued and their equivalents. Various elements in the embodiments or examples may be omitted or may be replaced with equivalents thereof. Further, the steps may be performed in an order different from that described in the present disclosure. Further, various elements in the embodiments or examples may be combined in various ways. Importantly, as technology evolves, many of the elements described herein may be replaced with equivalent elements that appear after the present disclosure.

Claims

1. A method of generating a countermeasure sample, comprising:

obtaining a first human body skeleton formed by connecting a plurality of joint nodes and skeleton data corresponding to the first human body skeleton, wherein the skeleton data comprise a first rectangular coordinate representation of the first human body skeleton in a preset rectangular coordinate system;

determining a rectangular coordinate value of a root node of the plurality of joint nodes based on the first rectangular coordinate representation;

constructing a spherical coordinate system based on the first human body skeleton;

determining a corresponding spherical coordinate value of each joint node in the rest joint nodes except the root node in the plurality of joint nodes in the spherical coordinate system; and

applying an interference value to an angle of a bone in the first human skeleton, and determining a countermeasure sample based on the applied interference value, a spherical coordinate value corresponding to each of the remaining joint nodes in the spherical coordinate system, and a rectangular coordinate value of the root node.
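As a non-limiting illustration, the steps of claim 1 may be sketched as follows. The sketch assumes that each non-root joint is expressed in a spherical coordinate system whose origin is its parent joint and whose axes are parallel to the preset rectangular axes; the function names, the joint names, and the use of a single pair of interference values for all joints are hypothetical and are not taken from the disclosure.

```python
import math

def to_spherical(joint, origin):
    """Return (r, theta, phi) of `joint` measured from `origin`."""
    dx, dy, dz = (joint[i] - origin[i] for i in range(3))
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    # theta: polar angle from the +z axis; phi: azimuth in the x-y plane.
    theta = math.acos(max(-1.0, min(1.0, dz / r))) if r > 0 else 0.0
    phi = math.atan2(dy, dx)
    return r, theta, phi

def to_cartesian(r, theta, phi, origin):
    """Inverse of to_spherical: rectangular coordinates of the joint."""
    return (origin[0] + r * math.sin(theta) * math.cos(phi),
            origin[1] + r * math.sin(theta) * math.sin(phi),
            origin[2] + r * math.cos(theta))

def perturb_skeleton(joints, parents, root, d_theta, d_phi):
    """Apply interference values to the bone angles and rebuild the skeleton.

    joints:  dict joint name -> (x, y, z) in the rectangular system
    parents: dict joint name -> parent joint name, ordered from the root outward
    """
    spherical = {j: to_spherical(joints[j], joints[p]) for j, p in parents.items()}
    result = {root: joints[root]}  # the root keeps its rectangular coordinates
    for j, p in parents.items():
        r, theta, phi = spherical[j]
        # Only the angles change, so every bone keeps its original length r.
        result[j] = to_cartesian(r, theta + d_theta, phi + d_phi, result[p])
    return result

joints = {"hip": (0.0, 0.0, 0.0), "spine": (0.0, 0.0, 1.0), "head": (0.0, 0.5, 1.5)}
parents = {"spine": "hip", "head": "spine"}
sample = perturb_skeleton(joints, parents, "hip", 0.05, -0.03)
```

Because the interference is applied to angles rather than directly to joint positions, the bone lengths of the resulting skeleton equal those of the first human skeleton.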

2. The method according to claim 1, wherein the applying an interference value to an angle of a bone in the first human skeleton, and the determining a countermeasure sample based on the applied interference value, the spherical coordinate value corresponding to each of the remaining joint nodes in the spherical coordinate system, and the rectangular coordinate value of the root node comprise:

iteratively updating the spherical coordinate value of each of the remaining joint nodes a preset number of times, wherein in each iteration, a first interference value is added to a first angle value corresponding to a first angle variable θ in the current spherical coordinate value of each of the remaining joint nodes, a second interference value is added to a second angle value corresponding to a second angle variable φ, and the spherical coordinate representation of a second human body skeleton corresponding to the iteration is determined based on the rectangular coordinate value of the root node;

for each iteration, calculating a perception loss value and a classification loss value corresponding to the second human body skeleton based on the first rectangular coordinate representation and the spherical coordinate representation of the second human body skeleton corresponding to the iteration; and

determining a second human skeleton whose perception loss value is smaller than a first threshold value and whose classification loss value is larger than a second threshold value as a countermeasure sample.
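As a non-limiting illustration, the iteration and threshold test of claim 2 may be sketched as follows. The sketch makes several assumptions that are not taken from the disclosure: the perception loss is stood in for by the mean squared joint displacement, `classification_loss` is a hypothetical callable standing for the loss of a recognition model, and the interference values are fixed per iteration rather than derived from gradients.

```python
import math

def to_cartesian(r, theta, phi, origin):
    """Convert (r, theta, phi), measured from `origin`, to rectangular coordinates."""
    return (origin[0] + r * math.sin(theta) * math.cos(phi),
            origin[1] + r * math.sin(theta) * math.sin(phi),
            origin[2] + r * math.cos(theta))

def rebuild(root_xyz, spherical, parent):
    """Rebuild joint positions; spherical[i] and parent[i] describe joint i + 1."""
    points = [root_xyz]  # joint 0 is the root node
    for (r, theta, phi), parent_index in zip(spherical, parent):
        points.append(to_cartesian(r, theta, phi, points[parent_index]))
    return points

def perception_loss(original, candidate):
    """Assumed stand-in: mean squared displacement of the joints."""
    return sum(math.dist(a, b) ** 2 for a, b in zip(original, candidate)) / len(original)

def generate_samples(root_xyz, spherical, parent, classification_loss,
                     step, iterations, first_threshold, second_threshold):
    original = rebuild(root_xyz, spherical, parent)
    current = list(spherical)
    samples = []
    for _ in range(iterations):
        d_theta, d_phi = step  # fixed interference values (assumption)
        current = [(r, t + d_theta, p + d_phi) for r, t, p in current]
        candidate = rebuild(root_xyz, current, parent)
        # Keep skeletons that stay close to the original yet raise the classification loss.
        if (perception_loss(original, candidate) < first_threshold
                and classification_loss(candidate) > second_threshold):
            samples.append(candidate)
    return samples
```

A caller would supply the loss of an actual skeletal behavior recognition model as `classification_loss`; the two thresholds trade off imperceptibility against attack strength.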

3. The method of claim 2, wherein the calculating, for each iteration, a perception loss value and a classification loss value corresponding to the second human skeleton based on the first rectangular coordinate representation and the spherical coordinate representation of the second human skeleton corresponding to the iteration comprises:

for each iteration,

determining a second rectangular coordinate representation of the second human body skeleton in the preset rectangular coordinate system based on the spherical coordinate representation of the second human body skeleton corresponding to the iteration; and

calculating a perception loss value and a classification loss value corresponding to the second human body skeleton based on the first rectangular coordinate representation and the second rectangular coordinate representation of the second human body skeleton.

4. The method according to claim 2 or 3, wherein in each iteration, the perception loss value of the second human skeleton corresponding to the iteration is calculated based on a perceptual loss function [formula omitted], wherein n is an integer,

5. The method of any of claims 2-4, further comprising:

in each iteration, for each of the remaining joint nodes,

calculating a first partial derivative of the classification loss function with respect to a first angle variable θ of the joint node based on the current spherical coordinate value of the joint node, and determining the first interference value based on the first partial derivative; and

calculating a second partial derivative of the classification loss function with respect to a second angle variable φ of the joint node based on the current spherical coordinate value of the joint node, and determining the second interference value based on the second partial derivative.
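As a non-limiting illustration, the determination of the interference values from the partial derivatives may be sketched as follows. The claim states only that each interference value is determined based on the corresponding partial derivative; the sign-based step and the central-difference estimate used here are assumptions (in practice, automatic differentiation through the recognition model would typically be used), and all names are hypothetical.

```python
def partial_derivative(loss, angles, index, h=1e-6):
    """Central-difference estimate of d loss / d angles[index]."""
    up = list(angles)
    down = list(angles)
    up[index] += h
    down[index] -= h
    return (loss(up) - loss(down)) / (2 * h)

def sign(value):
    """Return -1, 0 or 1 according to the sign of `value`."""
    return (value > 0) - (value < 0)

def interference_values(loss, theta, phi, step_size):
    """Step each angle in the direction that increases the classification loss."""
    g_theta = partial_derivative(loss, [theta, phi], 0)  # first partial derivative
    g_phi = partial_derivative(loss, [theta, phi], 1)    # second partial derivative
    return step_size * sign(g_theta), step_size * sign(g_phi)

# Toy classification loss over (theta, phi), for demonstration only.
toy_loss = lambda a: a[0] ** 2 - a[1] ** 2
d_theta, d_phi = interference_values(toy_loss, 0.3, 0.5, 0.01)
```

With the toy loss above, the loss increases with θ and decreases with φ at the chosen point, so the first interference value is positive and the second is negative.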

6. The method according to any one of claims 2-5, wherein, in each iteration, the adding a first interference value to a first angle value corresponding to a first angle variable θ and a second interference value to a second angle value corresponding to a second angle variable φ in the current spherical coordinate value of each of the remaining joint nodes, and the determining the spherical coordinate representation of the second human skeleton corresponding to the iteration based on the rectangular coordinate value of the root node comprise:

in each iteration, for each of the remaining joint nodes,

adding the first interference value to a first angle value in the current spherical coordinate values of the joint node to obtain an updated first angle value;

adding the second interference value to a second angle value in the current spherical coordinate values of the joint node to obtain an updated second angle value;

determining an angle constraint range based on the kinematics characteristics of the human body;

mapping the updated first angle value and the updated second angle value respectively into the angle constraint range to obtain an updated spherical coordinate value corresponding to the joint node; and

determining the spherical coordinate representation of the second human body skeleton corresponding to the iteration based on the updated spherical coordinate value corresponding to each joint node in the remaining joint nodes and the rectangular coordinate value of the root node.
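As a non-limiting illustration, the angle update with a kinematic constraint may be sketched as follows. The claim does not specify the mapping calculation; clamping to the range is assumed here, and the numeric ranges are hypothetical placeholders rather than anatomical data.

```python
def clamp(value, low, high):
    """Map `value` into the closed interval [low, high]."""
    return max(low, min(high, value))

def update_angles(theta, phi, d_theta, d_phi, theta_range, phi_range):
    """Add the interference values, then map both angles into their constraint ranges."""
    new_theta = clamp(theta + d_theta, *theta_range)
    new_phi = clamp(phi + d_phi, *phi_range)
    return new_theta, new_phi

# Hypothetical constraint ranges in radians, for demonstration only.
new_theta, new_phi = update_angles(1.0, 0.2, 0.5, -0.1, (0.0, 1.2), (-0.5, 0.5))
```

In this example the updated first angle (1.5) exceeds its upper bound and is mapped to 1.2, while the updated second angle already lies inside its range and is left unchanged.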

7. The method according to any one of claims 1-6, wherein the constructing a spherical coordinate system based on the first human skeleton comprises:

constructing, based on the remaining joint nodes in the first human body skeleton, spherical coordinate systems in one-to-one correspondence with the remaining joint nodes to form the spherical coordinate system, wherein the spherical coordinate system corresponding to each of the remaining joint nodes takes a first joint node adjacent to the joint node as its coordinate origin, and the first joint node is closer to the root node than the other joint nodes adjacent to the joint node.
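As a non-limiting illustration, the coordinate origin of each remaining joint node (the adjacent joint node closer to the root node) may be found by a breadth-first traversal of the skeleton starting from the root node. The sketch assumes the skeleton is a tree given as a list of bones; the function name and the joint names are hypothetical.

```python
from collections import deque

def parents_from_root(bones, root):
    """Return dict joint -> adjacent joint that is closer to the root.

    bones: iterable of (joint_a, joint_b) pairs, each pair being one bone.
    The returned dict is ordered from the root outward.
    """
    neighbours = {}
    for a, b in bones:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    parent = {}
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                parent[nxt] = node  # `node` is the origin of nxt's spherical system
                queue.append(nxt)
    return parent

bones = [("spine_base", "spine_mid"), ("spine_mid", "neck"), ("neck", "head"),
         ("spine_base", "hip_left"), ("hip_left", "knee_left")]
origins = parents_from_root(bones, "spine_base")
```

Each entry of `origins` names the joint that serves as the coordinate origin for the corresponding joint's spherical coordinate system; the root node itself has no entry.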

8. The method of any one of claims 1-7, wherein the root node is the joint node at the bottom of the spine in the human skeleton.

9. A countermeasure defense training method for a skeletal behavior recognition model, comprising:

acquiring, from a data set, sample data representing a human skeleton, manifold countermeasure sample data corresponding to the sample data, and non-manifold countermeasure sample data located outside a manifold corresponding to the data set, wherein the manifold countermeasure sample data is obtained based on the method according to any one of claims 1-7;

marking true classification values of the sample data, the manifold countermeasure sample data and the non-manifold countermeasure sample data, wherein the true classification values represent behavior classes corresponding to the human skeleton;

inputting the sample data into a skeleton behavior recognition model, and acquiring a first prediction classification value of the sample data;

inputting the manifold countermeasure sample data into the skeleton behavior recognition model, and acquiring a second prediction classification value of the manifold countermeasure sample data;

inputting the non-manifold countermeasure sample data into the skeleton behavior recognition model, and acquiring a third prediction classification value of the non-manifold countermeasure sample data;

calculating a loss value based on the true classification value, the first predicted classification value, the second predicted classification value, and the third predicted classification value; and

adjusting parameters of the skeletal behavior recognition model based on the loss values.

10. The method of claim 9, wherein the loss value comprises a first sub-loss value, a second sub-loss value, and a third sub-loss value, and the calculating a loss value comprises:

calculating the first sub-loss value based on the true classification value and the first predicted classification value;

calculating the second sub-loss value based on the true classification value and the second predicted classification value;

calculating the third sub-loss value based on the true classification value and the third predicted classification value; and

calculating the loss value based on the first sub-loss value and the corresponding first weight, the second sub-loss value and the corresponding second weight, and the third sub-loss value and the corresponding third weight.
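As a non-limiting illustration, the weighted combination of the three sub-loss values may be sketched as follows. The claim does not specify the form of the sub-losses or the manner of combination; cross-entropy sub-losses and a weighted sum are assumptions, and the function names and weight values are hypothetical.

```python
import math

def cross_entropy(probabilities, true_class):
    """Negative log-probability assigned to the true behavior class."""
    return -math.log(probabilities[true_class])

def total_loss(true_class, p_clean, p_manifold, p_off_manifold,
               weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the first, second and third sub-loss values."""
    sub_losses = (
        cross_entropy(p_clean, true_class),         # first sub-loss: sample data
        cross_entropy(p_manifold, true_class),      # second: manifold countermeasure samples
        cross_entropy(p_off_manifold, true_class),  # third: non-manifold countermeasure samples
    )
    return sum(w * loss for w, loss in zip(weights, sub_losses))

loss = total_loss(1, [0.5, 0.5], [0.5, 0.5], [0.5, 0.5], weights=(1.0, 0.5, 0.5))
```

The weights let the training balance accuracy on the original sample data against robustness to the two kinds of countermeasure sample data.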

11. An apparatus for generating a countermeasure sample, comprising:

an acquisition module configured to acquire a first human skeleton formed by connecting a plurality of joint nodes and skeleton data corresponding to the first human skeleton, wherein the skeleton data comprises a first rectangular coordinate representation of the first human skeleton in a preset rectangular coordinate system;

a first determination module configured to determine a rectangular coordinate value of a root node of the plurality of joint nodes based on the first rectangular coordinate representation;

a construction module configured to construct a spherical coordinate system based on the first human skeleton;

a second determining module configured to determine a spherical coordinate value corresponding to each of the rest of the plurality of joint nodes except the root node in the spherical coordinate system; and

a third determination module configured to apply an interference value to an angle of a bone in the first human skeleton, and determine a countermeasure sample based on the applied interference value, a spherical coordinate value corresponding to each of the remaining joint nodes in the spherical coordinate system, and a rectangular coordinate value of the root node.

12. The apparatus of claim 11, wherein the third determination module comprises:

an iteration unit configured to iteratively update the spherical coordinate value of each of the remaining joint nodes a preset number of times, wherein in each iteration, a first interference value is added to a first angle value corresponding to a first angle variable θ in the current spherical coordinate value of each of the remaining joint nodes, a second interference value is added to a second angle value corresponding to a second angle variable φ, and the spherical coordinate representation of a second human body skeleton corresponding to the iteration is determined based on the rectangular coordinate value of the root node;

a calculation unit configured to calculate, for each iteration, a perception loss value and a classification loss value corresponding to the second human skeleton based on the first rectangular coordinate representation and the spherical coordinate representation of the second human skeleton corresponding to the iteration; and

a determination unit configured to determine a second human skeleton whose perception loss value is less than a first threshold and whose classification loss value is greater than a second threshold as a countermeasure sample.

13. The apparatus of claim 12, wherein the calculation unit comprises:

a first determining subunit configured to determine, for each iteration, a second rectangular coordinate representation of the second human body skeleton in the preset rectangular coordinate system based on the spherical coordinate representation of the second human body skeleton corresponding to the iteration; and

a first calculating subunit configured to calculate a perception loss value and a classification loss value corresponding to the second human skeleton based on the first rectangular coordinate representation and the second rectangular coordinate representation of the second human skeleton.

14. The apparatus according to claim 12 or 13, wherein the calculation unit is further configured to, in each iteration, based on a perceptual loss function

15. The apparatus of any of claims 12-14, further comprising:

a fourth determination module configured to, in each iteration and for each of the remaining joint nodes, calculate a first partial derivative of the classification loss function with respect to a first angle variable θ of the joint node based on the current spherical coordinate value of the joint node, and determine the first interference value based on the first partial derivative; and

a fifth determination module configured to calculate a second partial derivative of the classification loss function with respect to a second angle variable φ of the joint node based on the current spherical coordinate value of the joint node, and determine the second interference value based on the second partial derivative.

16. The apparatus according to any one of claims 12-15, wherein the iteration unit comprises:

a first adding subunit configured to, in each iteration, add, for each of the remaining joint nodes, the first interference value to a first angle value in the current spherical coordinate values of that joint node to obtain an updated first angle value;

a second adding subunit configured to add the second interference value to a second angle value in the current spherical coordinate value of the joint node to obtain an updated second angle value;

a second determination subunit configured to determine an angle constraint range based on a kinematic characteristic of the human body;

a second calculation subunit configured to map the updated first angle value and the updated second angle value respectively into the angle constraint range to obtain an updated spherical coordinate value corresponding to the joint node; and

a third determining subunit configured to determine a spherical coordinate representation of the second human skeleton corresponding to the iteration based on the updated spherical coordinate value corresponding to each of the remaining joint nodes and the rectangular coordinate value of the root node.

17. The apparatus of any of claims 11-16, wherein the construction module is further configured to:

18. The apparatus according to any one of claims 11-17, wherein the root node is the joint node at the bottom of the spine in the human skeleton.

19. A countermeasure defense training apparatus for a skeletal behavior recognition model, comprising:

a first obtaining module configured to obtain, from a data set, sample data representing a human skeleton, manifold countermeasure sample data corresponding to the sample data, and non-manifold countermeasure sample data located outside a manifold corresponding to the data set, wherein the manifold countermeasure sample data is obtained based on the method according to any one of claims 1 to 7;

a marking module configured to mark true classification values of the sample data, the manifold countermeasure sample data and the non-manifold countermeasure sample data, wherein the true classification values characterize behavior categories corresponding to the human skeleton;

a second obtaining module configured to input the sample data into a skeleton behavior recognition model and obtain a first prediction classification value of the sample data;

a third obtaining module configured to input the manifold countermeasure sample data into the skeleton behavior recognition model and obtain a second prediction classification value of the manifold countermeasure sample data;

a fourth obtaining module configured to input the non-manifold countermeasure sample data into the skeleton behavior recognition model, and obtain a third predicted classification value of the non-manifold countermeasure sample data;

a calculation module configured to calculate a loss value based on the true classification value, the first predicted classification value, the second predicted classification value, and the third predicted classification value; and

an adjustment module configured to adjust parameters of the skeletal behavior recognition model based on the loss values.

20. The apparatus of claim 19, wherein the loss value comprises a first sub-loss value, a second sub-loss value, and a third sub-loss value, and the calculation module comprises:

a first calculation unit configured to calculate the first sub-loss value based on the true classification value and the first predicted classification value;

a second calculation unit configured to calculate the second sub-loss value based on the true classification value and the second predicted classification value;

a third calculation unit configured to calculate the third sub-loss value based on the true classification value and the third predicted classification value; and

a fourth calculation unit configured to calculate the loss value based on the first sub-loss value and the corresponding first weight, the second sub-loss value and the corresponding second weight, and the third sub-loss value and the corresponding third weight.

21. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-10.

22. A non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-10.

23. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the method of any one of claims 1-10.