CN111161789B - Analysis method and device for key areas of model prediction - Google Patents

Info

Publication number
CN111161789B
CN111161789B CN201911268037.1A CN201911268037A CN111161789B CN 111161789 B CN111161789 B CN 111161789B CN 201911268037 A CN201911268037 A CN 201911268037A CN 111161789 B CN111161789 B CN 111161789B
Authority
CN
China
Prior art keywords
sample data
model
covering
prediction model
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911268037.1A
Other languages
Chinese (zh)
Other versions
CN111161789A (en
Inventor
蒋佳新
胡帆
殷鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN201911268037.1A priority Critical patent/CN111161789B/en
Publication of CN111161789A publication Critical patent/CN111161789A/en
Application granted granted Critical
Publication of CN111161789B publication Critical patent/CN111161789B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B5/00 ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The application belongs to the technical field of data processing and provides a method for analyzing a key region of model prediction, comprising: obtaining a prediction model; acquiring sample data and carrying out a covering experiment on the sample data based on the prediction model, to obtain output data of the prediction model before and after the covering experiment; calculating an importance index value of the covered region from the output data before and after the covering experiment; and drawing a graph according to the importance index value to obtain a key region of model prediction. Because the importance index of each region is calculated from the outputs of the prediction model and drawn as a graph, the key regions of model prediction are represented visually, showing which regions the prediction depends on and revealing how the prediction model works. This improves the interpretability of deep learning models and effectively expands the application range of deep learning technology.

Description

Analysis method and device for key areas of model prediction
Technical Field
The application belongs to the technical field of data processing, and particularly relates to a method and a device for analyzing a key region of model prediction.
Background
Deep learning is a representation learning method: it automatically learns feature representations from training data in a data-driven way, and thereby achieves results clearly superior to those of traditional learning methods in fields such as computer vision and natural language processing.
By its nature, however, a deep learning model is difficult to explain in human-understandable terms. It is hard for a user to determine how the model works, what features it has learned, and whether a seemingly "good" performance is merely the result of problems such as data leakage. As a result, the use of deep learning is limited in many fields, such as medicine, transportation, and finance.
Disclosure of Invention
The embodiment of the application provides a method and a device for analyzing a key region of model prediction, which can solve the problem that the application field of a deep learning model is limited because the deep learning model cannot be interpreted in the prior art.
In a first aspect, an embodiment of the present application provides a method for analyzing a key area of model prediction, including:
obtaining a prediction model;
acquiring sample data, carrying out a covering experiment on the sample data based on the prediction model, and respectively acquiring output data of the prediction model before the covering experiment is carried out on the sample data and output data of the prediction model after the covering experiment is carried out on the sample data;
calculating an importance index value of a covering region according to output data of a prediction model before the covering experiment on the sample data and output data of a prediction model after the covering experiment on the sample data;
and drawing a graph according to the importance index value to obtain a key region of model prediction.
In a second aspect, an embodiment of the present application provides an apparatus for analyzing a key area of model prediction, including:
the acquisition module is used for acquiring the prediction model;
the experiment module is used for randomly sampling, obtaining sample data, carrying out a covering experiment on the sample data based on the prediction model, and respectively obtaining output data of the prediction model before the covering experiment on the sample data and output data of the prediction model after the covering experiment on the sample data;
the computing module is used for computing importance index values of the covering areas according to the output data of the prediction model before the covering experiment on the sample data and the output data of the prediction model after the covering experiment on the sample data;
and the output module is used for drawing a graph according to the importance index value to obtain a key area of model prediction.
In a third aspect, an embodiment of the present application provides a server, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the method for analyzing key areas of model prediction according to any one of the first aspect when the computer program is executed.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium storing a computer program which, when executed by a processor, implements a method of analyzing key regions of model predictions as set forth in any one of the first aspects above.
In a fifth aspect, an embodiment of the present application provides a computer program product, which when run on a terminal device, causes the terminal device to perform the method for analyzing a key region of a model prediction according to any one of the first aspects above.
It will be appreciated that the advantages of the second to fifth aspects may be found in the relevant description of the first aspect, and are not described here again.
According to the embodiment of the application, a covering experiment is carried out on the sample based on the prediction model, the importance of the different regions of the sample is obtained, and the result is displayed as a heat map. The method intuitively shows which regions are the key regions for model prediction, reveals how the prediction model works, improves the interpretability of the deep learning model, and effectively expands the application range of deep learning technology.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for analyzing a key region of model prediction according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a one-dimensional coordinate system corresponding to input information in one embodiment of the present application;
FIG. 3 is a schematic diagram of a two-dimensional coordinate system corresponding to input information according to another embodiment of the present application;
FIG. 4 is a schematic diagram of a three-dimensional coordinate system corresponding to input information according to another embodiment of the present application;
FIG. 5 is a schematic diagram of a three-dimensional coordinate system corresponding to input information according to another embodiment of the present application;
FIG. 6 is a visual representation of importance index values in one embodiment of the present application;
FIG. 7 is a visual representation of importance index values according to another embodiment of the present application;
FIG. 8 is a schematic structural diagram of an analysis device for a key region of model prediction according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in the present description and the appended claims, the term "if" may be interpreted as "when", "once", "in response to a determination", or "in response to detection", depending on the context. Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted, depending on the context, as "upon determination", "in response to determination", "upon detection of the [described condition or event]", or "in response to detection of the [described condition or event]".
Furthermore, the terms "first," "second," "third," and the like in the description of the present specification and in the appended claims, are used for distinguishing between descriptions and not necessarily for indicating or implying a relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
The analysis method of the key area of the model prediction provided by the embodiment of the application can be applied to terminal equipment such as mobile phones, tablet computers, vehicle-mounted equipment, notebook computers, ultra-mobile Personal Computer (UMPC), netbooks, personal digital assistants (Personal Digital Assistant, PDA) and the like, and the embodiment of the application does not limit the specific types of the terminal equipment.
Fig. 1 shows a schematic flow chart of a method for analyzing a key area of model prediction provided by the present application, which can be applied to any of the above terminal devices by way of example and not limitation.
S101, acquiring a prediction model.
In a specific application, a deep learning model is built; the parameters of the deep learning model are trained on the training set data; and the deep learning model with the better output on the validation set data, together with its corresponding hyperparameters, is selected. The deep learning model includes, but is not limited to, at least one of a convolutional neural network and a recurrent neural network. For example, a convolutional neural network model is selected as the prediction model, with Adam as the training optimizer and the corresponding learning rate as a hyperparameter.
S102, acquiring sample data, and carrying out a covering experiment on the sample data based on the prediction model to respectively acquire output data of the prediction model before the covering experiment on the sample data and output data of the prediction model after the covering experiment on the sample data;
in a specific application, first, an importance index value K is defined to measure the influence of the region where the masking experiment is performed on the model prediction result.
Then, sample data satisfying a preset condition is acquired from the test set, and a coordinate system is defined on the sample data. A covering window corresponding to the type of the coordinate system is established and slid over the sample data to perform the covering experiment, and the output data of the prediction model before and after the covering experiment are obtained so that the importance index value of the covered part can be calculated. The sample data is the input information of the prediction model, and its dimension and type can be set according to the actual situation. For example, in an experiment predicting whether a protein and a small molecule have binding activity, the one-dimensional sequence of the protein and the one-dimensional sequence of the small-molecule compound may be input into the prediction model as sample data, and masking experiments are performed on the protein sequence information to obtain the key region of the protein sequence with respect to model prediction; this region may represent the part of the protein that plays a key role in its binding to the small-molecule compound. The preset condition can be set according to the actual situation; for example, when a face recognition algorithm is analyzed, the preset condition may be that the image data contains a face.
For example, in an experiment to determine which region of a face has an important influence on recognition by a face recognition algorithm, an image containing a face may be input into the prediction model as sample data, and a masking experiment is performed to obtain the key region that the prediction model attends to; the position of the key region may indicate the part of the face on which the face recognition algorithm decisively bases its judgment. For example, if the key region corresponds to the eyes of the face, the eyes are the decisive basis for the judgment of the face recognition algorithm.
By way of example and not limitation, the importance index value of a region of sample data may be evaluated by calculating the difference between the probability values of the prediction model for the correct prediction category before and after the region is subjected to the masking experiment to represent the influence and importance of the masking portion on the correctness of the prediction result of the prediction model.
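The classification-style index described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `class_probability_drop` and `toy_predict_proba` are hypothetical names, and the toy softmax model stands in for a trained prediction model.

```python
import numpy as np

def class_probability_drop(predict_proba, x, x_masked, true_class):
    """Probability of the correct class before masking minus after masking."""
    p_before = predict_proba(x)[true_class]
    p_after = predict_proba(x_masked)[true_class]
    return p_before - p_after

def toy_predict_proba(x):
    # Hypothetical stand-in model: class 0 looks only at element 0 of x.
    weights = np.array([[1.0, 0.0, 0.0, 0.0],
                        [0.0, 0.0, 0.0, 0.0]])
    logits = weights @ x
    e = np.exp(logits - logits.max())
    return e / e.sum()

x = np.array([2.0, 1.0, 1.0, 1.0])

x_mask_first = x.copy()
x_mask_first[0] = 0.0          # mask the element the toy model relies on
x_mask_last = x.copy()
x_mask_last[3] = 0.0           # mask an element the toy model ignores

drop_first = class_probability_drop(toy_predict_proba, x, x_mask_first, 0)
drop_last = class_probability_drop(toy_predict_proba, x, x_mask_last, 0)
```

In this toy setting, masking the element the model depends on lowers the correct-class probability, while masking an ignored element leaves it unchanged, which is the behavior the index is meant to expose.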
S103, calculating an importance index value of a covering region according to output data of a prediction model before the covering experiment on the sample data and output data of a prediction model after the covering experiment on the sample data;
In a specific application, a coordinate system is established according to the dimension of the sample data. Each coordinate position of the coordinate system is taken in turn as the center of a masking window, and the elements within the window region are masked. For each coordinate position, the difference between the output data of the prediction model before the masking experiment and the output data (i.e., the probability value) of the prediction model after the masking experiment is calculated; this gives the importance index value K of the masked region, which is taken as the importance index value of that coordinate position.
And S104, drawing a graph according to the importance index value to obtain a key region of model prediction.
In a specific application, importance tuples ((0, K_i0), (1, K_i1), ..., (t, K_it)), each consisting of a coordinate position and the importance index value K corresponding to that position, are obtained through multiple sliding-window masking experiments. A graph is then drawn from these tuples so that the importance index values are displayed visually and the region playing a key role for the prediction model can be seen intuitively.
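The sliding-window procedure that yields these (coordinate, K) tuples might look as follows for a one-dimensional input. The function name `sliding_mask_1d`, the zero fill, the clipping of the window at the sequence ends, and the toy `predict` and `importance` callables are all assumptions made for illustration.

```python
import numpy as np

def sliding_mask_1d(x, predict, importance, size=3):
    """Return [(j, K_j)] for every coordinate j of a 1-D input.

    A window of `size` elements centred on j is set to 0 (one possible
    masking scheme); the window is clipped at the sequence ends, which is
    an assumption of this sketch. `size` is assumed to be odd.
    """
    half = size // 2
    p = predict(x)                       # output before masking
    tuples = []
    for j in range(len(x)):              # stride 1: one value per position
        lo, hi = max(0, j - half), min(len(x), j + half + 1)
        x_masked = x.copy()
        x_masked[lo:hi] = 0.0
        tuples.append((j, importance(p, predict(x_masked))))
    return tuples

# Hypothetical toy model whose output depends only on position 2.
toy_predict = lambda x: float(x[2])
abs_change = lambda p, p_masked: abs(p - p_masked)

x = np.array([1.0, 1.0, 5.0, 1.0, 1.0])
tuples = sliding_mask_1d(x, toy_predict, abs_change, size=3)
```

Because the stride is 1, the result has exactly one entry per input position, so a plot of the K values lines up with the original input.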
In one embodiment, the method for analyzing the key region of model prediction is used to analyze the key region behind the prediction result of the prediction model and to determine how the prediction model works.
In one embodiment, the predictive model is a deep learning model;
step S101, including:
building a preset deep learning model;
training parameters of the deep learning model according to the data of the training set;
verifying and selecting the hyperparameters of the trained deep learning model according to the validation set to obtain a target deep learning model as the prediction model; wherein the deep learning model includes at least one of a convolutional neural network and a recurrent neural network.
In a specific application, the parameters of the deep learning model are trained on a large amount of training data from the training set, so that the trained deep learning model can analyze and identify the corresponding target data; the hyperparameters of the trained deep learning model are then verified and selected using the validation set data, and the deep learning model with the optimal hyperparameters is taken as the target deep learning model, i.e., the prediction model.
For example, when the task is face recognition, an image containing a face is the target data, and the parameters of the deep learning model are trained on a large amount of training data from the training set, so that the trained deep learning model can analyze and recognize the face in the image.
The deep learning model with the optimal hyperparameters may be the deep learning model with the highest recognition efficiency and accuracy.
Wherein the deep learning model includes, but is not limited to, at least one of a convolutional neural network and a recurrent neural network.
The prediction model may be oriented to a classification task or to a regression task: the method is developed on the basis of classification problems, can also be applied to regression prediction problems, and is applicable to both convolutional neural networks and recurrent neural networks.
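The train-then-select workflow described for step S101 can be illustrated with a deliberately tiny stand-in. The one-parameter linear model, the plain gradient descent loop, and the candidate learning rates below are assumptions chosen to keep the sketch self-contained; they are not the convolutional or recurrent networks of the embodiment.

```python
import numpy as np

def train_linear(x, y, lr, steps=200):
    """Fit y ~ w * x by gradient descent on the mean squared error."""
    w = 0.0
    for _ in range(steps):
        grad = 2.0 * np.mean((w * x - y) * x)
        w -= lr * grad
    return w

def select_by_validation(train, val, learning_rates):
    """Train once per candidate hyperparameter and keep the model with the
    lowest loss on the validation set."""
    best = None
    for lr in learning_rates:
        w = train_linear(train[0], train[1], lr)
        val_loss = float(np.mean((w * val[0] - val[1]) ** 2))
        if best is None or val_loss < best[0]:
            best = (val_loss, lr, w)
    return best

x_train = np.array([1.0, 2.0, 3.0, 4.0])
x_val = np.array([5.0, 6.0])
train_set = (x_train, 2.0 * x_train)     # toy data with true slope 2
val_set = (x_val, 2.0 * x_val)

best_loss, best_lr, best_w = select_by_validation(
    train_set, val_set, learning_rates=[1e-4, 1e-2, 5e-2])
```

The model returned by `select_by_validation` plays the role of the target deep learning model, i.e. the prediction model used in the later masking steps.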
In one embodiment, step S102 includes:
S1021, acquiring sample data meeting preset conditions from the test set;
S1022, establishing a corresponding coordinate system according to the dimension of the sample data;
S1023, establishing a covering window of a corresponding size according to the type of the coordinate system, sliding the covering window to cover the sample data, and respectively obtaining output data of the prediction model before the covering experiment on the sample data and output data of the prediction model after the covering experiment on the sample data.
In a specific application, sample data s_i (i = 0, 1, 2, ..., n-1) meeting the preset condition is obtained from the test data set, where n is the number of samples in the test set. Each sample s_i can be formalized as a two-tuple (input_i, v_i), where input_i denotes the input information of the prediction model and v_i is the supervision value of the prediction model. The preset condition may be set according to the actual situation; for example, sample data with a preset label is obtained, or sample data is obtained at random.
A coordinate system of the corresponding type is established according to the dimension of the sample data, and a covering window of the corresponding size is established according to the type of the coordinate system. The covering window is slid while all other information is kept unchanged, the difference between the output values of the prediction model before and after covering is tracked, and the importance index K_ij is calculated, where the subscript j designates the region of the input information input_i that is covered.
The specific process of defining the coordinate system is as follows:
If the input information input_i is one-dimensional sequence information, for example one-dimensional sequences of proteins and small molecules, a one-dimensional coordinate system is established along the sequence direction, with the starting point set at the start of the sequence, as shown in fig. 2.
If the input information input_i is a two-dimensional matrix, two coordinate axes x and y are established along the two dimensions of the matrix, with the starting point fixed at the intersection of the two coordinate axes, as shown in fig. 3.
If the input information input_i is a three-dimensional tensor, different coordinate systems are established for different application scenarios.
In an application scenario where each position element in the three-dimensional tensor represents an independent meaning, a coordinate system as shown in fig. 4 is established. For example, when the input information input_i is a three-dimensional tensor of a compound, each coordinate point represents information about the compound in three-dimensional space and expresses an independent meaning.
If the elements of the three-dimensional input information input_i do not each carry an independent meaning, but several elements together form one independent meaning, a coordinate system as shown in fig. 5 is established. For example, when the input information input_i is an RGB color image, the three channel values at each position together form one pixel that is meaningful to the naked eye. In essence, three-dimensional input information of this type is still two-dimensional input information.
In one embodiment, step S1022 includes:
if the sample data is one-dimensional sequence information, a one-dimensional coordinate system is established along the sequence direction, and a starting point in the one-dimensional coordinate system is a sequence starting point position;
if the sample data is a two-dimensional matrix, respectively establishing a two-dimensional coordinate system along two dimensions of the two-dimensional matrix, wherein a starting point in the two-dimensional coordinate system is an intersection point of two coordinate axes;
if the sample data is a three-dimensional tensor, a three-dimensional coordinate system is established on a three-dimensional space, and the starting point in the three-dimensional coordinate system is the intersection point of three coordinate axes.
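As a rough illustration of how the coordinate positions follow from the dimension of the sample data, the sketch below enumerates them with NumPy. The function name `coordinate_positions` and the `joint_last_axis` flag (used for inputs such as RGB images, whose channel values jointly form one pixel) are assumptions of this example, as is the convention that the origin is index 0 on every axis.

```python
import numpy as np

def coordinate_positions(sample, joint_last_axis=False):
    """List every coordinate position at which a covering window can be centred.

    For 1-D, 2-D and 3-D inputs each element is its own position. When
    `joint_last_axis` is True the last axis (e.g. colour channels) is not
    enumerated, so a 3-D image tensor is treated as two-dimensional.
    """
    shape = sample.shape[:-1] if joint_last_axis else sample.shape
    return list(np.ndindex(*shape))

seq_positions = coordinate_positions(np.zeros(5))            # 1-D sequence
matrix_positions = coordinate_positions(np.zeros((2, 3)))    # 2-D matrix
tensor_positions = coordinate_positions(np.zeros((2, 3, 4))) # 3-D tensor
rgb_positions = coordinate_positions(np.zeros((2, 3, 3)), joint_last_axis=True)
```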
In one embodiment, step S103 includes:
obtaining a true regression value corresponding to the sample data;
taking output data of a prediction model before a covering experiment on sample data as first output data;
taking output data of the prediction model after the covering experiment is carried out on the sample data as second output data;
calculating an absolute value of a difference between the regression value and the first output data as a first absolute value;
calculating an absolute value of a difference between the regression value and the second output data as a second absolute value;
calculating the sum of the second absolute value and the disturbance parameter, and calculating the quotient of the first absolute value and the sum as an importance index value of the covering area; wherein the perturbation parameter is greater than 0.
In a specific application, the importance index value K can be calculated through a true regression value of sample data, output data of a prediction model before a covering experiment is carried out on the sample data, and output data of the prediction model after the covering experiment is carried out on the sample data;
Specifically, the importance index value K is obtained as follows: the absolute value of the difference between the true regression value of the sample data and the output data of the prediction model before the covering experiment (i.e., when the sample data is not covered) is calculated as the first absolute value; the absolute value of the difference between the true regression value of the sample data and the output data of the prediction model after the covering experiment is calculated as the second absolute value; the sum of the second absolute value and the disturbance parameter is calculated; and the quotient of the first absolute value and this sum is calculated. This can be represented by the following formula:
$$K = \frac{|v - p|}{|v - p'| + \varepsilon}$$
where v represents the true regression value corresponding to the sample data, p represents the output data before the covering experiment is performed on the sample data, p' represents the output data after the covering experiment is performed on the sample data, and ε is the disturbance parameter. The disturbance parameter is greater than 0 and can be set to a very small positive number, so that a zero denominator never prevents the importance index value K from being obtained. The disturbance parameter may be set according to the actual situation, for example to 0.0000001.
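The quotient described above, the first absolute value divided by the sum of the second absolute value and the disturbance parameter, translates directly into code. The function name `regression_importance` and the sample numbers are assumptions for illustration; the default `eps` matches the example value 0.0000001 given in the text.

```python
def regression_importance(v, p, p_masked, eps=1e-7):
    """K = |v - p| / (|v - p'| + eps) for one covered region.

    v        : true regression value of the sample
    p        : model output before the covering experiment
    p_masked : model output after the covering experiment (p')
    eps      : disturbance parameter, must be greater than 0
    """
    if eps <= 0:
        raise ValueError("the disturbance parameter must be greater than 0")
    return abs(v - p) / (abs(v - p_masked) + eps)

k_example = regression_importance(v=1.0, p=0.8, p_masked=0.5)
# The disturbance parameter keeps K finite even when p' equals v exactly.
k_zero_denominator = regression_importance(v=1.0, p=0.8, p_masked=1.0)
```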
The size of the covering window is essentially a hyperparameter, which can be adjusted according to the actual situation.
In particular, masking the elements in the covering window includes, but is not limited to, the following two schemes:
1. setting the element value of the covering area to 0;
2. the average value of the element values excluding the elements under the mask window is calculated and set as the element value in the mask window.
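The two schemes listed above can be sketched as follows. The function names and the use of a slice (or tuple of slices) to describe the window are assumptions of this example.

```python
import numpy as np

def mask_with_zero(x, window):
    """Scheme 1: set every element inside the window to 0."""
    out = np.array(x, dtype=float, copy=True)
    out[window] = 0.0
    return out

def mask_with_outside_mean(x, window):
    """Scheme 2: set every element inside the window to the mean of the
    elements outside the window."""
    out = np.array(x, dtype=float, copy=True)
    inside = np.zeros(out.shape, dtype=bool)
    inside[window] = True
    out[inside] = out[~inside].mean()
    return out

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
window = slice(1, 3)                      # covers positions 1 and 2

zeroed = mask_with_zero(x, window)
mean_filled = mask_with_outside_mean(x, window)
```

Both helpers copy the input, so the unmasked sample stays available for the "before" prediction. The mean-fill variant assumes at least one element lies outside the window.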
In this embodiment, the sliding step length of the covering window may be set to 1, so as to ensure that the image drawn according to the importance index value is the same as the original input image in size, so that visual display is facilitated, and the importance index value is more intuitive and accurate.
Specifically, in the case where the sample data is one-dimensional sequence information, as shown in fig. 2, the masking window size is set to size. Then, in the established one-dimensional coordinate system, the elements whose coordinates fall within the window of length size around the current coordinate position are masked (n represents the length of the input one-dimensional sequence), and the input after the masking experiment is denoted as input_ij'.
In the case where the sample data is two-dimensional matrix information, as shown in fig. 3, the masking window size is set to size_x, size_y. Then, in the two-dimensional coordinate system, the elements in the region from coordinate point A to coordinate point B are masked (n represents the length of the input information in the x coordinate axis direction, and m represents the length of the input information in the y coordinate axis direction), and the input after the masking experiment is denoted as input_ij'.
In the application scenario where the sample data is three-dimensional tensor information and each position element in the three-dimensional tensor represents an independent meaning, as shown in fig. 4, the sizes of the covering window are set to size_x, size_y and size_z respectively. In the three-dimensional coordinate system, a sliding masking experiment is performed on every element of the rectangular parallelepiped region from coordinate point C to coordinate point D (d represents the length, or depth, of the input information in the z-axis direction), and the input after the masking experiment is denoted as input_ij'.
In the application scenario where the sample data is three-dimensional tensor information whose elements do not each carry an independent meaning, but several elements together form one independent meaning, as shown in fig. 5, the sizes of the covering window are set to size_x and size_y respectively. The covered area extends from line segment EF (spanning 0 to d-1) to line segment GH (spanning 0 to d-1), and the input after the masking experiment is denoted as input_ij'.
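The window placement for the four cases can be illustrated with one helper that works for any number of dimensions. This is a sketch under assumptions: the names `window_slices` and `mask_region` are hypothetical, the window is centred on the current position and clipped at the borders, the window sizes are odd, and zero fill is used as the masking scheme.

```python
import numpy as np

def window_slices(center, sizes, shape):
    """Slices of a window centred on `center`, clipped to the array bounds."""
    return tuple(
        slice(max(0, c - s // 2), min(n, c + s // 2 + 1))
        for c, s, n in zip(center, sizes, shape)
    )

def mask_region(x, center, sizes, joint_last_axis=False):
    """Zero the window around `center`.

    With `joint_last_axis=True` (e.g. an RGB image) the window is defined on
    the first axes only and always spans the full depth 0..d-1 of the last
    axis, as in the fig. 5 scenario.
    """
    spatial_shape = x.shape[:-1] if joint_last_axis else x.shape
    region = window_slices(center, sizes, spatial_shape)
    if joint_last_axis:
        region = region + (slice(None),)
    out = x.copy()
    out[region] = 0
    return out

masked_2d = mask_region(np.ones((5, 5)), center=(2, 2), sizes=(3, 3))
masked_corner = mask_region(np.ones((5, 5)), center=(0, 0), sizes=(3, 3))
masked_3d = mask_region(np.ones((4, 4, 4)), center=(1, 1, 1), sizes=(3, 3, 3))
masked_rgb = mask_region(np.ones((5, 5, 3)), center=(2, 2), sizes=(3, 3),
                         joint_last_axis=True)
```

Sliding the window then amounts to calling `mask_region` once per coordinate position and feeding each masked copy to the prediction model.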
In one embodiment, step S104 includes:
S1041, calculating the importance index value of the region corresponding to any coordinate in the coordinate system;
S1042, drawing a heat map according to each coordinate and the importance index value of the region corresponding to that coordinate;
S1043, acquiring a pixel region whose brightness value is higher than a preset brightness threshold in the heat map as a key region of model prediction.
In a specific application, the change in the predicted value output by the prediction model before and after the masking experiment on the region corresponding to any coordinate is calculated to obtain the importance index value of that region. A heat map is then drawn from the coordinates and the importance index values of their corresponding regions, and the pixel region whose brightness value is higher than the preset brightness threshold in the heat map is taken as the key region of model prediction.
In combination with the above, the calculation of the importance index value K of the region corresponding to any coordinate may be further expressed as:
where p_ij is the predicted value output by the prediction model for the masked input input_ij', and K_ij is the importance index value of the prediction model's input information input_i at the coordinate position corresponding to j.
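As one concrete way to compute an importance index value, the sketch below follows the wording of claim 1: the absolute error of the prediction before masking, divided by the absolute error after masking plus a perturbation parameter greater than 0. Whether this is identical to the expression intended at this point of the description is an assumption; the names `importance_index`, `y_true`, `p_before`, `p_after` and `eps` are hypothetical.

```python
def importance_index(y_true: float, p_before: float, p_after: float,
                     eps: float = 1e-6) -> float:
    """Importance index of one masked region, as recited in claim 1.

    first absolute value  = |y_true - p_before|  (prediction before masking)
    second absolute value = |y_true - p_after|   (prediction after masking)
    result = first / (second + eps), with eps > 0 to avoid division by zero.
    """
    if eps <= 0:
        raise ValueError("the perturbation parameter must be greater than 0")
    first = abs(y_true - p_before)
    second = abs(y_true - p_after)
    return first / (second + eps)


k = importance_index(y_true=2.0, p_before=1.0, p_after=1.0, eps=1.0)
```

In this toy call both absolute errors equal 1 and the perturbation parameter is 1, so the index is 1 / (1 + 1).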
According to the above, the importance index tuple pairing each coordinate position with its importance index value can be obtained: ((0, K_i0), (1, K_i1), ..., (t, K_it)).
A heat map is drawn from the importance index tuple ((0, K_i0), (1, K_i1), ..., (t, K_it)), and the pixel region whose brightness value is higher than a preset brightness threshold in the resulting map is taken as the key region of model prediction. The preset brightness threshold may be set to 1 according to the definition of the K value; it may also be set according to the actual situation, for example to (max+1)/2, where max is the maximum of all masking results K over all samples s_i.
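The thresholding step can be illustrated on the tuple directly. The sketch below treats the importance index value itself as the brightness and, for brevity, takes max over a single sample's tuple rather than over all samples; both simplifications are assumptions, and `key_positions` is a hypothetical helper name.

```python
from typing import List, Optional, Sequence, Tuple


def key_positions(index_tuples: Sequence[Tuple[int, float]],
                  threshold: Optional[float] = None) -> List[int]:
    """Return the positions whose importance index exceeds the threshold.

    If no threshold is given, (max + 1) / 2 is used, where max is the
    largest importance index value in `index_tuples`.
    """
    values = [k for _, k in index_tuples]
    if threshold is None:
        threshold = (max(values) + 1) / 2
    return [j for j, k in index_tuples if k > threshold]


tuples = ((0, 0.8), (1, 1.2), (2, 3.0), (3, 1.0))
above_one = key_positions(tuples, threshold=1.0)
above_mid = key_positions(tuples)  # threshold = (3.0 + 1) / 2 = 2.0
```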
In the present embodiment, the heat map is plotted from the importance index tuple ((0, K_i0), (1, K_i1), ..., (t, K_it)).
The resulting importance heat map is smaller than the original input data, because no importance values exist for the boundary portion. Taking one-dimensional input as an example, the importance index value list contains no entry (j, K_ij) for the boundary positions j. However, when the importance index values are visually displayed and analyzed, those positions may be filled (for example, with 1 or 0), thereby improving the visual display of the importance index values.
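Filling the missing boundary positions so that the importance list matches the input length could be done as in the sketch below. The helper name `pad_importance` and the choice to fill every position that has no entry are assumptions; the fill value of 1 or 0 comes from the text.

```python
from typing import List, Sequence, Tuple


def pad_importance(index_tuples: Sequence[Tuple[int, float]], n: int,
                   fill: float = 1.0) -> List[float]:
    """Expand (j, K_ij) pairs to a list of length n, filling the gaps."""
    padded = [fill] * n
    for j, k in index_tuples:
        padded[j] = k  # positions without an entry keep the fill value
    return padded


padded = pad_importance(((0, 0.5), (1, 2.0)), n=4, fill=1.0)
```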
Fig. 6 is a visual representation of importance index values obtained by a sliding cover experiment in a prediction experiment of protein-small molecule binding activity.
For example, a prediction model built from a convolutional neural network is obtained, and the one-dimensional sequence of a protein and the one-dimensional sequence of a small molecule are used as the input information (that is, the sample data) of the prediction model. A masking experiment is performed on both sequences through the prediction model, and the output data of the prediction model after the masking experiment is obtained. The importance index value of each masked region is calculated from the output data before and after the masking experiment, the importance tuple corresponding to the importance index value K is obtained, and a heat map is drawn. Fig. 6 shows the regions that play a key role in the model's prediction of protein-small molecule binding activity (that is, the regions where the importance index values are located, namely the key regions of model prediction).
In one embodiment, before step S1043, the method further includes:
superimposing the heat map on the original data to obtain a visual display diagram of the key region.
In a specific application, the visual display diagram of the key region can be obtained by superimposing the drawn heat map on the original image information.
Fig. 7 shows the visual representation of the importance index values obtained by superimposing the plotted heat map on the original image information. The highlighted region corresponds to the region that, in practice (as determined by biological experiments), plays a key role in the binding between the protein and the small molecule.
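One simple way to superimpose a heat map on the original data is an alpha blend, sketched below. The text does not say how the superposition is performed, so the blending rule, the min-max normalisation of the heat values, the assumption that the original intensities lie in [0, 1], and the name `overlay` are all illustrative choices.

```python
from typing import List

Grid = List[List[float]]


def overlay(original: Grid, heat: Grid, alpha: float = 0.5) -> Grid:
    """Blend a min-max normalised heat map onto the original intensities."""
    lo = min(min(row) for row in heat)
    hi = max(max(row) for row in heat)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat heat map
    return [
        [(1 - alpha) * o + alpha * ((h - lo) / span)
         for o, h in zip(o_row, h_row)]
        for o_row, h_row in zip(original, heat)
    ]


blended = overlay([[0.0, 1.0]], [[1.0, 3.0]], alpha=0.5)
```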
According to this embodiment, a masking experiment is performed on the sample based on the prediction model, the importance of different regions of the sample is obtained, and the result is displayed as a heat map. This shows intuitively which regions are key to the model's prediction, reveals the mechanism by which the prediction model works, improves the interpretability of the deep learning model, and effectively broadens the application range of deep learning technology.
Fig. 8 shows a block diagram of an analysis apparatus for a key region of model prediction according to an embodiment of the present application, and for convenience of explanation, only a portion related to the embodiment of the present application is shown.
Referring to fig. 8, the analysis apparatus 100 of the key region predicted by the model includes:
an acquisition module 101, configured to acquire a prediction model;
an experiment module 102, configured to acquire sample data, perform a masking experiment on the sample data based on the prediction model, and obtain the output data of the prediction model before and after the masking experiment on the sample data, respectively;
a calculation module 103, configured to calculate the importance index value of the masked region from the output data of the prediction model before and after the masking experiment on the sample data;
an output module 104, configured to draw a graph from the importance index values to obtain the key region of model prediction.
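The division into four modules can be mirrored in software roughly as follows. This is a sketch under stated assumptions, not the apparatus itself: the class `KeyRegionAnalyzer`, its method names, the one-dimensional input, the masking fill value and the use of the claim 1 quotient for the importance index are hypothetical choices, and the toy model simply sums its input.

```python
from typing import Callable, List, Sequence, Tuple

Model = Callable[[Sequence[float]], float]


class KeyRegionAnalyzer:
    """Hypothetical stand-in for apparatus 100, one method per module."""

    def __init__(self, model: Model, size: int, eps: float = 1e-6,
                 fill: float = 0.0) -> None:
        # Acquisition module 101: hold an already trained prediction model.
        self.model = model
        self.size = size
        self.eps = eps
        self.fill = fill

    def run_experiment(self, sample: Sequence[float]
                       ) -> Tuple[float, List[Tuple[int, float]]]:
        # Experiment module 102: outputs before and after sliding masking.
        before = self.model(sample)
        after = []
        for j in range(len(sample) - self.size + 1):
            masked = list(sample)
            masked[j:j + self.size] = [self.fill] * self.size
            after.append((j, self.model(masked)))
        return before, after

    def compute_indices(self, y_true: float, before: float,
                        after: List[Tuple[int, float]]
                        ) -> List[Tuple[int, float]]:
        # Calculation module 103: quotient recited in claim 1.
        first = abs(y_true - before)
        return [(j, first / (abs(y_true - p) + self.eps)) for j, p in after]

    def key_region(self, indices: List[Tuple[int, float]],
                   threshold: float = 1.0) -> List[int]:
        # Output module 104: positions whose index exceeds the threshold.
        return [j for j, k in indices if k > threshold]


analyzer = KeyRegionAnalyzer(model=sum, size=1, eps=1.0)
before, after = analyzer.run_experiment([1.0, 2.0, 3.0])
indices = analyzer.compute_indices(y_true=7.0, before=before, after=after)
region = analyzer.key_region(indices, threshold=0.22)
```

Keeping the model as a plain callable means the same sketch works with any trained predictor that maps an input sequence to a single value.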
According to this embodiment, a masking experiment is performed on the sample based on the prediction model, the importance of different regions of the sample is obtained, and the result is displayed as a heat map. This shows intuitively which regions are key to the model's prediction, reveals the mechanism by which the prediction model works, improves the interpretability of the deep learning model, and effectively broadens the application range of deep learning technology.
It should be noted that, because the content of information interaction and execution process between the above devices/units is based on the same concept as the method embodiment of the present application, specific functions and technical effects thereof may be referred to in the method embodiment section, and will not be described herein.
Fig. 9 is a schematic structural diagram of a server according to an embodiment of the present application. As shown in Fig. 9, the server 9 of this embodiment includes: at least one processor 90 (only one is shown in Fig. 9), a memory 91, and a computer program 92 stored in the memory 91 and executable on the at least one processor 90. When executing the computer program 92, the processor 90 implements the steps of any of the above embodiments of the method for analyzing key regions of model prediction.
The server 9 may be a computing device such as a desktop computer, a notebook computer, a palmtop computer or a cloud server. The server may include, but is not limited to, a processor 90 and a memory 91. It will be appreciated by those skilled in the art that Fig. 9 is merely an example of the server 9 and does not limit the server 9, which may include more or fewer components than shown, combine certain components, or use different components; for example, it may also include input/output devices, network access devices, and the like.
The processor 90 may be a central processing unit (CPU); it may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory 91 may in some embodiments be an internal storage unit of the server 9, such as a hard disk or a memory of the server 9. The memory 91 may in other embodiments also be an external storage device of the server 9, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the server 9. Further, the memory 91 may also include both an internal storage unit and an external storage device of the server 9. The memory 91 is used for storing an operating system, application programs, boot loader (BootLoader), data, other programs, etc., such as program codes of the computer program. The memory 91 may also be used for temporarily storing data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
The embodiment of the application also provides a server, which comprises: at least one processor, a memory, and a computer program stored in the memory and executable on the at least one processor, which when executed by the processor performs the steps of any of the various method embodiments described above.
Embodiments of the present application also provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
Embodiments of the present application provide a computer program product which, when run on a mobile terminal, causes the mobile terminal to perform the steps of the method embodiments described above.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware; the computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include at least: any entity or device capable of carrying the computer program code to a photographing device/terminal apparatus, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, such as a USB flash drive, a removable hard disk, a magnetic disk or an optical disk. In some jurisdictions, in accordance with legislation and patent practice, computer-readable media may not include electrical carrier signals and telecommunications signals.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts not detailed or illustrated in a particular embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/network device and method may be implemented in other manners. For example, the apparatus/network device embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical functional division, and there may be additional divisions in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims (9)

1. A method for analyzing a key region of model prediction, comprising:
obtaining a prediction model;
acquiring sample data, carrying out a covering experiment on the sample data based on the prediction model, and respectively acquiring output data of the prediction model before the covering experiment is carried out on the sample data and output data of the prediction model after the covering experiment is carried out on the sample data;
calculating an importance index value of a covering region according to output data of a prediction model before the covering experiment on the sample data and output data of a prediction model after the covering experiment on the sample data;
drawing a graph according to the importance index value to obtain a key region of model prediction;
wherein calculating the importance index value of the covering region according to the output data of the prediction model before the covering experiment on the sample data and the output data of the prediction model after the covering experiment on the sample data comprises:
obtaining a true regression value corresponding to the sample data;
taking output data of a prediction model before a covering experiment on sample data as first output data;
taking output data of the prediction model after the covering experiment is carried out on the sample data as second output data;
calculating an absolute value of a difference between the regression value and the first output data as a first absolute value;
calculating an absolute value of a difference between the regression value and the second output data as a second absolute value;
calculating the sum of the second absolute value and the disturbance parameter, and calculating the quotient of the first absolute value and the sum as an importance index value of the covering area; wherein the perturbation parameter is greater than 0.
2. The method for analyzing a key region predicted by a model according to claim 1, wherein the prediction model is a deep learning model;
the obtaining the prediction model includes:
building a preset deep learning model;
training parameters of the deep learning model according to the data of the training set;
verifying and selecting the hyperparameters of the trained deep learning model according to the validation set to obtain a target deep learning model as the prediction model; wherein the deep learning model includes at least one of a convolutional neural network and a recurrent neural network.
3. The method for analyzing a critical area predicted by a model according to claim 1, wherein the steps of obtaining sample data, performing a masking experiment on the sample data based on the prediction model, and obtaining output data of the prediction model before the masking experiment on the sample data and output data of the prediction model after the masking experiment on the sample data, respectively, comprise:
acquiring sample data meeting preset conditions from a test set;
establishing a corresponding coordinate system according to the dimension of the sample data;
and establishing a covering window with a corresponding size according to the type of the coordinate system, sliding the covering window, covering the sample data, and respectively obtaining output data of the prediction model before the covering experiment of the sample data and output data of the prediction model after the covering experiment of the sample data.
4. A method of analyzing a critical area predicted by a model according to claim 3, wherein the establishing a corresponding coordinate system according to the dimension of the sample data comprises:
if the sample data is one-dimensional sequence information, a one-dimensional coordinate system is established along the sequence direction, and a starting point in the one-dimensional coordinate system is a sequence starting point position;
if the sample data is a two-dimensional matrix, respectively establishing a two-dimensional coordinate system along two dimensions of the two-dimensional matrix, wherein a starting point in the two-dimensional coordinate system is an intersection point of two coordinate axes;
if the sample data is a three-dimensional tensor, a three-dimensional coordinate system is established on a three-dimensional space, and the starting point in the three-dimensional coordinate system is the intersection point of three coordinate axes.
5. The method for analyzing a key region of model prediction according to claim 1, wherein the drawing the map according to the importance index value to obtain the key region of model prediction comprises:
calculating an importance index value of a region corresponding to any coordinate in a coordinate system;
drawing a heat map according to any coordinate and the importance index value of the region corresponding to the coordinate;
and acquiring a pixel region with a brightness value higher than a preset brightness threshold in the heat map as the key region of model prediction.
6. The method for analyzing a key region of model prediction according to claim 5, wherein before the step of acquiring a pixel region with a brightness value higher than a preset brightness threshold in the heat map as the key region of model prediction, the method further comprises:
superimposing the heat map on the original data to obtain a visual display diagram of the key region.
7. An analysis device for a key region of model prediction, comprising:
the acquisition module is used for acquiring the prediction model;
the experiment module is used for acquiring sample data, carrying out a covering experiment on the sample data based on the prediction model, and respectively acquiring output data of the prediction model before the covering experiment on the sample data and output data of the prediction model after the covering experiment on the sample data;
the computing module is used for computing importance index values of the covering areas according to the output data of the prediction model before the covering experiment on the sample data and the output data of the prediction model after the covering experiment on the sample data;
the output module is used for drawing a graph according to the importance index value to obtain a key area of model prediction;
the computing module is specifically configured to:
obtaining a true regression value corresponding to the sample data;
taking output data of a prediction model before a covering experiment on sample data as first output data;
taking output data of the prediction model after the covering experiment is carried out on the sample data as second output data;
calculating an absolute value of a difference between the regression value and the first output data as a first absolute value;
calculating an absolute value of a difference between the regression value and the second output data as a second absolute value;
calculating the sum of the second absolute value and the disturbance parameter, and calculating the quotient of the first absolute value and the sum as an importance index value of the covering area; wherein the perturbation parameter is greater than 0.
8. A server comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 6 when executing the computer program.
9. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the method according to any one of claims 1 to 6.
CN201911268037.1A 2019-12-11 2019-12-11 Analysis method and device for key areas of model prediction Active CN111161789B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911268037.1A CN111161789B (en) 2019-12-11 2019-12-11 Analysis method and device for key areas of model prediction

Publications (2)

Publication Number Publication Date
CN111161789A CN111161789A (en) 2020-05-15
CN111161789B true CN111161789B (en) 2023-10-31

Family

ID=70556719

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911268037.1A Active CN111161789B (en) 2019-12-11 2019-12-11 Analysis method and device for key areas of model prediction

Country Status (1)

Country Link
CN (1) CN111161789B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112633280B (en) * 2020-12-31 2023-01-31 西北大学 Countermeasure sample generation method and system
CN112801465B (en) * 2021-01-08 2024-03-01 上海画龙信息科技有限公司 Method and device for predicting product index through interactive modeling and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109875546A (en) * 2019-01-24 2019-06-14 西安交通大学 A kind of depth model classification results method for visualizing towards ECG data
CN109934226A (en) * 2019-03-13 2019-06-25 厦门美图之家科技有限公司 Key area determines method, apparatus and computer readable storage medium
CN110046654A (en) * 2019-03-25 2019-07-23 东软集团股份有限公司 A kind of method, apparatus and relevant device of identification classification influence factor

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Vitali Petsiuk等.RISE: Randomized Input Sampling for Explanation of Black-box Models.《arXiv》.2018,第1-17页. *

Also Published As

Publication number Publication date
CN111161789A (en) 2020-05-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant