CN118098377A - Biological safety database management method and system - Google Patents

Publication number
CN118098377A
Authority
CN
China
Prior art keywords
data
biosafety
target
biological safety
target class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410487657.9A
Other languages
Chinese (zh)
Inventor
肖娜
赵超
马军海
张兮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN202410487657.9A
Publication of CN118098377A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a biosafety database management method and system. Biosafety target data are classified to obtain biosafety target class data, from which action mechanism key information fragments are acquired. The fragments are loaded into a biological reaction unit path structured model to obtain a biosafety target class data reaction unit path. Path events are extracted from this path and checked against biosafety risk flag events to obtain biosafety target class first feature data. The first feature data are combined with biosafety target class second feature data in a security risk diagnosis model to obtain a biosafety target data risk tag, and a management mechanism is matched to the biosafety target data based on that tag. This enables more efficient and accurate detection and evaluation of biosafety data and targeted matching of management decisions.

Description

Biological safety database management method and system
Technical Field
The invention relates to the technical field of biological information, in particular to a biological safety database management method and system.
Background
Traditional biosafety data management methods rely mainly on manual collection and classification of data, which is inefficient and error-prone and cannot cope with large-scale data processing. Although some systems attempt to recommend management decisions from biological data, they do not adequately consider the factors that affect the choice of decision: because the underlying biological action mechanism is insufficiently examined and feature recognition is inadequate, the characteristics and regularities of the data are not captured accurately, so a management mechanism cannot be matched effectively or accurately. With the current explosive growth in the volume of biosafety data, a new method is needed for accurate and efficient management of biosafety data.
Disclosure of Invention
(1) Technical problem to be solved
The invention aims to provide a biosafety database management method and system that take the underlying biological action mechanism and in-depth feature analysis into account to detect and evaluate biosafety data efficiently and accurately, so that a management mechanism can be matched efficiently and accurately.
(2) Technical solution
To achieve the above object, in one aspect, the present invention provides a biosafety database management method, the method comprising:
acquiring biosafety target data, the biosafety target data comprising biosafety target elements;
classifying the biosafety target data to obtain biosafety target class data;
acquiring action mechanism key information fragments of the biosafety target class data according to the biosafety target class data, and loading the action mechanism key information fragments into a biological reaction unit path structured model to obtain a biosafety target class data reaction unit path;
extracting biosafety target class data reaction unit path events from the biosafety target class data reaction unit path, and performing biosafety risk flag event detection on the path events to obtain biosafety target class first feature data;
acquiring biosafety target class second feature data, inputting the biosafety target class first feature data and second feature data into a security risk diagnosis model to obtain a biosafety target data risk tag, and matching a management mechanism to the biosafety target data based on the biosafety target data risk tag.
In a possible implementation manner of the first aspect, the classifying the biosafety target data to obtain biosafety target class data includes:
performing data cleaning on the biosafety target data to obtain biosafety standardized text data;
vectorizing the biosafety standardized text data with TF-IDF to obtain biosafety vectorized text features;
classifying the biosafety vectorized text features with an SVM algorithm to obtain the biosafety target class data.
In a possible implementation manner of the first aspect, the acquiring the action mechanism text information of the biosafety target class data, loading the action mechanism text information of the biosafety target class data into the biological reaction unit path structured model to obtain the biological safety target class data reaction unit path, includes:
defining biological safety target category data as biological safety target key word groups, and searching and obtaining literature text in a biological safety literature database according to the biological safety target key word groups;
Identifying key entities in a document text by using NER named entity identification technology, and processing the text by dependency syntax analysis to extract action mechanism key information fragments of biological safety target class data;
And loading the action mechanism key information fragment of the biological safety target class data into the biological reaction unit path structural model to obtain a biological safety target class data reaction unit path.
In a possible implementation manner of the first aspect, the biological reaction unit path structured model is a data structure capable of representing biological reaction units and paths thereof, including biological reaction unit nodes and path edges;
The loading the action mechanism key information fragment of the biosafety target class data to the biological reaction unit path structured model to obtain a biological safety target class data reaction unit path comprises the following steps:
identifying an entity in the action mechanism key information of the biological safety target class data and mapping the entity into a biological reaction unit node in the biological reaction unit path structural model;
Mapping the action relation between entities in the action mechanism key information of the biological safety target class data to path edges in the biological reaction unit path structural model;
and constructing a reaction unit path of the biosafety target class according to the nodes and the path edges of the biological reaction unit mapped into the model.
In a possible implementation manner of the first aspect, the biosafety target class data reaction unit path event is composed of two neighboring biosafety target class data reaction unit nodes and path edges thereof;
The biosafety risk mark event is a biosafety risk event list preset according to historical experience data;
the step of extracting the biosafety target class data reaction unit path event from the biosafety target class data reaction unit path, and detecting the biosafety risk mark event to the biosafety target class data reaction unit path event to obtain biosafety target class first feature data comprises the steps of:
acquiring a biological safety target class data reaction unit path, and identifying all biological reaction unit nodes and path edges in the biological safety target class data reaction unit path;
Extracting path edges between adjacent biological reaction unit nodes to generate biological safety target class data reaction unit path events;
Matching the biological safety target type data reaction unit path event with the biological safety risk mark event to obtain a risk value corresponding to the biological safety target type data reaction unit path event;
and summarizing risk values corresponding to the biological safety target class data reaction unit path events to obtain first characteristic data of the biological safety target class.
In a possible implementation manner of the first aspect, the acquiring the second feature of the biosafety target data, inputting the first feature data and the second feature data of the biosafety target class into a security risk diagnosis model to obtain a biosafety target data risk tag, and matching a management mechanism to the biosafety target data based on the biosafety target data risk tag includes:
Acquiring a second characteristic of the biosafety target data;
normalizing the numerical features in the biosafety target class first feature data and second feature data, and one-hot encoding the categorical features in them, to obtain a biosafety target class feature set;
loading the biological safety target class feature set to an input layer of a safety risk diagnosis model, and acquiring a biological safety target data risk tag from an output layer of the safety risk diagnosis model after the biological safety target class feature set is transmitted forwards through a hidden layer of the safety risk diagnosis model;
and matching a management mechanism to the biosafety target data based on the biosafety target data risk tag.
In a possible implementation manner of the first aspect, the second features of the biosafety target data are acquired by retrieving a knowledge base of basic features of the biosafety target data, and include gene sequence features, expression features, protein functional features, and ecological environment features.
In a possible implementation manner of the first aspect, the security risk diagnosis model is a model that is built based on a neural network algorithm and evaluates biosafety target data, and the security risk diagnosis model framework includes:
taking the biosafety target class feature set as input of an input layer;
setting the activation function of each hidden layer (formula not reproduced in this text);
taking the final layer as the output layer, the output layer having one node per evaluation dimension, each node outputting the evaluation value of its dimension (output formula not reproduced in this text);
compiling the model with a cross-entropy loss function and fitting the weight matrices and bias vectors by gradient descent;
wherein the input of the input layer is the biosafety target class feature set, and each layer has its own weight matrix and bias vector.
In a possible implementation manner of the first aspect, the dimensions of the biosafety target data risk tag are environmental ecological safety, human health safety, animal and plant safety, and agricultural food safety; before the matching of a management mechanism to the biosafety target data based on the biosafety target data risk tag, the biosafety database management method further comprises:
Comparing the biological safety target data risk tag data with a set threshold value, and triggering a corresponding decision when the biological safety target data risk tag data is larger than the set threshold value;
Mapping each dimension value of the biological safety target data risk label to each dimension axis of the plane;
Connecting adjacent data points to form a closed polygon, and generating a biological safety target data safety characteristic radar chart;
extracting shape features of the biosafety target data safety feature radar chart by using an image processing library;
and matching a specific decision according to the extracted shape features of the biosafety target data safety feature radar chart.
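The threshold comparison and radar chart steps above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the threshold of 0.5 is hypothetical, the risk tag values are taken from the worked example in the description, and the shape features (area and perimeter) are computed directly from the polygon geometry rather than with an image processing library as the text describes. The rule that maps shape features to a decision is not specified in the source and is not modelled here.

```python
import math

# The four risk tag dimensions named in the text.
DIMENSIONS = [
    "environmental ecological safety",
    "human health safety",
    "animal and plant safety",
    "agricultural food safety",
]


def exceeded(risk_tag, threshold):
    """Return the dimensions whose risk value is larger than the threshold."""
    return [name for name, value in zip(DIMENSIONS, risk_tag) if value > threshold]


def radar_vertices(values):
    """Place each value on its own axis, with the axes evenly spaced around the origin."""
    n = len(values)
    return [
        (v * math.cos(2 * math.pi * i / n), v * math.sin(2 * math.pi * i / n))
        for i, v in enumerate(values)
    ]


def polygon_area(vertices):
    """Shoelace formula for the closed polygon joining adjacent data points."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def polygon_perimeter(vertices):
    """Sum of the edge lengths of the closed polygon."""
    return sum(math.dist(p, q) for p, q in zip(vertices, vertices[1:] + vertices[:1]))


risk_tag = [0.92, 0.23, 0.35, 0.17]  # example values from the description
THRESHOLD = 0.5                       # hypothetical threshold
flagged = exceeded(risk_tag, THRESHOLD)
vertices = radar_vertices(risk_tag)
area = polygon_area(vertices)
perimeter = polygon_perimeter(vertices)
```

With four axes the polygon's diagonals lie on the axes, so its area can be checked by hand as half the product of the two diagonal lengths.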
Based on the same inventive concept, the invention also provides a biosafety database management system, the system comprising:
a data acquisition module for acquiring biosafety target data, the biosafety target data comprising biosafety target elements;
The data classification module is used for classifying the biosafety target data to obtain biosafety target class data;
The path fitting module is used for acquiring action mechanism key information fragments of the biological safety target class data according to the biological safety target class data, and loading the action mechanism key information fragments of the biological safety target class data into the biological reaction unit path structural model to obtain a biological safety target class data reaction unit path;
the event detection module is used for extracting a biosafety target class data reaction unit path event from the biosafety target class data reaction unit path, and detecting a biosafety risk mark event on the biosafety target class data reaction unit path event to obtain biosafety target class first characteristic data;
The diagnosis decision module is used for acquiring second characteristics of the biological safety target class, inputting the first characteristic data and the second characteristic data of the biological safety target class into the safety risk diagnosis model to obtain a biological safety target data risk tag, and matching a management mechanism for the biological safety target data based on the biological safety target data risk tag.
(3) Advantageous effects
The beneficial effects of the invention are as follows: the biological safety target data are classified to obtain action mechanism key information fragments of biological safety target class data, the action mechanism key information fragments of the biological safety target class data are loaded to a biological reaction unit path structured model to extract path events, the biological safety data are analyzed from the action mechanism level, and the biological safety sensitivity is enhanced; the safety risk diagnosis model is combined with the first characteristic data and the second characteristic of the biological safety target class, so that a multi-dimensional biological safety target data risk label can be automatically generated, and the efficiency and the accuracy of safety assessment are improved; by further extracting dimension shape characteristics of the risk tag and matching with specific decisions, personalized management of biological safety data is realized, and pertinence and effectiveness of management decisions are improved; therefore, the more efficient and accurate biosafety data management is realized on the whole, and the biosafety decision level is improved.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for managing a biosafety database according to embodiment 1 of the invention;
FIG. 2 is a diagram showing an example of a biosafety target class data reaction unit path according to embodiment 1 of the invention;
FIG. 3 is a block diagram of a biosafety database management system according to embodiment 2 of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1: as shown in fig. 1, the present embodiment provides a biosafety database management method, which includes:
S10, acquiring biosafety target data, wherein the biosafety target data comprises biosafety target elements; the biosafety target data is data containing information on the gene sequences, protein structures, metabolic pathways, biosynthesis pathways of enzymatically active substances, and the like of bacteria, viruses, or other microorganisms. For example, when the acquired biosafety target data concerns the biosynthesis pathways by which Staphylococcus aureus resists antibiotics, the biosafety target element it contains is Staphylococcus aureus; when the acquired biosafety target data concerns the lactose metabolic pathway of Escherichia coli, the biosafety target element it contains is Escherichia coli;
S20, classifying the biosafety target data to obtain biosafety target class data; the biosafety target data covers multiple categories of biosafety target elements, such as enzyme-producing bacteria, symbiotic bacteria, pathogenic microorganisms, anaerobes, and aerobes, and each category has its own characteristics and commonalities: for example, enzyme-producing bacteria secrete enzyme substances, and anaerobes are microorganisms that can grow only in an anaerobic environment. Identifying and classifying the type of microorganism therefore helps the subsequent in-depth analysis of the biosafety target class data to be more accurate;
S30, acquiring action mechanism key information fragments of the biosafety target class data according to the biosafety target class data, and loading the action mechanism key information fragments into a biological reaction unit path structured model to obtain a biosafety target class data reaction unit path; for example, if the biosafety target class data contains the biosafety target element Staphylococcus aureus, the action mechanism key information fragment may be identified as the drug-resistance mechanism of Staphylococcus aureus, namely that Staphylococcus aureus can produce beta-lactamase, which hydrolyzes beta-lactam antibiotics; this fragment yields a reaction unit path for the drug resistance of Staphylococcus aureus;
S40, extracting a biosafety target class data reaction unit path event from the biosafety target class data reaction unit path, and detecting a biosafety risk mark event on the biosafety target class data reaction unit path event to obtain biosafety target class first characteristic data; after the biosafety target class data reaction unit path is obtained, the biosafety target class data reaction unit path event is extracted according to the manner described in steps S401 to S404, and the biosafety target class first feature data is obtained by matching with the biosafety risk flag event.
S50, acquiring biosafety target class second feature data, inputting the biosafety target class first feature data and second feature data into a security risk diagnosis model to obtain a biosafety target data risk tag, and matching a management mechanism to the biosafety target data based on the biosafety target data risk tag. The security risk diagnosis model is a neural-network-based model for evaluating biosafety target data: the first feature data and second feature data enter through the input layer, are propagated forward through the hidden layers, and leave the output layer as a multidimensional set of values. Each value is the security evaluation value for one set dimension, and together they form the biosafety target data risk tag. After the risk tag is further analyzed in steps S5031 to S5035, management decisions can be matched in a targeted manner.
In one possible implementation manner, the classifying the biosafety target data to obtain biosafety target class data includes:
S201, performing data cleaning on the biosafety target data to obtain biosafety standardized text data; cleaning the original biosafety target data removes noise, errors, and unnecessary information and unifies the format and structure of the data, for example by removing words with no practical meaning such as "and" and "but", so that the data can be used in the later steps;
S202, vectorizing the biosafety standardized text data with TF-IDF to obtain biosafety vectorized text features; TF-IDF is a commonly used text feature extraction method that converts the biosafety standardized text data into numerical feature vectors. For example, if words such as "bacteria", "antibiotics", and "enzymes" appear in the biosafety standardized text data, TF-IDF can be used to calculate the importance of these words in the text and convert them into a numerical vector that subsequent algorithms can process;
S203, classifying the biosafety vectorized text features with an SVM (support vector machine) algorithm to obtain the biosafety target class data. The SVM assigns the text data to different categories. For example, when the biosafety vectorized text features mainly reflect information such as bacteria, antibiotics, and enzyme substances, the data are classified into the drug-resistant organism category, giving biosafety target class data that cover drug-resistant bacteria, antibiotics, enzyme substances, and the like. This category identification helps the action mechanism text information of the biosafety target class data to be acquired efficiently and accurately in the subsequent steps.
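As an illustration of steps S201 and S202, the following Python sketch cleans two short texts and converts them into TF-IDF vectors using only the standard library. It is a minimal sketch, not the patent's implementation: the stop-word list, the sample sentences, and the smoothed IDF variant are all assumptions. The SVM classification of step S203 is left out; in practice the vectors could be passed to an existing SVM implementation such as scikit-learn's `sklearn.svm.SVC`.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the text only names "and" and "but" as examples.
STOP_WORDS = {"and", "but", "the", "a", "of", "to", "is"}


def clean(text):
    """S201: lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf(documents):
    """S202: turn cleaned token lists into TF-IDF vectors over a shared vocabulary."""
    vocab = sorted({t for doc in documents for t in doc})
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for doc in documents for t in set(doc))
    # Smoothed IDF, one common variant (an assumption; the text gives no formula).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


# Hypothetical sample texts.
texts = [
    "The bacteria produce enzymes and resist antibiotics",
    "Anaerobic bacteria grow without oxygen",
]
vocab, vectors = tfidf([clean(t) for t in texts])
```

A term that occurs in every document, such as "bacteria" here, receives a lower weight than a term that occurs in only one, which is the behaviour the classifier relies on to tell categories apart.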
In one possible implementation manner, the acquiring the action mechanism text information of the biosafety target class data, loading the action mechanism text information of the biosafety target class data into the biological reaction unit path structural model to obtain the biological safety target class data reaction unit path, and includes:
S301, defining the biosafety target class data as biosafety target keyword groups, and searching a biosafety literature database with the keyword groups to obtain literature text; the biosafety literature database may be the literature held in the database itself or an external database such as PubMed, Web of Science, or Scopus;
S302, identifying key entities in a document text by using NER named entity identification technology, and processing the text by dependency syntax analysis to extract action mechanism key information fragments of biological safety target class data;
S303, loading the action mechanism key information fragment of the biological safety target class data to a biological reaction unit path structural model to obtain a biological safety target class data reaction unit path.
In one possible embodiment, the biological reaction unit path structured model is a data structure capable of representing biological reaction units and paths thereof, including biological reaction unit nodes and path edges;
the loading the key information fragment of the action mechanism of the biosafety target class data into the biological reaction unit path structural model to obtain a biosafety target class data reaction unit path, wherein the biosafety target class data reaction unit path is shown in fig. 2 and comprises:
S3031, identifying an entity in the action mechanism key information of the biological safety target class data and mapping the entity into a biological reaction unit node in the biological reaction unit path structural model;
S3032, mapping the action relation between entities in the action mechanism key information of the biological safety target class data to path edges in the biological reaction unit path structural model;
S3033, constructing a reaction unit path of the biosafety target class according to the biological reaction unit nodes and the path edges mapped into the model.
For example: the action mechanism key information fragment of the biosafety target class data describes the resistance mechanism of pathogenic bacterium X to antibiotic Y, namely that the antibiotic-destroying enzyme Z produced by pathogenic bacterium X can destroy antibiotic Y. The entities pathogenic bacterium X, antibiotic Y, and antibiotic-destroying enzyme Z are identified in the action mechanism key information and mapped to biological reaction unit nodes; "pathogenic bacterium X produces antibiotic-destroying enzyme Z" is mapped to a first path edge, and "antibiotic-destroying enzyme Z destroys antibiotic Y" is mapped to a second path edge. Loading the action mechanism key information fragment into the biological reaction unit path structured model thus yields the biosafety target class data reaction unit path: pathogenic bacterium X → produces → antibiotic-destroying enzyme Z → destroys → antibiotic Y.
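One way to picture the path structured model of steps S3031 to S3033 is as a small directed graph whose edges carry a relation label. The Python sketch below is illustrative only: the class name `ReactionPath` and its methods are hypothetical, the entity and relation strings are assumed to have already been extracted by the NER and dependency parsing step, and the traversal assumes each node has at most one outgoing edge.

```python
from dataclasses import dataclass, field


@dataclass
class ReactionPath:
    """Directed graph: nodes are biological reaction units, edges carry a relation label."""
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)  # (source, relation, target) triples

    def add_relation(self, source, relation, target):
        # S3031: map both entities to biological reaction unit nodes.
        self.nodes.update((source, target))
        # S3032: map the action relation between them to a labelled path edge.
        self.edges.append((source, relation, target))

    def as_sequence(self, start):
        """S3033: follow the edges from `start` and return node, relation, node, ..."""
        outgoing = {source: (relation, target) for source, relation, target in self.edges}
        sequence, current, visited = [start], start, {start}
        while current in outgoing:
            relation, current = outgoing[current]
            sequence += [relation, current]
            if current in visited:  # stop if the path loops back on itself
                break
            visited.add(current)
        return sequence


path = ReactionPath()
path.add_relation("pathogenic bacterium X", "produces", "antibiotic-destroying enzyme Z")
path.add_relation("antibiotic-destroying enzyme Z", "destroys", "antibiotic Y")
sequence = path.as_sequence("pathogenic bacterium X")
```

Storing edges as triples keeps each edge together with its two adjacent nodes, which is the same unit the later path event extraction works on.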
In one possible embodiment, the biosafety target class data reaction unit path event is composed of two bioreaction unit nodes adjacent to each other and path edges thereof;
the biosafety risk mark event is a biosafety risk event list set according to experience and historical risk event record data;
the step of extracting the biosafety target class data reaction unit path event from the biosafety target class data reaction unit path, and detecting the biosafety risk mark event to the biosafety target class data reaction unit path event to obtain biosafety target class first feature data comprises the steps of:
S401, acquiring a biosafety target class data reaction unit path, and identifying all biological reaction unit nodes and path edges in the biosafety target class data reaction unit path;
S402, extracting path edges between adjacent biological reaction unit nodes to generate biosafety target class data reaction unit path events;
For example: biosafety target class data reaction unit path event 1: pathogenic bacterium X → produces → antibiotic-destroying enzyme Z;
biosafety target class data reaction unit path event 2: antibiotic-destroying enzyme Z → destroys → antibiotic Y;
biosafety target class data reaction unit path event N: biological reaction unit node A → path edge → biological reaction unit node B.
S403, matching the biological safety target type data reaction unit path event with the biological safety risk mark event to obtain a risk value corresponding to the biological safety target type data reaction unit path event;
S404, summing the risk values corresponding to the biosafety target class data reaction unit path events to obtain the first feature data of the biosafety target class.
For example: if event A (pathogenic bacterium X → produces → antibiotic-destroying enzyme Z) matches no biosafety risk flag event, event A carries no safety risk and its risk value is 0; if event B (antibiotic-destroying enzyme Z → destroys → antibiotic Y) matches a biosafety risk flag event, event B carries a safety risk and its risk value is 1. When a reaction unit path has M path events, of which N match biosafety risk flag events, the biosafety target class first feature data is N.
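Steps S401 to S404 can be sketched in a few lines of Python. The sketch assumes that a path event is represented as a (node, edge, node) triple and that matching against the preset risk flag event list is an exact comparison; the source does not say how matching is performed, so this is only one possible reading, and the contents of `RISK_FLAG_EVENTS` are hypothetical.

```python
# Reaction unit path events as (source node, path edge, target node) triples (S401, S402).
path_events = [
    ("pathogenic bacterium X", "produces", "antibiotic-destroying enzyme Z"),
    ("antibiotic-destroying enzyme Z", "destroys", "antibiotic Y"),
]

# Hypothetical preset list of biosafety risk flag events.
RISK_FLAG_EVENTS = {
    ("antibiotic-destroying enzyme Z", "destroys", "antibiotic Y"),
}


def risk_values(events, flag_events):
    """S403: risk value 1 for an event found in the flag list, otherwise 0."""
    return [1 if event in flag_events else 0 for event in events]


def first_feature(events, flag_events):
    """S404: sum the risk values, giving N matched events out of M."""
    return sum(risk_values(events, flag_events))


values = risk_values(path_events, RISK_FLAG_EVENTS)
n_matched = first_feature(path_events, RISK_FLAG_EVENTS)
```

Here event A receives risk value 0 and event B receives risk value 1, so the first feature data for this two-event path is 1.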
In one possible implementation manner, the acquiring the second feature of the biosafety target data, inputting the first feature data and the second feature data of the biosafety target category into a security risk diagnosis model to obtain a biosafety target data risk tag, and matching a management mechanism to the biosafety target data based on the biosafety target data risk tag includes:
S501, acquiring a second characteristic of biosafety target data;
S502, normalizing the numerical features in the biosafety target class first feature data and second feature data, and one-hot encoding the categorical features in them, to obtain a biosafety target class feature set; this preprocessing converts the features into a format that the security risk diagnosis model can accept. Normalization uses the min-max method, $x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$. Categorical features are one-hot encoded: for example, the pathogenic gene item of the gene sequence features receives one code when a pathogenic gene is present and another when it is absent, and the horizontal gene transfer potential item of the ecological environment features receives one code each for high, medium, and low (code values not reproduced in this text);
S503, loading the biological safety target class feature set to an input layer of a safety risk diagnosis model, and acquiring a biological safety target data risk tag from an output layer of the safety risk diagnosis model after the biological safety target class feature set is transmitted forwards through a hidden layer of the safety risk diagnosis model;
S504, matching a management mechanism to the biosafety target data based on the biosafety target data risk tag.
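The preprocessing in step S502 can be illustrated with the short Python sketch below. The numbers 11, 0, and 20 come from the worked example later in the description; the category order used for one-hot encoding is an assumption, because the patent's own code values are not preserved in this text.

```python
def min_max(value, minimum, maximum):
    """Min-max normalization: (x - min) / (max - min), mapped into [0, 1]."""
    if maximum == minimum:
        return 0.0  # degenerate range; chosen here as a convention
    return (value - minimum) / (maximum - minimum)


def one_hot(category, categories):
    """One-hot encode `category` against an ordered list of possible categories."""
    return [1 if category == c else 0 for c in categories]


# First feature data: 11 matched events, with a range of 0 to 20 events.
first_feature_norm = min_max(11, 0, 20)

# Hypothetical category order for the horizontal gene transfer potential.
transfer_potential = one_hot("medium", ["high", "medium", "low"])

# Numerical and encoded categorical features are concatenated into one feature set.
feature_set = [first_feature_norm] + transfer_potential
```

Concatenating the normalized values and the one-hot codes gives a flat numeric vector, which is the form a neural network input layer expects.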
In one possible implementation manner, the second features of the biosafety target data are acquired by retrieving a knowledge base of basic features of the biosafety target data, and include gene sequence features, expression features, protein functional features, and ecological environment features. For example: the gene sequence features include promoter sequence efficiency, nucleotide GC proportion, and pathogenic genes; the expression features include protein expression level and expression stability; the protein functional features include active sites and signal peptide transport efficiency; the ecological environment features include the decomposition rate in water and the horizontal gene transfer potential.
In one possible embodiment, the security risk diagnosis model is a model for evaluating biosafety target data built based on a neural network algorithm, and the security risk diagnosis model framework includes:
taking the biosafety target class feature set as input of an input layer;
the $l$-th layer is a hidden layer, for which an activation function is set;
the $L$-th layer is the output layer, which has $K$ nodes that respectively output the evaluation values of the $K$ dimensions, the output of the $k$-th node being $Y_k$;
compiling the model with a cross-entropy loss function, and fitting the parameters $W^{(l)}$ and $b^{(l)}$ by gradient descent;
where $X$ is the biosafety target class feature set at the input layer, $W^{(l)}$ is the weight matrix of layer $l$, and $b^{(l)}$ is the bias vector of layer $l$.
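A forward pass through such a model can be sketched in plain Python as below. The activation functions are not recoverable from this text, so the choices here are assumptions: ReLU in the hidden layers and an independent sigmoid on each output node (the four probabilities in the worked example do not sum to 1, which is at least compatible with per-node sigmoids). The layer sizes, random initialization, and all function names are likewise hypothetical, and the gradient descent fitting step is omitted.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dense(x: Vector, W: Matrix, b: Vector) -> Vector:
    """Affine map W x + b for a single layer."""
    return [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]


def forward(x: Vector, layers: List[Tuple[Matrix, Vector]]) -> Vector:
    """Propagate x through every (W, b) pair.

    Assumed activations: ReLU for hidden layers, sigmoid for the output layer.
    """
    a = x
    for i, (W, b) in enumerate(layers):
        z = dense(a, W, b)
        is_output = i == len(layers) - 1
        a = [sigmoid(v) for v in z] if is_output else [max(0.0, v) for v in z]
    return a


def binary_cross_entropy(y_true: Vector, y_pred: Vector) -> float:
    """Mean cross-entropy over independent output nodes."""
    eps = 1e-12  # guards against log(0)
    total = sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for t, p in zip(y_true, y_pred))
    return -total / len(y_true)


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Tuple[Matrix, Vector]:
    """Small random weights and zero biases (hypothetical initialization)."""
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


rng = random.Random(0)
layers = [init_layer(6, 8, rng), init_layer(8, 4, rng)]  # one hidden layer, K = 4 outputs
scores = forward([0.55, 1, 0, 0, 1, 0], layers)          # untrained, so values are arbitrary
```

With untrained weights the scores carry no meaning; the sketch only shows how a feature set flows from the input layer to the K output nodes and how a cross-entropy loss would be evaluated on them.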
For example: the biological element is a transformed bacterium and the number of reaction unit path events M is 20. Matching these events against the biosafety risk marker events yields the risk value corresponding to each path event; 11 of the events match a biosafety risk marker event, so the first feature data of the biosafety target class is 11. The minimum number of path events for the bacterial reaction unit is 0 and the maximum is 20, so min-max normalization gives $(11 - 0)/(20 - 0) = 0.55$, that is, the normalized first feature data of the biosafety target class is 0.55. The second feature of the biosafety target data is obtained in the same way. Its numerical features are normalized by min-max normalization, giving, for example, a promoter sequence efficiency of 0.71 and a nucleotide GC content of 0.39 in the gene sequence features, a protein expression level of 0.95 in the expression features, an active site value of 0.87 and a signal peptide transport efficiency of 0.9 in the protein function features, and a decomposition rate in water of 0.25 in the ecological environment features. Its categorical features are then one-hot encoded, namely the pathogenic gene in the gene sequence features, the expression stability in the expression features, and the horizontal gene transfer potential in the ecological environment features. The resulting feature set is input to the security risk diagnosis model, which evaluates four dimensions (environmental ecological safety, human health safety, animal and plant safety, and agricultural food safety) and outputs the biosafety target data risk tag $Y_1 = 0.92$, $Y_2 = 0.23$, $Y_3 = 0.35$, $Y_4 = 0.17$: the probability of an impact on environmental ecological safety is 0.92, on human health safety 0.23, on animal and plant safety 0.35, and on agricultural food safety 0.17.
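The way the first feature data in this example arises can be sketched as follows. A path event is taken, as in the method, to be two adjacent reaction unit nodes together with the edge between them. The node and relation names are synthetic placeholders, and assigning a risk value of 1 to each matched event is an assumption chosen so that 11 matches give a feature value of 11, as in the example.

```python
from typing import List, Set, Tuple

# (source node, relation on the path edge, target node)
PathEvent = Tuple[str, str, str]


def extract_path_events(nodes: List[str], relations: List[str]) -> List[PathEvent]:
    """Build one event per pair of adjacent nodes along a path."""
    if len(relations) != len(nodes) - 1:
        raise ValueError("a path with n nodes needs n - 1 relations")
    return [(nodes[i], relations[i], nodes[i + 1]) for i in range(len(relations))]


def first_feature(events: List[PathEvent], risk_events: Set[PathEvent]) -> int:
    """Sum the risk values of all events (assumed: 1 if flagged, else 0)."""
    return sum(1 if event in risk_events else 0 for event in events)


# Synthetic path with 21 nodes, hence M = 20 path events.
nodes = [f"unit_{i}" for i in range(21)]
relations = [f"rel_{i}" for i in range(20)]
events = extract_path_events(nodes, relations)

# Hypothetical preset list of risk marker events: the first 11 events are flagged.
risk_events = set(events[:11])

raw = first_feature(events, risk_events)
normalized = (raw - 0) / (20 - 0)  # min-max normalization with min 0, max 20
```

Storing the risk marker events in a set keeps each lookup constant-time, which matters once the preset list built from historical data grows large.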
In one possible embodiment, the dimensions of the biosafety target data risk tag are environmental ecological safety, human health safety, animal and plant safety, and agricultural food safety; before matching a management mechanism to the biosafety target data based on the biosafety target data risk tag, the biosafety database management method further comprises:
S5031, comparing the biological safety target data risk tag data with a set threshold, and triggering a corresponding decision when the biological safety target data risk tag data is larger than the set threshold;
S5032, mapping each dimension value of the biosafety target data risk label to each dimension axis of the plane;
S5033, connecting adjacent data points to form a closed polygon, and generating a biosafety target data safety feature radar chart;
S5034, extracting shape features of the biosafety target data safety feature radar chart using an image processing library; such libraries provide functions for shape features such as convexity, angle and distance distributions, contour length, and the area enclosed by a contour. For example, in OpenCV, contourArea computes the area enclosed by a contour, arcLength computes the contour perimeter, and convexHull finds the convex hull of a contour;
S5035, matching a specific decision according to the extracted shape features of the biosafety target data safety feature radar chart. Analyzing the shape of the polygon reveals how balanced the dimensions are: when the value of one dimension is particularly high, the point on the corresponding axis lies closer to the outer edge, so the polygon protrudes in the direction of that axis. For example, if the chart protrudes strongly in the "environmental ecological safety" direction, the organism poses a higher risk to environmental ecological safety than to the other dimensions, and management decisions focused on environmental ecological safety are needed.
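Steps S5031 to S5035 can be sketched as below. Instead of rendering an image and calling an image processing library, this sketch computes the polygon's area and perimeter directly from its vertices (shoelace formula and summed edge lengths), which is a deliberate substitution for illustration. The threshold of 0.5, the function names, and the rule of picking the dimension with the largest value are assumptions; the risk tag values are those of the worked example.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def exceeded(tag: Dict[str, float], threshold: float) -> List[str]:
    """S5031: dimensions whose value is larger than the threshold."""
    return [name for name, value in tag.items() if value > threshold]


def radar_vertices(values: List[float]) -> List[Point]:
    """S5032: place each value on its own axis, axes spaced evenly around the origin."""
    n = len(values)
    return [(v * math.cos(2 * math.pi * i / n), v * math.sin(2 * math.pi * i / n))
            for i, v in enumerate(values)]


def polygon_area(points: List[Point]) -> float:
    """S5033/S5034: area of the closed polygon via the shoelace formula."""
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def polygon_perimeter(points: List[Point]) -> float:
    """S5034: total length of the polygon's edges."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))


tag = {
    "environmental ecological safety": 0.92,
    "human health safety": 0.23,
    "animal and plant safety": 0.35,
    "agricultural food safety": 0.17,
}

flagged = exceeded(tag, 0.5)            # 0.5 is a hypothetical threshold
points = radar_vertices(list(tag.values()))
area = polygon_area(points)
perimeter = polygon_perimeter(points)
dominant = max(tag, key=tag.get)        # S5035: axis along which the polygon protrudes most
```

A small area combined with one long axis indicates risk concentrated in a single dimension, whereas a large, evenly shaped polygon indicates elevated risk across all dimensions; a decision table keyed on such features would complete step S5035.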
Example 2: based on the same inventive concept, as shown in fig. 3, the present embodiment further provides a biosafety database management system, the system including:
a data acquisition module for acquiring biosafety target data, the biosafety target data comprising biosafety target elements;
The data classification module is used for classifying the biosafety target data to obtain biosafety target class data;
The path fitting module is used for acquiring action mechanism key information fragments of the biological safety target class data according to the biological safety target class data, and loading the action mechanism key information fragments of the biological safety target class data into the biological reaction unit path structural model to obtain a biological safety target class data reaction unit path;
the event detection module is used for extracting a biosafety target class data reaction unit path event from the biosafety target class data reaction unit path, and detecting a biosafety risk mark event on the biosafety target class data reaction unit path event to obtain biosafety target class first characteristic data;
The diagnosis decision module is used for acquiring second characteristics of the biological safety target class, inputting the first characteristic data and the second characteristic data of the biological safety target class into the safety risk diagnosis model to obtain a biological safety target data risk tag, and matching a management mechanism for the biological safety target data based on the biological safety target data risk tag.
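The way the five modules chain together can be sketched as a simple pipeline. The class name, the callable signatures, and the stub stages are all hypothetical; each stage merely stands in for the corresponding module, and the matching of a management mechanism inside the diagnosis decision module is reduced here to returning the risk tag.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

PathEvent = Tuple[str, str, str]


@dataclass
class BiosafetyPipeline:
    """Runs the modules in the order given in this embodiment."""
    acquire: Callable[[], str]                          # data acquisition module
    classify: Callable[[str], str]                      # data classification module
    fit_path: Callable[[str], List[PathEvent]]          # path fitting module
    detect_events: Callable[[List[PathEvent]], float]   # event detection module
    diagnose: Callable[[str, float], Dict[str, float]]  # diagnosis decision module

    def run(self) -> Dict[str, float]:
        data = self.acquire()
        category = self.classify(data)
        path = self.fit_path(category)
        first_feature = self.detect_events(path)
        return self.diagnose(category, first_feature)


# Stub stages for illustration only; real modules would replace each lambda.
pipeline = BiosafetyPipeline(
    acquire=lambda: "transformed bacterium record",
    classify=lambda data: "bacteria",
    fit_path=lambda category: [("unit_a", "activates", "unit_b")],
    detect_events=lambda path: float(len(path)),
    diagnose=lambda category, feature: {"environmental ecological safety": feature},
)
risk_tag = pipeline.run()
```

Passing each module in as a callable keeps the stages independently replaceable, which mirrors the modular structure of the system.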
It should be noted that, regarding the system in the above embodiment, the specific manner in which the respective modules perform the operations has been described in detail in the embodiment regarding the method, and will not be described in detail herein.
Finally, it should be noted that: although the present invention has been described with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments described, or equivalents may be substituted for elements thereof, and any modifications, equivalents, improvements and changes may be made without departing from the spirit and principles of the present invention.

Claims (10)

1. A method of biosafety database management, the method comprising:
acquiring biosafety target data, the biosafety target data comprising biosafety target elements;
classifying the biosafety target data to obtain biosafety target class data;
Acquiring action mechanism key information fragments of the biological safety target class data according to the biological safety target class data, and loading the action mechanism key information fragments of the biological safety target class data into a biological reaction unit path structural model to obtain a biological safety target class data reaction unit path;
Extracting a biological safety target type data reaction unit path event from the biological safety target type data reaction unit path, and detecting a biological safety risk mark event on the biological safety target type data reaction unit path event to obtain biological safety target type first characteristic data;
acquiring a second feature of the biosafety target class, inputting the first feature data and the second feature data of the biosafety target class into a security risk diagnosis model to obtain a biosafety target data risk tag, and matching a management mechanism to the biosafety target data based on the biosafety target data risk tag.
2. The method for managing a biosafety database according to claim 1, wherein said classifying the biosafety target data to obtain biosafety target class data includes:
performing data cleaning on the biosafety target data to obtain biosafety standardized text data;
vectorizing the biosafety standardized text data with TF-IDF to obtain biosafety vectorized text features;
classifying the biosafety vectorized text features with an SVM algorithm to obtain the biosafety target class data.
3. The method for managing a biosafety database according to claim 1, wherein the step of obtaining the action mechanism text information of the biosafety target class data, and loading the action mechanism text information of the biosafety target class data into a biosafety target class data reaction unit path structuring model to obtain a biosafety target class data reaction unit path, comprises:
defining biological safety target category data as biological safety target key word groups, and searching and obtaining literature text in a biological safety literature database according to the biological safety target key word groups;
Identifying key entities in a document text by using NER named entity identification technology, and processing the text by dependency syntax analysis to extract action mechanism key information fragments of biological safety target class data;
And loading the action mechanism key information fragment of the biological safety target class data into the biological reaction unit path structural model to obtain a biological safety target class data reaction unit path.
4. A biosafety database management method according to claim 3, wherein said biological reaction unit path structured model is a data structure capable of representing biological reaction units and their paths, comprising biological reaction unit nodes and path edges;
The loading the action mechanism key information fragment of the biosafety target class data to the biological reaction unit path structured model to obtain a biological safety target class data reaction unit path comprises the following steps:
identifying an entity in the action mechanism key information of the biological safety target class data and mapping the entity into a biological reaction unit node in the biological reaction unit path structural model;
Mapping the action relation between entities in the action mechanism key information of the biological safety target class data to path edges in the biological reaction unit path structural model;
and constructing a reaction unit path of the biosafety target class according to the nodes and the path edges of the biological reaction unit mapped into the model.
5. The biosafety database management method of claim 1 wherein the biosafety target class data reaction unit path event is comprised of two biosafety target class data reaction unit nodes adjacent to each other and path edges thereof;
The biosafety risk mark event is a biosafety risk event list preset according to historical experience data;
the step of extracting the biosafety target class data reaction unit path event from the biosafety target class data reaction unit path, and detecting the biosafety risk mark event to the biosafety target class data reaction unit path event to obtain biosafety target class first feature data comprises the steps of:
acquiring a biological safety target class data reaction unit path, and identifying all biological reaction unit nodes and path edges in the biological safety target class data reaction unit path;
Extracting path edges between adjacent biological reaction unit nodes to generate biological safety target class data reaction unit path events;
Matching the biological safety target type data reaction unit path event with the biological safety risk mark event to obtain a risk value corresponding to the biological safety target type data reaction unit path event;
and summarizing risk values corresponding to the biological safety target class data reaction unit path events to obtain first characteristic data of the biological safety target class.
6. The method for managing a biosafety database according to claim 1, wherein the acquiring the second feature of the biosafety target data, inputting the first feature data and the second feature data of the biosafety target class into a security risk diagnosis model to obtain a biosafety target data risk tag, and matching a management mechanism to the biosafety target data based on the biosafety target data risk tag comprises:
Acquiring a second characteristic of the biosafety target data;
normalizing the numerical characteristics in the first characteristic data and the second characteristic data of the biological safety target class, and performing independent-heat encoding on the classification characteristics in the first characteristic data and the second characteristic data of the biological safety target class to obtain a biological safety target class characteristic set;
loading the biological safety target class feature set to an input layer of a safety risk diagnosis model, and acquiring a biological safety target data risk tag from an output layer of the safety risk diagnosis model after the biological safety target class feature set is transmitted forwards through a hidden layer of the safety risk diagnosis model;
and matching a management mechanism to the biosafety target data based on the biosafety target data risk tag.
7. The method for managing a biosafety database of claim 6, wherein the second feature of the biosafety target data is acquired by retrieving a knowledge base of basic features of biosafety target data, and the second feature of the biosafety target data includes gene sequence features, expression features, protein function features, and ecological environment features.
8. The biosafety database management method of claim 6, wherein the security risk diagnosis model is a model for evaluating biosafety target data built based on a neural network algorithm, the security risk diagnosis model framework comprising:
taking the biosafety target class feature set as input of an input layer;
the $l$-th layer is a hidden layer, for which an activation function is set;
the $L$-th layer is the output layer, which has $K$ nodes that respectively output the evaluation values of the $K$ dimensions, the output of the $k$-th node being $Y_k$;
compiling the model with a cross-entropy loss function, and fitting the parameters $W^{(l)}$ and $b^{(l)}$ by gradient descent;
where $X$ is the biosafety target class feature set at the input layer, $W^{(l)}$ is the weight matrix of layer $l$, and $b^{(l)}$ is the bias vector of layer $l$.
9. The biosafety database management method of claim 1, wherein the dimensions of the biosafety target data risk tag are environmental ecological safety, human health safety, animal and plant safety, and agricultural food safety; before matching a management mechanism to the biosafety target data based on the biosafety target data risk tag, the biosafety database management method further comprises:
Comparing the biological safety target data risk tag data with a set threshold value, and triggering a corresponding decision when the biological safety target data risk tag data is larger than the set threshold value;
Mapping each dimension value of the biological safety target data risk label to each dimension axis of the plane;
Connecting adjacent data points to form a closed polygon, and generating a biological safety target data safety characteristic radar chart;
extracting shape features of the biosafety target data safety feature radar chart using an image processing library;
and matching a specific decision according to the extracted shape features of the biosafety target data safety feature radar chart.
10. A biosafety database management system, the system comprising:
a data acquisition module for acquiring biosafety target data, the biosafety target data comprising biosafety target elements;
The data classification module is used for classifying the biosafety target data to obtain biosafety target class data;
The path fitting module is used for acquiring action mechanism key information fragments of the biological safety target class data according to the biological safety target class data, and loading the action mechanism key information fragments of the biological safety target class data into the biological reaction unit path structural model to obtain a biological safety target class data reaction unit path;
the event detection module is used for extracting a biosafety target class data reaction unit path event from the biosafety target class data reaction unit path, and detecting a biosafety risk mark event on the biosafety target class data reaction unit path event to obtain biosafety target class first characteristic data;
The diagnosis decision module is used for acquiring second characteristics of the biological safety target class, inputting the first characteristic data and the second characteristic data of the biological safety target class into the safety risk diagnosis model to obtain a biological safety target data risk tag, and matching a management mechanism for the biological safety target data based on the biological safety target data risk tag.
CN202410487657.9A 2024-04-23 2024-04-23 Biological safety database management method and system Pending CN118098377A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410487657.9A CN118098377A (en) 2024-04-23 2024-04-23 Biological safety database management method and system

Publications (1)

Publication Number Publication Date
CN118098377A true CN118098377A (en) 2024-05-28

Family

ID=91157294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410487657.9A Pending CN118098377A (en) 2024-04-23 2024-04-23 Biological safety database management method and system

Country Status (1)

Country Link
CN (1) CN118098377A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130041715A1 (en) * 2010-04-30 2013-02-14 Imaec Inc. Risk evaluation system using people as sensors
TW201441622A (en) * 2013-04-25 2014-11-01 Chia-Hung Liu Method for predicting post-treatment survival in HCC patient
CN110415764A (en) * 2019-07-25 2019-11-05 东南大学 The method and system and application of ceRNA mechanism are used using more data platforms discovery long-chain non-coding RNA molecular marker
CN116200260A (en) * 2023-05-06 2023-06-02 四川大学 Disposable biological safety bioreactor and monitoring method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
罗杨;吴永贵;段志斌;谢荣;: "基于CiteSpace重金属生物可给性的文献计量分析", 农业环境科学学报, no. 01, 20 January 2020 (2020-01-20) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination