CN116646011A - Construction method of high-throughput screening model, high-throughput screening method and related device - Google Patents

Info

Publication number
CN116646011A
Authority
CN
China
Prior art keywords
plasmid
throughput screening
activity
tested
obtaining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310641728.1A
Other languages
Chinese (zh)
Inventor
陈洁洁
韦婷婷
虞盛松
俞汉青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC
Priority to CN202310641728.1A
Publication of CN116646011A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B35/00 ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
    • G16B35/20 Screening of libraries
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/70 Vectors or expression systems specially adapted for E. coli
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/50 Mutagenesis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/20 Sequence assembly
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biotechnology (AREA)
  • Medical Informatics (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Artificial Intelligence (AREA)
  • Library & Information Science (AREA)
  • Organic Chemistry (AREA)
  • Data Mining & Analysis (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Evolutionary Computation (AREA)
  • Analytical Chemistry (AREA)
  • Software Systems (AREA)
  • Biochemistry (AREA)
  • Bioethics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Public Health (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Plant Pathology (AREA)
  • Microbiology (AREA)
  • Epidemiology (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)

Abstract

The embodiments of the application disclose a method for constructing a high-throughput screening model, a high-throughput screening method, and a related device. A target plasmid is obtained first; the target plasmid is a plasmid with specific primers designed at the mutation sites of a polypeptide engineering bacteria plasmid. The target plasmid spliced with an anchor protein gene and a plasmid frame is then obtained as a first plasmid assembly fragment. The pretreated first plasmid assembly fragment is obtained as a first plasmid to be tested, and its activity is calculated, so that each first plasmid to be tested whose activity meets the activity standard is selected as a plasmid to be learned. Sequence data and activity data of the plasmid to be learned are acquired and input into a neural network for learning, yielding the high-throughput screening model. Because the model combines polypeptide sequence calculation, activity calculation, and machine learning, it can greatly reduce the workload of screening mutant libraries and improve the success rate of screening for target active polypeptides.

Description

Construction method of high-throughput screening model, high-throughput screening method and related device
Technical Field
The application relates to the technical field of high-throughput screening, in particular to a method for constructing a high-throughput screening model, a high-throughput screening method and a related device.
Background
Protein engineering is an effective means of improving the stability, tolerance, and selectivity of natural proteins. Directed evolution of proteins applies multiple rounds of mutation, expression, and screening to a target gene using various experimental techniques, simulating the natural evolution mechanism in vitro and accelerating the accumulation and selection of new mutations, thereby yielding engineered proteins with specific properties and functions.
Polypeptides and small proteins are typically no more than 100 amino acids long, and their spatial structures are relatively simple, which makes it easier to obtain the desired information by computational simulation. Even so, the workload of sequence mutation and activity screening for these sequences remains enormous. Current chemical methods for optimizing polypeptide activity have low throughput and high cost; conventional biological methods not only involve cumbersome purification steps but also depend on insight into the complex sequence-structure-function relationships of proteins.
Therefore, how to provide a model capable of rapidly calculating the activity of a polypeptide and screening out a polypeptide with better activity at high throughput is a technical problem which needs to be solved by those skilled in the art.
Disclosure of Invention
To address these problems, the application provides a method for constructing a high-throughput screening model, a high-throughput screening method, and a related device, so as to provide a model that can rapidly calculate polypeptide activity and screen out polypeptides with better activity at high throughput. The embodiments of the application disclose the following technical solutions:
A method of constructing a high-throughput screening model, the method comprising:
obtaining a target plasmid, wherein the target plasmid is a plasmid with a specific primer designed at a mutation site of a polypeptide engineering bacterium plasmid;
obtaining a first plasmid assembly fragment, wherein the first plasmid assembly fragment is a target plasmid spliced with an anchor protein gene and a plasmid frame;
obtaining the pretreated first plasmid assembly fragment as a first plasmid to be tested, and calculating the activity of the first plasmid to be tested, so as to screen out the first plasmid to be tested whose activity meets the activity standard as a plasmid to be learned;
and acquiring sequence data and activity data of the plasmid to be learned, and inputting the sequence data and the activity data into a neural network for learning to obtain a high-throughput screening model.
In one possible implementation, obtaining the pretreated first plasmid assembly fragment as a first plasmid to be tested comprises:
obtaining the first plasmid assembly fragment transformed into competent cells as a second plasmid to be tested;
obtaining a sequencing result of the second plasmid to be tested as a first plasmid sequencing result;
screening out a third plasmid to be tested according to the first plasmid sequencing result;
obtaining the third plasmid to be tested after induced expression as a fourth plasmid to be tested;
and taking the fourth plasmid to be tested whose absorbance meets a first preset standard as the first plasmid to be tested.
In one possible implementation, obtaining the sequencing result of the second plasmid to be tested as the first plasmid sequencing result includes:
obtaining a monoclonal colony of the second plasmid to be tested after bacterial culture as a first plasmid to be cultured;
and carrying out PCR verification on the first plasmid to be cultured, screening out positive bacteria, carrying out plasmid sequencing to obtain the amino acid type of each positive bacterium, and taking the amino acid types of the positive bacteria as the first plasmid sequencing result.
In one possible implementation, the medium used for the induced expression is 200 μL of LB medium containing an antibiotic and an inducer.
In one possible implementation, the medium used in carrying out the bacterial culture is 100 microliters of LB medium containing antibiotics.
In one possible implementation, before the obtaining the target plasmid, the method further includes:
obtaining a second plasmid assembly fragment, wherein the second plasmid assembly fragment is a polypeptide engineering bacteria plasmid spliced with the anchor protein gene and the plasmid frame;
obtaining the second plasmid assembly fragment transformed into competent cells as a second plasmid to be cultured;
obtaining a sequencing result of a monoclonal colony subjected to bacterial culture in the second plasmid to be cultured as a second plasmid sequencing result;
and taking the second plasmid to be cultured, of which the sequencing result meets a second preset standard, as the target plasmid.
In one possible implementation, both the target plasmid and the anchor protein gene are codon optimized for E. coli.
A high throughput screening method, the method comprising:
obtaining sequence data of a polypeptide engineering bacteria plasmid;
inputting the sequence data into a high-throughput screening model, so as to screen out the polypeptide engineering bacteria plasmid with the activity meeting a second preset standard, wherein the high-throughput screening model is constructed according to the construction method of the high-throughput screening model.
A device for constructing a high throughput screening model, the device comprising:
the first acquisition unit is used for acquiring a target plasmid, wherein the target plasmid is a plasmid with a specific primer designed at a mutation site of a polypeptide engineering bacteria plasmid;
the second acquisition unit is used for acquiring a first plasmid assembly fragment, wherein the first plasmid assembly fragment is a target plasmid spliced with the anchor protein gene and the plasmid frame;
a third acquisition unit, configured to obtain the pretreated first plasmid assembly fragment as a first plasmid to be tested, and calculate the activity of the first plasmid to be tested, thereby screening out the first plasmid to be tested whose activity meets the activity standard as a plasmid to be learned;
and the fourth acquisition unit is used for acquiring the sequence data and the activity data of the plasmid to be learned, and inputting the sequence data and the activity data into a neural network for learning to obtain a high-throughput screening model.
A high throughput screening apparatus, the apparatus comprising:
a fifth acquisition unit for acquiring sequence data of the polypeptide engineering bacteria plasmid;
the first screening unit is used for inputting the sequence data into a high-throughput screening model, so as to screen out the polypeptide engineering bacteria plasmid with the activity meeting the second preset standard, wherein the high-throughput screening model is constructed according to the construction method of the high-throughput screening model.
The application provides a method for constructing a high-throughput screening model, a high-throughput screening method, and a related device. Specifically, when the construction method provided by the embodiments of the application is executed, a target plasmid is obtained first, wherein the target plasmid is a plasmid with specific primers designed at the mutation sites of a polypeptide engineering bacteria plasmid. The target plasmid spliced with the anchor protein gene and the plasmid frame is obtained as the first plasmid assembly fragment. The pretreated first plasmid assembly fragment is then obtained as a first plasmid to be tested. Next, the activity of the first plasmid to be tested is calculated, so that each first plasmid to be tested whose activity meets the activity standard is selected as a plasmid to be learned. Finally, sequence data and activity data of the plasmid to be learned are acquired and input into a neural network for learning, yielding the high-throughput screening model. Because the model combines polypeptide sequence calculation, activity calculation, and machine learning, it can greatly reduce the workload of screening mutant libraries and improve the success rate of screening for target active polypeptides.
Drawings
In order to illustrate the embodiments of the application or the technical solutions of the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the application, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a method for constructing a high throughput screening model according to an embodiment of the present application;
FIG. 2 is a flow chart of a method for high throughput screening according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a device for constructing a high-throughput screening model according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a high-throughput screening apparatus according to an embodiment of the present application.
Detailed Description
In order to make the present application better understood by those skilled in the art, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In order to facilitate understanding of the technical solution provided by the embodiments of the present application, the following description will first explain the background technology related to the embodiments of the present application.
Protein engineering is an effective means of improving the stability, tolerance, and selectivity of natural proteins. In the 1980s, rational design of proteins was very difficult because structural biology information was lacking and protein structures are complex, which severely restricted the development of protein engineering. The development of directed evolution widened the design range of protein engineering. Directed evolution of proteins applies multiple rounds of mutation, expression, and screening to a target gene using various experimental techniques, simulating the natural evolution mechanism in vitro and accelerating the accumulation and selection of new mutations to obtain proteins with specific properties and functions.
Polypeptides and small proteins are typically no more than 100 amino acids long, and their spatial structures are relatively simple, which makes it easier to obtain the desired information by computational simulation. Even so, the workload of sequence mutation and activity screening for these sequences remains enormous. Current chemical methods for optimizing polypeptide activity have low throughput and high cost; conventional biological methods not only involve cumbersome purification steps but also depend on insight into the complex sequence-structure-function relationships of proteins.
To solve this problem, the embodiments of the application provide a method for constructing a high-throughput screening model, a high-throughput screening method, and a related device. A target plasmid is obtained first, wherein the target plasmid is a plasmid with specific primers designed at the mutation sites of a polypeptide engineering bacteria plasmid. A first plasmid assembly fragment, which is the target plasmid spliced with the anchor protein gene and the plasmid frame, is then obtained. Next, the pretreated first plasmid assembly fragment is obtained as a first plasmid to be tested, and its activity is calculated, so that each first plasmid to be tested whose activity meets the activity standard is selected as a plasmid to be learned. Finally, sequence data and activity data of the plasmid to be learned are acquired and input into a neural network for learning, yielding the high-throughput screening model. Because the model combines polypeptide sequence calculation, activity calculation, and machine learning, it can greatly reduce the workload of screening mutant libraries and improve the success rate of screening for target active polypeptides.
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Referring to FIG. 1, which is a flowchart of a method for constructing a high-throughput screening model according to an embodiment of the present application, the method may include steps S101 to S104:
S101: Obtain a target plasmid, wherein the target plasmid is a plasmid with specific primers designed at the mutation sites of a polypeptide engineering bacteria plasmid.
To construct the high-throughput screening model, the construction system of the high-throughput screening model first obtains a target plasmid, wherein the target plasmid is a plasmid with specific primers designed at the mutation sites of a polypeptide engineering bacteria plasmid.
In one possible implementation, the specific primers at the mutation sites of the polypeptide engineering bacteria plasmid may be designed as follows, without limitation. Structural data of the polypeptide engineering bacteria plasmid are calculated with Rosetta and used as one of the inputs to the ddg_monomer program. A full mutation scan file required by ddg_monomer is written according to the amino acid sequence of the polypeptide engineering bacteria plasmid. The ddg_monomer program is executed, the output is sorted by site, and the sites with higher beneficial mutation rates are selected. NNK saturation mutagenesis primers are designed at each of these sites. For the NNK site-saturation mutagenesis primers, the desired fragment is amplified using a general PCR program, with the annealing temperature varied to suit the high-fidelity polymerase. The annealing temperature of the primers is between 56 and 58 °C.
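The site-selection step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the record layout (position, mutant residue, predicted ddG), the cutoff of -1.0, the example values, and the function name `rank_sites` are all hypothetical and are not taken from the application or from the Rosetta output format.

```python
from collections import defaultdict

def rank_sites(ddg_records, ddg_cutoff=-1.0, top_n=3):
    """Rank positions by the share of substitutions with ddG <= ddg_cutoff.

    ddg_records: iterable of (position, mutant_residue, ddg) tuples.
    The cutoff and record layout are assumptions made for this sketch.
    """
    totals = defaultdict(int)
    beneficial = defaultdict(int)
    for position, _mutant, ddg in ddg_records:
        totals[position] += 1
        if ddg <= ddg_cutoff:
            beneficial[position] += 1
    rates = {pos: beneficial[pos] / totals[pos] for pos in totals}
    # Highest beneficial-mutation rate first; ties broken by position.
    ranked = sorted(rates.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]

# Hypothetical scan results for three positions.
records = [
    (5, "A", -1.5), (5, "K", -2.0), (5, "W", 0.3),
    (12, "A", 0.8), (12, "K", -1.2), (12, "W", 1.1),
    (20, "A", 0.1), (20, "K", 0.4), (20, "W", 2.5),
]
top_sites = rank_sites(records, top_n=2)
```

The positions returned by such a ranking would be the candidates for NNK primer design.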
In one possible implementation, before obtaining the target plasmid, the method further includes steps A1 to A4:
A1: Obtain a second plasmid assembly fragment, wherein the second plasmid assembly fragment is a polypeptide engineering bacteria plasmid spliced with the anchor protein gene and the plasmid frame.
Only a polypeptide engineering bacteria plasmid that meets the activity standard can be used as a target plasmid for predicting sequence data and activity data. To screen out polypeptide engineering bacteria plasmids that meet the activity standard, the construction system of the high-throughput screening model needs to obtain the polypeptide engineering bacteria plasmid spliced with the anchor protein gene and the plasmid frame as the second plasmid assembly fragment.
In one possible implementation, the plasmid frame may be, but is not limited to, a plasmid frame carrying a specific resistance gene, such as pYYDT, a kanamycin-resistant plasmid frame with IPTG-induced expression.
A2: Obtain the second plasmid assembly fragment transformed into competent cells as a second plasmid to be cultured.
Artificially constructed plasmid vectors lack the mob gene required for transfer and cannot accomplish conjugative transfer from one cell to another on their own. The second plasmid assembly fragment therefore needs to be introduced into recipient bacteria that have been induced into a transient competent state for taking up exogenous DNA, which yields the second plasmid to be cultured.
In one possible implementation, the competent bacteria used in the transformation are E. coli.
A3: Obtain the sequencing result of a monoclonal colony of the second plasmid to be cultured after bacterial culture as a second plasmid sequencing result.
To screen out a target plasmid that meets the activity standard, the construction system of the high-throughput screening model also needs to obtain the sequencing result of the monoclonal colony of the second plasmid to be cultured after bacterial culture.
A4: Take the second plasmid to be cultured whose sequencing result meets a second preset standard as the target plasmid.
The second plasmid to be cultured whose sequencing result meets the second preset standard can be used as the target plasmid in preparation for constructing the high-throughput screening model.
In one possible implementation, the second plasmid sequencing result meeting the second preset standard means that the size and sequence of the second plasmid to be cultured, as determined by sequencing, match the size and sequence of a preset product. The size and sequence of the preset product can be adjusted according to user requirements and are not specifically limited here.
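A check of this kind can be sketched as a comparison of the sequenced read against the preset product. The function name, the exact-match criterion, and the example sequences are assumptions for illustration; the application leaves the preset standard to the user.

```python
def compare_to_expected(read_sequence, expected_sequence):
    """Compare a sequenced construct with the preset product.

    Returns (meets_standard, mismatch_positions). Positions are 1-based.
    Exact identity in size and sequence is an assumed criterion.
    """
    read = read_sequence.strip().upper()
    expected = expected_sequence.strip().upper()
    size_ok = len(read) == len(expected)
    mismatches = [
        index + 1
        for index, (observed, wanted) in enumerate(zip(read, expected))
        if observed != wanted
    ]
    return size_ok and not mismatches, mismatches

# Hypothetical preset product and two hypothetical sequencing reads.
expected = "ATGGCTAAAGGT"
good_ok, good_mismatches = compare_to_expected("atggctaaaggt", expected)
bad_ok, bad_mismatches = compare_to_expected("ATGGCTCAAGGT", expected)
```

Only constructs for which the first return value is true would be carried forward as the target plasmid in this sketch.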
In one possible implementation, the target plasmid has been codon optimized for E. coli.
In one possible implementation, both the target plasmid and the anchor protein gene are codon optimized for E. coli. Codon optimization is performed so that the target plasmid and the anchor protein gene are expressed at higher levels, making their activity effects more apparent.
S102: Obtain a first plasmid assembly fragment, which is the target plasmid spliced with the anchor protein gene and the plasmid frame.
After the target plasmid (i.e., the plasmid with specific primers designed at the mutation sites of the polypeptide engineering bacteria plasmid) is obtained, the construction system of the high-throughput screening model also needs to obtain the target plasmid spliced with the anchor protein gene and the plasmid frame as the first plasmid assembly fragment.
In one possible implementation, the first plasmid assembly fragment is produced by splicing the target plasmid with the anchor protein gene and the plasmid frame using the Gibson assembly molecular cloning technique.
In one possible implementation, the anchor protein gene has been codon optimized for E. coli. Codon optimization is performed to increase the expression of the anchor protein gene.
S103: Obtain the pretreated first plasmid assembly fragment as a first plasmid to be tested, and calculate the activity of the first plasmid to be tested, so as to screen out the first plasmid to be tested whose activity meets the activity standard as a plasmid to be learned.
To obtain plasmid sequence data and activity data that can be input into the neural network, after the first plasmid assembly fragment is obtained, the construction system of the high-throughput screening model also needs to obtain the pretreated first plasmid assembly fragment as a first plasmid to be tested and calculate its activity, so as to screen out the first plasmid to be tested whose activity meets the activity standard as a plasmid to be learned.
In one possible implementation, obtaining the pretreated first plasmid assembly fragment as a first plasmid to be tested comprises steps B1 to B5:
B1: Obtain the first plasmid assembly fragment transformed into competent cells as a second plasmid to be tested.
Artificially constructed plasmid vectors lack the mob gene required for transfer and cannot accomplish conjugative transfer from one cell to another on their own. The first plasmid assembly fragment therefore needs to be introduced into recipient bacteria that have been induced into a transient competent state for taking up exogenous DNA, which yields the second plasmid to be tested.
In one possible implementation, the competent bacteria used in the transformation are E. coli.
B2: Obtain the sequencing result of the second plasmid to be tested as a first plasmid sequencing result.
After the second plasmid to be tested is obtained, the construction system of the high-throughput screening model also needs to obtain the sequencing result of the second plasmid to be tested as the first plasmid sequencing result.
In one possible implementation, obtaining the sequencing result of the second plasmid to be tested as the first plasmid sequencing result includes steps B21 and B22:
B21: Obtain a monoclonal colony of the second plasmid to be tested after bacterial culture as a first plasmid to be cultured.
To obtain the sequencing result of the second plasmid to be tested, the construction system of the high-throughput screening model needs to obtain the monoclonal colony of the second plasmid to be tested after bacterial culture as the first plasmid to be cultured, in preparation for subsequent culture.
B22: Carry out PCR verification on the first plasmid to be cultured, screen out positive bacteria, carry out plasmid sequencing to obtain the amino acid type of each positive bacterium, and take the amino acid types of the positive bacteria as the first plasmid sequencing result.
After the first plasmid to be cultured is obtained, it is subjected to PCR verification, and positive bacteria are screened out from the verified first plasmid to be cultured. Plasmid sequencing is then carried out to obtain the amino acid type of each positive bacterium, and the amino acid types of the positive bacteria are used as the first plasmid sequencing result in preparation for subsequent verification.
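Reading the amino acid type at a mutation site from a sequenced coding region can be sketched as a codon lookup. The sketch below builds the standard genetic code table and also enumerates the NNK codons (N is any base, K is G or T); the function name, the 1-based codon indexing, and the example sequence are assumptions for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
RESIDUES = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), RESIDUES)
}

def residue_at_site(coding_sequence, site):
    """Return the amino acid encoded at a 1-based codon index."""
    start = 3 * (site - 1)
    codon = coding_sequence[start:start + 3].upper()
    if len(codon) != 3:
        raise ValueError("site lies outside the coding sequence")
    return CODON_TABLE[codon]

# All codons reachable with an NNK primer and the residues they encode.
nnk_codons = ["".join(codon) for codon in product(BASES, BASES, "GT")]
nnk_residues = {CODON_TABLE[codon] for codon in nnk_codons}

# Hypothetical sequenced coding region: ATG GCT AAA GGT.
site_three = residue_at_site("ATGGCTAAAGGT", 3)
```

An NNK library therefore spans 32 codons that cover all 20 amino acids plus one stop codon (TAG), which is why sequencing is needed to learn which residue each positive clone carries.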
In one possible implementation, the medium used in carrying out the bacterial culture is 100 microliters of LB medium containing antibiotics.
B3: Screen out a third plasmid to be tested according to the first plasmid sequencing result.
After the first plasmid sequencing result is obtained, the construction system of the high-throughput screening model can screen out the third plasmid to be tested according to the first plasmid sequencing result.
In one possible implementation, screening out the third plasmid to be tested according to the first plasmid sequencing result includes: selecting one positive bacterium for each amino acid type from the positive bacteria of different amino acid types, and placing the selected positive bacterium of each type into the corresponding microplate well. The microplate may be, but is not limited to, a 96-well microplate.
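Picking one clone per amino acid type and laying the picks out on a plate can be sketched as follows. The clone identifiers, the rule of keeping the first clone seen for each type, and the row-major well order are assumptions for illustration only.

```python
from string import ascii_uppercase

def assign_wells(clones, rows=8, cols=12):
    """Keep one clone per amino acid type and assign microplate wells.

    clones: list of (clone_id, amino_acid) pairs from sequencing.
    Returns a dict mapping well labels such as "A1" to (amino_acid, clone_id).
    """
    chosen = {}
    for clone_id, amino_acid in clones:
        # Keep only the first clone observed for each amino acid type.
        chosen.setdefault(amino_acid, clone_id)
    if len(chosen) > rows * cols:
        raise ValueError("more amino acid types than wells")
    layout = {}
    for index, (amino_acid, clone_id) in enumerate(sorted(chosen.items())):
        row, col = divmod(index, cols)
        layout[f"{ascii_uppercase[row]}{col + 1}"] = (amino_acid, clone_id)
    return layout

# Hypothetical sequencing results: two clones carry lysine (K).
clones = [("c1", "K"), ("c2", "A"), ("c3", "K"), ("c4", "W")]
layout = assign_wells(clones)
```

With the default 8 by 12 layout this corresponds to a 96-well plate, which leaves ample room for the at most 20 amino acid types at a single site.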
B4: Obtain the third plasmid to be tested after induced expression as a fourth plasmid to be tested.
After the third plasmid to be tested is screened out, the construction system of the high-throughput screening model can obtain the third plasmid to be tested after induced expression as the fourth plasmid to be tested.
In one possible implementation, the medium used for the induced expression is 200 μL of LB medium containing an antibiotic and an inducer. LB medium is a culture medium commonly used in biochemistry and molecular biology experiments to preculture strains so that they multiply to the quantity required for use.
B5: and taking the fourth plasmid to be tested, the absorbance of which meets the first preset standard, as the first plasmid to be tested.
After obtaining the third plasmid to be tested, i.e., the fourth plasmid to be tested after induced expression, the construction system of the high-throughput screening model also needs to take the fourth plasmid to be tested, of which absorbance accords with the first preset standard, as the first plasmid to be tested.
In one possible implementation, the absorbance meeting the first preset standard means that the absorbance of the fourth plasmid to be tested is stable and unchanged. The absorbance of the fourth plasmid to be tested can be detected by an enzyme-labeled instrument.
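One way to make "stable and unchanged" operational is to compare the last few absorbance readings of a well. The sketch below is only an illustration under stated assumptions: the window of three readings and the tolerance of 0.02 absorbance units are hypothetical values, not taken from the application, and the function name is invented for the example.

```python
def absorbance_is_stable(readings, window=3, tolerance=0.02):
    """Return True when the last `window` readings differ by at most `tolerance`.

    readings: absorbance values of one well in time order.
    window and tolerance are illustrative assumptions, not values from the source.
    """
    if len(readings) < window:
        return False  # not enough data to judge stability
    recent = readings[-window:]
    return max(recent) - min(recent) <= tolerance


# Hypothetical time series from a microplate reader.
plateaued = absorbance_is_stable([0.21, 0.48, 0.80, 0.81, 0.80])
still_rising = absorbance_is_stable([0.20, 0.40, 0.60])
```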
S104: Acquiring sequence data and activity data of the plasmid to be learned, and inputting the sequence data and the activity data into a neural network for learning to obtain a high-throughput screening model.
After the plasmid to be learned is obtained, the construction system of the high-throughput screening model acquires the sequence data and the activity data of the plasmid to be learned and inputs them into a neural network for learning, thereby obtaining the high-throughput screening model.
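The application does not specify the network architecture or the sequence encoding, so the following is only a sketch under assumptions: peptide sequences are one-hot encoded, the network is a single hidden layer of four tanh units trained by stochastic gradient descent on squared error, and the four sequence/activity pairs are invented toy data. It uses only the Python standard library.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(sequence):
    """Flatten a peptide sequence into a vector with 20 entries per residue."""
    vector = []
    for residue in sequence:
        vector.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vector


def init_model(n_inputs, n_hidden, rng):
    """Small random weights for a one-hidden-layer regression network."""
    return {
        "w1": [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(model, x):
    """Return the hidden activations and the predicted activity."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(model["w1"], model["b1"])]
    output = sum(w * h for w, h in zip(model["w2"], hidden)) + model["b2"]
    return hidden, output


def mean_squared_error(model, data):
    return sum((forward(model, x)[1] - y) ** 2 for x, y in data) / len(data)


def train(model, data, epochs=300, lr=0.05):
    """Plain stochastic gradient descent on 0.5 * (prediction - activity)^2."""
    for _ in range(epochs):
        for x, y in data:
            hidden, output = forward(model, x)
            error = output - y
            for j, h in enumerate(hidden):
                # Gradient through tanh, computed before w2[j] is updated
                grad_hidden = error * model["w2"][j] * (1.0 - h * h)
                model["w2"][j] -= lr * error * h
                model["b1"][j] -= lr * grad_hidden
                for i, xi in enumerate(x):
                    if xi:  # one-hot inputs are mostly zero
                        model["w1"][j][i] -= lr * grad_hidden * xi
            model["b2"] -= lr * error
    return model


# Invented toy data: (sequence, measured activity).
toy = [("AKW", 1.20), ("AKF", 0.85), ("GKW", 1.05), ("GRF", 0.40)]
data = [(one_hot(seq), activity) for seq, activity in toy]

rng = random.Random(0)
model = init_model(len(data[0][0]), 4, rng)
loss_before = mean_squared_error(model, data)
train(model, data)
loss_after = mean_squared_error(model, data)
```

A real model would need far more data, a held-out validation set, and most likely an established machine learning library; the point here is only the flow from sequence data and activity data to a trained predictor.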
In one possible implementation, the relative activity of the mutant with respect to the wild-type polypeptide can be calculated as (absorbance of the fourth plasmid to be tested - absorbance of the induced expression liquid) / (absorbance of the wild-type polypeptide - absorbance of the induced expression liquid). The absolute activity value (i.e., the activity data) of the fourth plasmid to be tested is then obtained by conversion from the activity value of the wild-type polypeptide.
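The calculation above can be written as a short Python sketch. The function and parameter names are illustrative, the absorbance values are invented, and treating the "conversion" to an absolute value as multiplication by the wild-type activity is an assumption, since the text does not spell out the conversion.

```python
def relative_activity(a_mutant, a_wild_type, a_blank):
    """Relative activity of a mutant versus the wild-type polypeptide.

    a_mutant:    absorbance of the fourth plasmid to be tested
    a_wild_type: absorbance of the wild-type polypeptide
    a_blank:     absorbance of the induced expression liquid (background)
    """
    denominator = a_wild_type - a_blank
    if denominator == 0:
        raise ValueError("wild-type absorbance equals the background absorbance")
    return (a_mutant - a_blank) / denominator


def absolute_activity(a_mutant, a_wild_type, a_blank, wild_type_activity):
    """Assumed conversion: scale the relative activity by the wild-type activity."""
    return relative_activity(a_mutant, a_wild_type, a_blank) * wild_type_activity


# Invented readings: mutant 0.85, wild type 0.65, background 0.05.
rel = relative_activity(0.85, 0.65, 0.05)
abs_value = absolute_activity(0.85, 0.65, 0.05, wild_type_activity=30.0)
```

With these numbers the relative activity is about 1.33, i.e. the mutant signal above background is one third higher than that of the wild type.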
Based on S101 to S104, the target plasmid is first obtained; the target plasmid is a plasmid with a specific primer designed at a mutation site of the polypeptide engineering bacteria plasmid. Next, a first plasmid assembly fragment is obtained, which is the target plasmid spliced with the anchor protein gene and the plasmid frame. The pretreated first plasmid assembly fragment is then obtained as a first plasmid to be tested, and the activity of the first plasmid to be tested is calculated, so that the first plasmid to be tested whose activity meets the activity standard is selected as the plasmid to be learned. Finally, sequence data and activity data of the plasmid to be learned are acquired and input into a neural network for learning to obtain a high-throughput screening model. The high-throughput screening model obtained by combining polypeptide sequence data calculation, activity data calculation and machine learning can greatly reduce the workload of screening mutant libraries and improve the success rate of screening for polypeptides with the target activity.
Based on the embodiment of the method for constructing the high-throughput screening model, the embodiment of the application also provides a high-throughput screening method. Referring to fig. 2, fig. 2 is a flowchart of a method for high throughput screening according to an embodiment of the present application. As shown in fig. 2, the method includes S201 to S202:
S201: Obtaining sequence data of the polypeptide engineering bacteria plasmid.
In order to rapidly screen out the polypeptide engineering bacteria plasmid with better activity, the high-throughput screening system needs to obtain the sequence data of the polypeptide engineering bacteria plasmid.
S202: Inputting the sequence data into a high-throughput screening model, so as to screen out the polypeptide engineering bacteria plasmid whose activity meets a second preset standard, wherein the high-throughput screening model is constructed by the construction method of the high-throughput screening model described above.
After the sequence data of the polypeptide engineering bacteria plasmid is obtained, the high-throughput screening system inputs the sequence data into the high-throughput screening model, so that the polypeptide engineering bacteria plasmid whose activity meets the second preset standard is screened out.
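The screening step can be sketched as a filter over predicted activities. Everything specific here is an assumption for illustration: `screen_candidates` is a hypothetical helper, the threshold stands in for the second preset standard (which the text does not quantify), and `toy_predictor` is a stand-in for the trained high-throughput screening model.

```python
def screen_candidates(sequences, predict_activity, threshold):
    """Return (sequence, predicted activity) pairs at or above the threshold.

    predict_activity: any callable mapping a sequence to a predicted activity,
    for example a wrapper around a trained model.
    Results are sorted from highest to lowest predicted activity.
    """
    scored = [(seq, predict_activity(seq)) for seq in sequences]
    hits = [(seq, score) for seq, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


def toy_predictor(sequence):
    """Stand-in for a trained model: rewards tryptophan content (invented rule)."""
    return 0.5 + 0.25 * sequence.count("W")


hits = screen_candidates(["AKW", "AKF", "WKW"], toy_predictor, threshold=0.75)
```

In this toy run `AKF` is rejected, and `WKW` ranks ahead of `AKW`.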
In one possible implementation, an activity meeting the second preset standard means that the relative activity is high enough for the polypeptide to function effectively.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a device for constructing a high throughput screening model according to an embodiment of the present application. As shown in fig. 3, the apparatus for constructing a high throughput screening model includes:
a first obtaining unit 301, configured to obtain a target plasmid, where the target plasmid is a plasmid with a specific primer designed at a mutation site of a polypeptide engineering bacteria plasmid.
In one possible implementation, the specific primers may be designed at the mutation sites of the polypeptide engineering bacteria plasmid as follows, without limitation. Structural data of the polypeptide engineering bacteria plasmid is calculated with Rosetta and used as one of the inputs of the ddg_monomer program. The full-mutation scanning file required by ddg_monomer is written according to the amino acid sequence of the polypeptide engineering bacteria plasmid. The ddg_monomer program is then executed, the output is sorted by site, and the sites with a higher beneficial mutation rate are selected. NNK saturation mutagenesis primers are designed at each of these sites. For the NNK site-saturation mutagenesis primers, a general PCR program is used to amplify the desired fragment, with the annealing temperature adjusted for the high-fidelity enzyme; the annealing temperature of the primers is between 56 and 58 °C.
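To illustrate why an NNK codon is suitable for site-saturation mutagenesis, the sketch below enumerates all NNK codons (N is any base, K is G or T) and translates them with the standard genetic code. The variable and function names are illustrative; the code does not design primers and does not call Rosetta.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at every codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnk_codons():
    """All codons matching NNK: any base, any base, then G or T."""
    return ["".join(codon) for codon in product("ACGT", "ACGT", "GT")]


codons = nnk_codons()
encoded = {CODON_TABLE[codon] for codon in codons}
amino_acids = sorted(encoded - {"*"})
stop_codons = [codon for codon in codons if CODON_TABLE[codon] == "*"]
```

The 32 NNK codons cover all 20 amino acids while leaving TAG as the only stop codon, which keeps the mutant library small compared with the 64 codons of a fully random NNN site.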
In one possible implementation, the apparatus further includes:
A sixth acquisition unit, configured to acquire a second plasmid assembly fragment, where the second plasmid assembly fragment is the polypeptide engineering bacteria plasmid spliced with the anchor protein gene and the plasmid frame.
In one possible implementation, the plasmid frame may be, but is not limited to, a plasmid frame carrying a specific resistance gene, such as the kanamycin-resistant plasmid frame pYYDT with IPTG-inducible expression.
A seventh obtaining unit, configured to obtain the second plasmid assembly fragment transformed into competent cells as a second plasmid to be cultured.
In one possible implementation, the competent cells used for the transformation are E. coli.
An eighth acquisition unit, configured to acquire the sequencing result of a monoclonal colony, obtained by bacterial culture of the second plasmid to be cultured, as a second plasmid sequencing result.
A second screening unit, configured to take the second plasmid to be cultured whose second plasmid sequencing result meets a second preset standard as the target plasmid.
In one possible implementation, the second plasmid sequencing result meeting the second preset standard means that the size and sequence of the second plasmid to be cultured obtained by sequencing match the size and sequence of a preset product. The size and sequence of the preset product can be adjusted according to the user's requirements and are not specifically limited here.
In one possible implementation, the target plasmid has been codon optimized for E. coli.
In one possible implementation, both the target plasmid and the anchor protein gene are codon optimized for E. coli. Codon optimization increases the expression of the target plasmid and the anchor protein gene, so that their activity is more pronounced.
A second obtaining unit 302, configured to obtain a first plasmid assembly fragment, where the first plasmid assembly fragment is the target plasmid spliced with the anchor protein gene and the plasmid frame.
In one possible implementation, the target plasmid is spliced with the anchor protein gene and the plasmid frame by the Gibson assembly molecular cloning technique to form the first plasmid assembly fragment.
In one possible implementation, the anchor protein gene has been codon optimized for E. coli. Codon optimization is performed to increase the expression of the anchor protein gene.
A third obtaining unit 303, configured to obtain the pretreated first plasmid assembly fragment as a first plasmid to be tested and calculate the activity of the first plasmid to be tested, thereby screening out the first plasmid to be tested whose activity meets the activity standard as the plasmid to be learned.
In one possible implementation, the apparatus further includes:
A ninth acquisition unit, configured to acquire the first plasmid assembly fragment transformed into competent cells as a second plasmid to be tested.
In one possible implementation, the competent cells used for the transformation are E. coli.
A tenth acquisition unit, configured to acquire the sequencing result of the second plasmid to be tested as the first plasmid sequencing result.
In one possible implementation, the apparatus further includes:
An eleventh obtaining unit, configured to obtain a monoclonal colony, obtained by bacterial culture of the second plasmid to be tested, as a first plasmid to be cultured.
A third screening unit, configured to perform PCR verification on the first plasmid to be cultured, screen out positive bacteria, perform plasmid sequencing to obtain the amino acid type of each positive bacterium, and take the amino acid types of the positive bacteria as the first plasmid sequencing result.
In one possible implementation, the medium used in carrying out the bacterial culture is 100 microliters of LB medium containing antibiotics.
A fourth screening unit, configured to screen out a third plasmid to be tested according to the first plasmid sequencing result.
In one possible implementation, screening out the third plasmid to be tested according to the first plasmid sequencing result includes: selecting one positive bacterium for each amino acid type from the positive bacteria of different amino acid types, and placing the selected positive bacterium of each type into the corresponding microwell plate. The microwell plate may be, but is not limited to, a 96-well microplate.
A twelfth obtaining unit, configured to obtain the third plasmid to be tested after induced expression as a fourth plasmid to be tested.
In one possible implementation, the medium used for the induced expression is 200 μL of LB medium containing an antibiotic and an inducer. LB medium is a culture medium commonly used in biochemical and molecular experiments to preculture strains, so that the strains multiply to the quantity required for use.
A fifth screening unit, configured to take the fourth plasmid to be tested whose absorbance meets a first preset standard as the first plasmid to be tested.
In one possible implementation, the absorbance meeting the first preset standard means that the absorbance of the fourth plasmid to be tested has stabilized and no longer changes. The absorbance of the fourth plasmid to be tested can be measured with a microplate reader.
A fourth obtaining unit 304, configured to obtain sequence data and activity data of the plasmid to be learned, and input the sequence data and the activity data into a neural network for learning to obtain a high-throughput screening model.
In one possible implementation, the relative activity of the mutant with respect to the wild-type polypeptide can be calculated as (absorbance of the fourth plasmid to be tested - absorbance of the induced expression liquid) / (absorbance of the wild-type polypeptide - absorbance of the induced expression liquid). The absolute activity value (i.e., the activity data) of the fourth plasmid to be tested is then obtained by conversion from the activity value of the wild-type polypeptide.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a high throughput screening apparatus according to an embodiment of the present application. As shown in fig. 4, the high throughput screening apparatus includes:
A fifth obtaining unit 401, configured to obtain sequence data of the polypeptide engineering bacteria plasmid.
A first screening unit 402, configured to input the sequence data into a high-throughput screening model, so as to screen out the polypeptide engineering bacteria plasmid whose activity meets a second preset standard, where the high-throughput screening model is constructed according to the construction method of the high-throughput screening model described above.
In one possible implementation, an activity meeting the second preset standard means that the relative activity is high enough for the polypeptide to function effectively.
The embodiments of the application provide a method for constructing a high-throughput screening model, a high-throughput screening method and related devices. The method for constructing the high-throughput screening model includes: obtaining a target plasmid with a specific primer designed at a mutation site of the polypeptide engineering bacteria plasmid, and obtaining the target plasmid spliced with the anchor protein gene and the plasmid frame as a first plasmid assembly fragment. Next, the pretreated first plasmid assembly fragment is obtained as a first plasmid to be tested. The activity of the first plasmid to be tested is calculated, and the first plasmid to be tested whose activity meets the activity standard is screened out as the plasmid to be learned. Sequence data and activity data of the plasmid to be learned are acquired and input into a neural network for learning to obtain a high-throughput screening model. The high-throughput screening model obtained by combining polypeptide sequence data calculation, activity data calculation and machine learning can greatly reduce the workload of screening mutant libraries and improve the success rate of screening for polypeptides with the target activity.
The construction method of the high-throughput screening model, the high-throughput screening method and the related devices provided by the application have been described in detail above. In the description, the embodiments are described progressively, each focusing on its differences from the others, so identical or similar parts of the embodiments may be cross-referenced. Because the device disclosed in an embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief, and the relevant details can be found in the description of the method. It should be noted that those skilled in the art can make various modifications and adaptations without departing from the principles of the application, and these modifications and adaptations are intended to fall within the scope of the application as defined in the following claims.
It should be understood that in the present application, "at least one (item)" means one or more, and "a plurality" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may represent: only A, only B, or both A and B, where A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following" or a similar expression means any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b and c may each be single or plural.
It is further noted that relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of constructing a high throughput screening model, the method comprising:
obtaining a target plasmid, wherein the target plasmid is a plasmid with a specific primer designed at a mutation site of a polypeptide engineering bacteria plasmid;
obtaining a first plasmid assembly fragment, wherein the first plasmid assembly fragment is the target plasmid spliced with an anchor protein gene and a plasmid frame;
obtaining the pretreated first plasmid assembly fragment as a first plasmid to be tested, and calculating the activity of the first plasmid to be tested, so as to screen out the first plasmid to be tested whose activity meets an activity standard as a plasmid to be learned;
and acquiring sequence data and activity data of the plasmid to be learned, and inputting the sequence data and the activity data into a neural network for learning to obtain a high-throughput screening model.
2. The method of claim 1, wherein said obtaining the pretreated first plasmid assembly fragment as a first test plasmid comprises:
obtaining the first plasmid assembly fragment transformed into competent cells as a second plasmid to be tested;
obtaining a sequencing result of the second plasmid to be tested as a first plasmid sequencing result;
screening out a third plasmid to be tested according to the first plasmid sequencing result;
obtaining the third plasmid to be tested after induced expression as a fourth plasmid to be tested;
and taking the fourth plasmid to be tested whose absorbance meets a first preset standard as the first plasmid to be tested.
3. The method of claim 2, wherein the obtaining the sequencing result of the second test plasmid as the first plasmid sequencing result comprises:
obtaining a monoclonal colony, obtained by bacterial culture of the second plasmid to be tested, as a first plasmid to be cultured;
and performing PCR verification on the first plasmid to be cultured, screening out positive bacteria, performing plasmid sequencing to obtain the amino acid type of each positive bacterium, and taking the amino acid types of the positive bacteria as the first plasmid sequencing result.
4. The method of claim 2, wherein the medium used for the induced expression is 200 μL of LB medium containing an antibiotic and an inducer.
5. The method of claim 3, wherein the medium used for the bacterial culture is 100 μL of LB medium containing an antibiotic.
6. The method of claim 1, further comprising, prior to said obtaining the plasmid of interest:
obtaining a second plasmid assembly fragment, wherein the second plasmid assembly fragment is the polypeptide engineering bacteria plasmid spliced with the anchor protein gene and the plasmid frame;
obtaining the second plasmid assembly fragment transformed into competent cells as a second plasmid to be cultured;
obtaining a sequencing result of a monoclonal colony, obtained by bacterial culture of the second plasmid to be cultured, as a second plasmid sequencing result;
and taking the second plasmid to be cultured whose second plasmid sequencing result meets a second preset standard as the target plasmid.
7. The method of claim 1 or 6, wherein both the target plasmid and the anchor protein gene are codon optimized for E. coli.
8. A high throughput screening method, the method comprising:
obtaining sequence data of a polypeptide engineering bacteria plasmid;
inputting the sequence data into a high-throughput screening model, so as to screen out the polypeptide engineering bacteria plasmid with the activity meeting a second preset standard, wherein the high-throughput screening model is constructed according to the construction method of the high-throughput screening model of any one of claims 1-7.
9. A device for constructing a high throughput screening model, the device comprising:
a first acquisition unit, configured to acquire a target plasmid, wherein the target plasmid is a plasmid with a specific primer designed at a mutation site of a polypeptide engineering bacteria plasmid;
a second acquisition unit, configured to acquire a first plasmid assembly fragment, wherein the first plasmid assembly fragment is the target plasmid spliced with an anchor protein gene and a plasmid frame;
a third obtaining unit, configured to obtain the pretreated first plasmid assembly fragment as a first plasmid to be tested, and calculate the activity of the first plasmid to be tested, thereby screening out the first plasmid to be tested whose activity meets an activity standard as a plasmid to be learned;
and a fourth acquisition unit, configured to acquire the sequence data and the activity data of the plasmid to be learned, and input the sequence data and the activity data into a neural network for learning to obtain a high-throughput screening model.
10. A high throughput screening apparatus, the apparatus comprising:
a fifth acquisition unit for acquiring sequence data of the polypeptide engineering bacteria plasmid;
the first screening unit is used for inputting the sequence data into a high-throughput screening model, so as to screen out the polypeptide engineering bacteria plasmid with the activity meeting the second preset standard, wherein the high-throughput screening model is constructed according to the construction method of the high-throughput screening model of any one of claims 1-7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310641728.1A CN116646011A (en) 2023-05-31 2023-05-31 Construction method of high-throughput screening model, high-throughput screening method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310641728.1A CN116646011A (en) 2023-05-31 2023-05-31 Construction method of high-throughput screening model, high-throughput screening method and related device

Publications (1)

Publication Number Publication Date
CN116646011A true CN116646011A (en) 2023-08-25

Family

ID=87618529

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310641728.1A Pending CN116646011A (en) 2023-05-31 2023-05-31 Construction method of high-throughput screening model, high-throughput screening method and related device

Country Status (1)

Country Link
CN (1) CN116646011A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117219168A (en) * 2023-11-09 2023-12-12 中国农业科学院蜜蜂研究所 Rapid screening method of high-activity alpha-glycosidase inhibitory peptide and application thereof
CN117219168B (en) * 2023-11-09 2024-03-08 中国农业科学院蜜蜂研究所 Rapid screening method of high-activity alpha-glycosidase inhibitory peptide and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination