CN117783088B

CN117783088B - Control model training method, device and equipment of laser micro-Raman spectrometer

Info

Publication number: CN117783088B
Application number: CN202410202454.0A
Authority: CN
Inventors: 梁世健; 邱迅; 邓嘉迪
Original assignee: Guangzhou Betop Scientific Ltd
Current assignee: Guangzhou Betop Scientific Ltd
Priority date: 2024-02-23
Filing date: 2024-02-23
Publication date: 2024-05-14
Anticipated expiration: 2044-02-23
Also published as: CN117783088A

Abstract

The invention belongs to the technical field of intelligent control, and discloses a control model training method, a device and equipment of a laser microscopic Raman spectrometer.

Description

Control model training method, device and equipment of laser micro-Raman spectrometer

Technical Field

The invention belongs to the technical field of intelligent control, and particularly relates to a control model training method, device and equipment of a laser micro-Raman spectrometer.

Background

The laser micro-Raman spectrometer (Raman spectrometer for short) is commonly used to acquire molecular vibration information from a sample and to analyze that information to obtain Raman spectrum data, which plays a decisive role in analyzing the physicochemical properties of the sample. However, the quality of the acquired Raman spectrum data is generally affected by the scanning parameter settings of the Raman spectrometer, including laser power, integration time, step distance, and so on. In practical applications, these scanning parameters usually have to be adjusted manually, which depends heavily on the operator's experience and intuition, is time-consuming and labor-intensive, and cannot guarantee an optimal scanning result.

Existing methods that adjust scanning parameters based on human experience still have many limitations, such as difficulty in providing an optimal scanning effect for different types of samples and excessive reliance on hand-crafted empirical rules. Manually adjusting the scanning parameters of existing Raman spectrometers therefore carries high uncertainty, cannot adapt the scanning parameters automatically, and makes it difficult to achieve the optimal scanning effect across different environmental conditions and sample types.

Disclosure of Invention

The invention aims to provide a control model training method, device and equipment of a laser micro-Raman spectrometer, which can adaptively adjust scanning parameters, thereby improving the scanning control accuracy and the operation efficiency and further ensuring the optimality of a scanning result.

The first aspect of the invention discloses a control model training method of a laser micro-Raman spectrometer, which comprises the following steps:

constructing a control model of a laser micro-Raman spectrometer based on a deep Q-network;

controlling a laser micro-Raman spectrometer to scan a sample to obtain a first microscopic image and a corresponding first Raman spectrum;

inputting the first microscopic image and the first Raman spectrum respectively into the control model, so that the control model predicts the predicted action value of each candidate scanning parameter;

adjusting the current scanning parameters of the laser micro-Raman spectrometer to the candidate scanning parameters with the highest predicted action value, and controlling the laser micro-Raman spectrometer to rescan, so as to obtain a second microscopic image and a corresponding second Raman spectrum;

calculating a target action value according to the second microscopic image and the second Raman spectrum;

and inputting the second microscopic image and the second Raman spectrum into the control model for iterative training, with the target action value as the label, so as to optimize the network parameters of the control model.

In some embodiments, calculating the target action value according to the second microscopic image and the second Raman spectrum comprises:

calculating the signal-to-noise ratio of the second Raman spectrum;

calculating a baseline reward of the second Raman spectrum;

calculating an image reward of the second microscopic image;

and calculating a total reward value as the target action value according to the signal-to-noise ratio and the baseline reward of the second Raman spectrum, the image reward of the second microscopic image, and a penalty term.

In some embodiments, calculating the signal-to-noise ratio of the second Raman spectrum comprises:

smoothing the second Raman spectrum to obtain a smoothed third Raman spectrum;

calculating the Raman spectrum noise according to the second Raman spectrum and the third Raman spectrum;

and calculating the signal-to-noise ratio of the second Raman spectrum according to the mean value of the third Raman spectrum and the standard deviation of the Raman spectrum noise.

In some embodiments, calculating the baseline reward of the second Raman spectrum comprises:

performing baseline correction on the second Raman spectrum to obtain a corrected spectrum after baseline correction;

obtaining a spectrum difference value between the second Raman spectrum and the correction spectrum;

and calculating a baseline reward of the second Raman spectrum according to the spectrum difference value.

In some embodiments, calculating the image reward of the second microscopic image comprises:

calculating, from the second microscopic image, a first metric value of contrast, a second metric value of image signal-to-noise ratio, a third metric value of sharpness (definition), and a fourth metric value of image distortion;

and calculating the image reward of the second microscopic image according to the first metric value, the second metric value, the third metric value, and the fourth metric value.

The second aspect of the invention discloses a control model training device of a laser micro-Raman spectrometer, comprising:

the construction unit is used for constructing a control model of the laser micro-Raman spectrometer based on a deep Q-network;

the first scanning unit is used for controlling the laser micro-Raman spectrometer to scan the sample to obtain a first microscopic image and a corresponding first Raman spectrum;

the prediction unit is used for inputting the first microscopic image and the first Raman spectrum respectively into the control model, so that the control model predicts the predicted action value of each candidate scanning parameter;

the adjusting unit is used for adjusting the current scanning parameters of the laser micro-Raman spectrometer to the candidate scanning parameters with the highest predicted action value;

the second scanning unit is used for controlling the laser micro-Raman spectrometer to rescan to obtain a second microscopic image and a corresponding second Raman spectrum;

the target calculation unit is used for calculating a target action value according to the second microscopic image and the second Raman spectrum;

and the training unit is used for inputting the second microscopic image and the second Raman spectrum into the control model for iterative training, with the target action value as the label, so as to optimize the network parameters of the control model.

In some embodiments, the target calculation unit comprises:

a first calculation subunit for calculating the signal-to-noise ratio of the second Raman spectrum;

a second calculation subunit for calculating a baseline reward of the second Raman spectrum;

a third calculation subunit for calculating an image reward of the second microscopic image;

and a fourth calculation subunit for calculating a total reward value as the target action value according to the signal-to-noise ratio and the baseline reward of the second Raman spectrum, the image reward of the second microscopic image, and a penalty term.

In some embodiments, the first calculation subunit is specifically configured to: smooth the second Raman spectrum to obtain a smoothed third Raman spectrum; calculate the Raman spectrum noise according to the second Raman spectrum and the third Raman spectrum; and calculate the signal-to-noise ratio of the second Raman spectrum according to the mean value of the third Raman spectrum and the standard deviation of the Raman spectrum noise.

In some embodiments, the second calculation subunit is specifically configured to: perform baseline correction on the second Raman spectrum to obtain a corrected spectrum; obtain the spectrum difference between the second Raman spectrum and the corrected spectrum; and calculate the baseline reward of the second Raman spectrum according to the spectrum difference.

A third aspect of the invention discloses an electronic device comprising a memory storing executable program code and a processor coupled to the memory; the processor invokes the executable program code stored in the memory to perform the control model training method of the laser micro-Raman spectrometer disclosed in the first aspect.

A fourth aspect of the invention discloses a computer-readable storage medium storing a computer program, wherein the computer program causes a computer to execute the control model training method of the laser micro-Raman spectrometer disclosed in the first aspect.

The beneficial effects of the invention are as follows. A control model of the Raman spectrometer is constructed based on a deep Q-network. The Raman spectrometer is first controlled to scan a sample to obtain a microscopic image and a Raman spectrum, which are input into the control model to predict the action value of each candidate scanning parameter. The current scanning parameters of the Raman spectrometer are then adjusted to the candidate scanning parameters with the highest predicted action value, and the Raman spectrometer is controlled to rescan to obtain a new microscopic image and a new Raman spectrum, from which the target action value is calculated. Finally, with the target action value as the label, the new microscopic image and the new Raman spectrum are input into the control model for iterative training to optimize the network parameters of the control model. Through continuous iterative training that minimizes the difference between the predicted action value and the target action value, the control model gradually learns an optimal strategy and can adaptively adjust the scanning parameters based on the predicted action values of the candidate scanning parameters, thereby improving the accuracy and operating efficiency of scanning control and helping to ensure an optimal scanning result.
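The scan, predict, adjust, rescan, and update cycle summarized above can be illustrated with a deliberately simplified sketch. Everything here is an assumption made for illustration: a plain value table stands in for the deep Q-network, the hypothetical function `fake_scan` stands in for a real rescan and returns a single quality score in place of the target action value, and the candidate parameter values are invented.

```python
import random

# Hypothetical candidates: (laser power in %, integration time in s).
CANDIDATES = [(10, 0.5), (10, 2.0), (50, 0.5), (50, 2.0)]


def fake_scan(params, rng):
    """Stand-in for a rescan: returns a noisy quality score.

    A real system would return a microscopic image and a Raman spectrum,
    from which the target action value would then be calculated.
    """
    power, integration = params
    quality = 1.0 - abs(power - 50) / 100.0 - abs(integration - 2.0) / 4.0
    return quality + rng.gauss(0.0, 0.01)


def train(steps=500, lr=0.5, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q_values = [0.0] * len(CANDIDATES)  # stand-in for the network output
    for _ in range(steps):
        if rng.random() < epsilon:
            # Occasionally try a random candidate.
            action = rng.randrange(len(CANDIDATES))
        else:
            # Otherwise take the candidate with the highest predicted value.
            action = max(range(len(CANDIDATES)), key=q_values.__getitem__)
        target = fake_scan(CANDIDATES[action], rng)  # rescan, then score
        # Move the prediction toward the target, which acts as the label.
        q_values[action] += lr * (target - q_values[action])
    return q_values


q = train()
best = CANDIDATES[max(range(len(CANDIDATES)), key=q.__getitem__)]
```

In this toy setting the table converges on the candidate whose simulated quality is highest; a real control model would generalize across samples through its image and spectrum inputs.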

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles and effects of the invention.

Unless specifically stated or otherwise defined, the same reference numerals in different drawings denote the same or similar technical features, and different reference numerals may be used for the same or similar technical features.

FIG. 1 is a flow chart of a control model training method of a laser micro-Raman spectrometer disclosed in an embodiment of the invention;

FIG. 2 is a schematic diagram of training a control model disclosed in an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a control model training device of a laser micro-Raman spectrometer according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Reference numerals illustrate:

301. a construction unit; 302. a first scanning unit; 303. a prediction unit; 304. an adjusting unit; 305. a second scanning unit; 306. a target calculation unit; 307. a training unit; 401. a memory; 402. a processor.

Detailed Description

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. In the context of practical scenarios relating to the technical solution of the invention, all technical and scientific terms used herein may also have meanings corresponding to the purpose of the technical solution of the invention. The terms "first", "second", and the like as used herein are used merely to distinguish names and do not necessarily describe a particular quantity or order. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.

It will be understood that when an element is referred to as being "fixed" to another element, it can be directly fixed to the other element or intervening elements may also be present; when an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present; when an element is referred to as being "mounted to" another element, it can be directly mounted to the other element or intervening elements may also be present. When an element is referred to as being "disposed on" another element, it can be directly on the other element or intervening elements may also be present.

As used herein, unless specifically stated or otherwise defined, "the" means that the feature or technical content mentioned or described before in the corresponding position may be the same or similar to the feature or technical content mentioned. Furthermore, the terms "comprising," "including," and "having," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.

The embodiment of the invention discloses a control model training method of a laser micro-Raman spectrometer, which can be implemented through computer programming. The method may be executed by an electronic device such as a desktop computer, a notebook computer, or a tablet computer, or by a control model training device of a laser micro-Raman spectrometer embedded in such an electronic device; the invention is not limited in this respect. So that the invention may be readily understood, specific embodiments are described in more detail below with reference to the accompanying drawings.

As shown in fig. 1, the method includes the following steps 110-160:

110. Constructing a control model of the laser micro-Raman spectrometer based on a deep Q-network.

In the invention, a control model of the laser micro-Raman spectrometer is constructed based on a Deep Q-Network (DQN), and the control model uses a reinforcement learning framework to continuously optimize the current scanning parameter configuration and search for a better scanning parameter configuration, thereby realizing automatic adjustment of the scanning parameters of the laser micro-Raman spectrometer.

Before reinforcement learning is performed, it is necessary to ensure that the scanning parameters of the laser micro-Raman spectrometer are properly initialized and that the associated hardware is adjusted to a proper initial state. This initialization step is critical because it provides a stable and reliable starting point for the agent, which helps it explore the most effective combinations of scanning parameters. Specifically, before step 110 is executed, the following may be performed: acquiring the set initial scanning parameters, and initializing the laser micro-Raman spectrometer and adjusting its hardware according to the initial scanning parameters.

The set initial scanning parameters comprise a reasonable laser power starting range, integration time, stepping distance and the like, so as to adapt to different samples and acquisition targets. At the same time, the hardware adjustments ensure that the various components of the raman spectrometer, such as the laser, detector and sample stage, are all operating in an optimal state. The preparation work promotes a learning environment meeting the practical application requirements, and lays a foundation for the upcoming learning and parameter adjustment of the intelligent agent.

120. Controlling the laser micro-Raman spectrometer to scan the sample to obtain a first microscopic image and a corresponding first Raman spectrum.

After the preliminary parameter initialization and hardware adjustment, the laser micro-Raman spectrometer is controlled to collect spectrum data from a variety of samples, including but not limited to different biological, material, or chemical samples. In this step, the agent collects spectral information from various types of biological, chemical, and material samples in experiments and uses these data to continuously train and optimize the control model in practice. Through repeated iteration, the agent gradually learns how to adjust the scanning parameters under changing environmental conditions and sample characteristics so as to obtain the best data quality and acquisition efficiency. This continuous learning and improving adaptability in practice supports the generality and efficiency of the method, allowing it to adapt to changing practical application requirements. In addition, the data feedback collected during this process helps identify performance bottlenecks or operational problems that may exist in the spectrometer, providing valuable information for its further improvement. Through comprehensive and thorough training and optimization, the control model gradually develops strong decision-making capability, raising the level of intelligent operation of the laser micro-Raman spectrometer. The control model guides the spectrometer in collecting data from a variety of samples and is continuously trained and optimized through practice, which supports broad applicability across different types of biological, chemical, and material samples.

130. Inputting the first microscopic image and the first Raman spectrum respectively into the control model, so that the control model predicts the predicted action value of each candidate scanning parameter.

The network structure of the control model is designed to handle two different types of data, namely Raman spectrum data and microscopic image data. The overall network structure consists of the following main parts: a Raman spectrum input branch, a microscopic image input branch, a feature fusion layer, and fully connected layers. Designing the neural network architecture to handle dual inputs allows the complementarity of the two kinds of information for molecular vibration recognition to be fully exploited.

The Raman spectrum input branch is dedicated to one-dimensional data. It comprises several one-dimensional convolution layers (which may be combined using residual-network or dense-network structures) that extract first features from the Raman spectrum data, followed by an adaptive average pooling layer that reduces the dimensionality of the first features, which are then converted into a one-dimensional feature vector by a flattening operation.

The microscopic image input branch is dedicated to extracting microscopic image features. Its structure is similar to that of the Raman spectrum input branch, but it uses two-dimensional convolution layers and pooling layers designed for image data (likewise, the two-dimensional convolution layers may be combined using residual-network or dense-network structures). The branch also ends with a flattening operation: the two-dimensional convolution layers extract second features from the microscopic image data, and the pooling layer reduces the dimensionality of the second features, which are then converted into a feature vector by the flattening operation. With the two branches designed in this way, the network can learn and extract key features from each data type.

The feature fusion layer is the part of the network that integrates the features of the two data types. It first receives the first feature vector and the second feature vector output by the Raman spectrum input branch and the microscopic image input branch, and concatenates them along the feature dimension. The concatenated feature vector is combined nonlinearly by a fully connected fusion layer, and the final task-specific prediction is then made by an output layer. Depending on the task requirements, a multi-modal distillation operation may also be added to the feature fusion part to improve accuracy. The whole process is a multi-modal fusion process that aims to extract and integrate useful information from different data sources for training the DQN; the training process is shown in FIG. 2.

The feature fusion layer combines the first feature vector of the Raman spectrum data with the second feature vector of the microscopic image data, enabling data interaction and enhancement in a higher-dimensional feature space. This neural network architecture, which fuses the features of the two modalities, contributes directly to the parameter adjustment decisions made during reinforcement learning of the control model.
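As a rough illustration of the dual-input structure described above, the following dependency-free sketch runs one forward pass: a one-dimensional convolution branch for the spectrum, a two-dimensional convolution branch for the image, global average pooling in each branch, concatenation along the feature dimension, and a single linear output layer that yields one value per candidate action. The layer sizes, the random weights, and the omission of the nonlinear fusion layer, residual or dense connections, and distillation are all simplifying assumptions; a practical model would be built in a deep learning framework.

```python
import random


def conv1d(signal, kernel):
    """Valid 1-D cross-correlation followed by ReLU."""
    k = len(kernel)
    return [max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)))
            for i in range(len(signal) - k + 1)]


def conv2d(image, kernel):
    """Valid 2-D cross-correlation followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    rows, cols = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[max(0.0, sum(image[r + a][c + b] * kernel[a][b]
                          for a in range(kh) for b in range(kw)))
             for c in range(cols)] for r in range(rows)]


def mean(values):
    return sum(values) / len(values)


def q_values(spectrum, image, params):
    """Two branches, feature concatenation, one linear output layer."""
    # Raman branch: one pooled feature per 1-D kernel.
    spec_feat = [mean(conv1d(spectrum, k)) for k in params["k1d"]]
    # Image branch: one pooled feature per 2-D kernel.
    img_feat = [mean([v for row in conv2d(image, k) for v in row])
                for k in params["k2d"]]
    fused = spec_feat + img_feat  # concatenation along the feature dimension
    # Output layer: one predicted action value per candidate.
    return [sum(w * x for w, x in zip(row, fused)) + b
            for row, b in zip(params["w_out"], params["b_out"])]


rng = random.Random(0)
n_actions = 3  # hypothetical number of candidate scanning parameter sets
params = {
    "k1d": [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(4)],
    "k2d": [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
            for _ in range(4)],
    "w_out": [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(n_actions)],
    "b_out": [0.0] * n_actions,
}
spectrum = [rng.random() for _ in range(64)]
image = [[rng.random() for _ in range(16)] for _ in range(16)]
q = q_values(spectrum, image, params)
best_action = max(range(n_actions), key=q.__getitem__)
```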

Using the combined features provided by the network, the agent can adapt to spectrum acquisition across various sample types and experimental conditions, and thus automatically optimize scanning parameters such as laser power, integration time, and step distance, improving the quality of the spectrum data and the acquisition efficiency. Because the network structure of the control model accepts dual inputs of microscopic images and Raman spectra, image and spectral information can be analyzed simultaneously and more accurate parameter tuning decisions can be made accordingly.

140. Adjusting the current scanning parameters of the laser micro-Raman spectrometer to the candidate scanning parameters with the highest predicted action value, and controlling the laser micro-Raman spectrometer to rescan to obtain a second microscopic image and a corresponding second Raman spectrum.

The predicted action value refers to the estimated Q value output by the control model. New microscopic image and Raman spectrum data are obtained by applying the candidate scanning parameters with the highest predicted action value. In some embodiments, action selection may balance exploration and exploitation: "exploration" means discovering new, unknown states or behaviors to obtain more information, while "exploitation" means selecting the best action based on existing knowledge to obtain the maximum return.
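One common way to balance the two behaviors is an epsilon-greedy rule, sketched below. The source does not state which selection rule is used, so the rule itself, the function name `select_action`, and the example values are assumptions.

```python
import random


def select_action(q_values, epsilon, rng=random):
    """Return the index of the chosen candidate scanning parameter set."""
    if rng.random() < epsilon:
        # Explore: pick any candidate uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the candidate with the highest predicted action value.
    return max(range(len(q_values)), key=q_values.__getitem__)


predicted = [0.12, 0.87, 0.40]  # hypothetical predicted action values
greedy = select_action(predicted, epsilon=0.0)
explored = select_action(predicted, epsilon=1.0, rng=random.Random(0))
```

With `epsilon=0.0` the rule always takes the highest-valued candidate; a larger `epsilon` early in training, reduced over time, is a typical schedule.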

150. Calculating the target action value according to the second microscopic image and the second Raman spectrum.

In the embodiment of the invention, the target action value, also called the target Q value, refers to the immediate reward plus the historical reward calculated from the new Raman spectrum data and microscopic image. Before the target Q value is calculated, the state space, action space, penalty term, and reward function corresponding to the Raman spectrometer are first defined.

The action space defines the range of actions that can be taken and should cover all adjustable scanning parameters of the laser micro-Raman spectrometer. Specifically, the action space includes a plurality of candidate scanning parameters of the laser micro-Raman spectrometer, including but not limited to laser wavelength, laser power, scanning speed, integration time, step distance, sampling interval, and the like. Ensuring the completeness of the action space is critical, since only then can the agent fully explore the possible parameter combinations to find the settings that yield the best spectral quality and microscopic image quality. The action space allows the model to choose among continuous or discrete parameter settings to achieve fine scanning control. For example, for a common laser micro-Raman spectrometer, the adjustable parameters are: the percentage of emitted laser power; the integration time; the motor-controlled distance between the objective lens and the stage; the motor-controlled movement of the optical path and the objective lens parallel to the stage; and the switching of the objective lens. The adjustable set of actions (i.e., the action space) may then be defined as the set formed by these adjustable parameters.
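For a discrete action space, the candidate scanning parameters can be enumerated as the Cartesian product of per-parameter levels. The sketch below is only an illustration: the parameter names, units, and level values are hypothetical and are not taken from the source.

```python
from itertools import product

# Hypothetical discrete levels for each adjustable parameter.
LEVELS = {
    "laser_power_pct": [1, 10, 50, 100],
    "integration_time_s": [0.5, 1.0, 5.0],
    "z_um": [-1.0, 0.0, 1.0],           # objective-to-stage distance offset
    "x_um": [-5.0, 0.0, 5.0],           # movement parallel to the stage
    "y_um": [-5.0, 0.0, 5.0],
    "objective": ["10x", "50x", "100x"],
}

# Every combination of levels is one candidate scanning parameter set.
ACTION_SPACE = [dict(zip(LEVELS, combo)) for combo in product(*LEVELS.values())]
```

Even these few levels give 972 candidates, which shows why the number of levels per parameter has to be chosen with care when the output layer holds one value per candidate.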

The state space describes the environmental states that the agent can observe. Specifically, the state space contains a plurality of environmental states that affect Raman spectrum and microscopic image quality. They fall into two classes: Raman spectrum data features, such as signal-to-noise ratio, peak width, and intensity; and microscopic image features associated with the spectrum data, such as image sharpness, contrast, and resolution. A comprehensive state space ensures that the agent receives all the necessary information about the current environmental state, which is essential for optimizing action selection and improving the scanning parameters of the spectrometer. By defining the state space precisely and observing it carefully, the agent can evaluate the effect of its current strategy more accurately and make well-informed decisions.

Wherein the penalty term is a key parameter in the reinforcement learning model that affects the output of the model by imposing a negative reward on actions that are detrimental to equipment maintenance. Taking the scan parameters of a raman spectrometer as an example, the scan parameter adjustment involving mechanical movement or adjustment requires a corresponding penalty term, which is done in order to avoid frequent mechanical adjustments of the model in order to pursue short-term performance improvements, which may lead to excessive wear or damage to the equipment. The penalty term is set so that the model is more cautious in optimizing the scan parameters, ensuring that high quality data is acquired while reducing potential damage to the hardware. For example, if a certain parameter adjustment causes abnormal wear of a mechanical component, the penalty term will punish the behavior by means of a negative reward, thereby guiding the model to avoid such behavior in future decisions. This means that even if a certain parameter setting can improve the data quality in a short period of time, the model eventually will exclude this setting if it is detrimental to the device.
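A minimal sketch of such a penalty term follows, assuming that the penalty is the sum of fixed per-parameter costs charged whenever a setting changes. The cost values, the parameter names, and the choice to make purely electronic adjustments free are all assumptions made for illustration.

```python
# Hypothetical per-parameter penalty weights; mechanical moves cost more.
PENALTY_WEIGHTS = {
    "laser_power_pct": 0.0,     # electronic adjustment, assumed wear-free
    "integration_time_s": 0.0,
    "z_um": 0.05,
    "x_um": 0.05,
    "y_um": 0.05,
    "objective": 0.2,           # switching the objective is assumed costliest
}


def penalty(previous, current, weights=PENALTY_WEIGHTS):
    """Sum the weights of all parameters whose setting changed."""
    return sum(w for name, w in weights.items()
               if previous.get(name) != current.get(name))


prev = {"laser_power_pct": 10, "integration_time_s": 1.0,
        "z_um": 0.0, "x_um": 0.0, "y_um": 0.0, "objective": "10x"}
cur = dict(prev, laser_power_pct=50, objective="50x")
cost = penalty(prev, cur)  # only the objective switch is charged
```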

The reward term directly affects which candidate scanning parameters the control model assigns a high predicted action value, encouraging the model to make the correct adjustments to achieve higher spectral and image quality. The reward term is determined by the quality of the Raman spectrum and the quality of the microscopic image. High-quality spectra and images should correspond to higher reward values, so the model is driven to search the region of the parameter space that produces these results. In addition, to discourage the agent from fixating on a narrow region and to encourage a broad parameter search, a base reward (a small positive reward) is set as part of the model. This design helps the agent avoid premature convergence to a locally optimal solution and encourages continued exploration of unknown parameter settings, which may ultimately uncover a scanning parameter configuration better than those currently known.

By providing positive feedback to the model, the reward term tells the agent which actions are good, i.e., which scanning parameter settings result in high-quality Raman spectra and microscopic images. Based on this positive feedback, the model adjusts its strategy, tending to select candidate scanning parameters that improve spectral quality and microscopic image quality and assigning them a higher predicted action value. High-quality Raman spectra may exhibit clearer characteristic peaks and higher signal-to-noise ratios, while high-quality microscopic images may exhibit better contrast and resolution. The agent learns these good parameter combinations through accumulated experience and uses them more frequently in subsequent evaluations. The base reward acts as a small positive reward that keeps the agent from becoming too conservative during the exploration phase for lack of sufficient positive incentive. It creates a safety net: even when the agent tries a new parameter combination that does not produce high-quality output, it still receives some positive feedback, motivating it to keep exploring possible new parameter combinations.
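One plausible way to combine these quantities into a total reward is an additive form, sketched below. The source lists the inputs (signal-to-noise ratio, baseline reward, image reward, penalty term, and base reward) but not how they are combined, so the additive form, the weights, and the default base reward are assumptions.

```python
def total_reward(snr, baseline_reward, image_reward, penalty,
                 base_reward=0.01, w_snr=1.0, w_base=1.0, w_img=1.0):
    """Combine the quality terms, the base reward, and the penalty term."""
    quality = w_snr * snr + w_base * baseline_reward + w_img * image_reward
    # The base reward is always granted; the penalty is always subtracted.
    return quality + base_reward - penalty


example = total_reward(2.0, 0.5, 0.25, 0.25, base_reward=0.0)
```

In practice the signal-to-noise ratio would probably need rescaling so that it does not dominate the bounded image and baseline terms.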

Specifically, step 150 may include the following steps 1501-1504, not shown:

1501. the signal to noise ratio of the second raman spectrum is calculated.

When calculating the signal-to-noise ratio of a Raman spectrum, neither the noise-free original spectrum nor the corresponding noise can be obtained directly, so the signal-to-noise ratio can only be estimated algorithmically. The estimation method may include the following steps S11-S13, which are not illustrated:

S11, smoothing the second Raman spectrum to obtain a smoothed third Raman spectrum.

$$S_{\text{smooth}} = f(S) \qquad (1)$$

In formula (1), $f(\cdot)$ is a spectral smoothing algorithm, such as Savitzky-Golay (SG) smoothing, $S$ is the second Raman spectrum, and $S_{\text{smooth}}$ is the third Raman spectrum after smoothing.

S12, calculating to obtain Raman spectrum noise according to the second Raman spectrum and the third Raman spectrum.

The smoothed third Raman spectrum is treated as the noise-free original Raman spectrum, and the difference between the second Raman spectrum and the smoothed third Raman spectrum is calculated to obtain the Raman spectrum noise. For example, $Noise = S - S_{\text{smooth}}$, where $Noise$ represents the Raman spectrum noise, $S$ represents the second Raman spectrum, and $S_{\text{smooth}}$ represents the smoothed third Raman spectrum.

S13, calculating to obtain the signal-to-noise ratio of the second Raman spectrum according to the average value of the third Raman spectrum and the standard deviation of the Raman spectrum noise.

The signal-to-noise ratio of the Raman spectrum is calculated as follows:

$$SNR = \frac{\mu_{S_{\text{smooth}}}}{\sigma_{Noise}} \qquad (2)$$

In formula (2), $\mu_{S_{\text{smooth}}}$ is the mean of the smoothed third Raman spectrum $S_{\text{smooth}}$, and $\sigma_{Noise}$ is the standard deviation of the Raman spectrum noise.
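Steps S11-S13 can be sketched as follows. To stay dependency-free, this sketch swaps in a centered moving average for the SG smoothing named in the text; the window size, the small constant `eps` that guards against division by zero, and the synthetic test spectra are assumptions.

```python
import math
import random
from statistics import fmean, pstdev


def moving_average(spectrum, window=5):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    return [fmean(spectrum[max(0, i - half):i + half + 1])
            for i in range(len(spectrum))]


def estimate_snr(spectrum, window=5, eps=1e-12):
    smooth = moving_average(spectrum, window)           # S11: smoothed spectrum
    noise = [s - m for s, m in zip(spectrum, smooth)]   # S12: residual noise
    return fmean(smooth) / (pstdev(noise) + eps)        # S13: mean over std


# Synthetic spectrum: one Gaussian peak on a flat offset, plus noise.
rng = random.Random(0)
peak = [10.0 * math.exp(-((i - 100) / 8.0) ** 2) + 1.0 for i in range(200)]
low_noise = [p + rng.gauss(0.0, 0.05) for p in peak]
high_noise = [p + rng.gauss(0.0, 0.5) for p in peak]
snr_low_noise = estimate_snr(low_noise)
snr_high_noise = estimate_snr(high_noise)
```

The estimate is larger for the less noisy spectrum. Smoothing also flattens sharp peaks slightly, so part of the residual is peak distortion rather than noise, which is one reason a peak-preserving filter such as SG may be preferred.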

1502. A baseline reward of the second Raman spectrum is calculated.

For example, step 1502 may include the following steps S21-S23, not shown:

s21, carrying out baseline correction on the second Raman spectrum to obtain a corrected spectrum after baseline correction.

S22, obtaining a spectrum difference value between the second Raman spectrum and the correction spectrum.

Wherein the spectrum difference is represented by DWherein S represents the second Raman spectrum,/>Representing the corrected spectrum.

And S23, calculating to obtain a baseline reward of the second Raman spectrum according to the spectrum difference value.

The reward obtained by evaluating the baseline of the spectrum is given by the following formula (3):

（3）

In formula (3), $D$ is the spectrum difference, $\sigma_D$ is the standard deviation of $D$, $\mu_D$ is the mean of $D$, $\epsilon$ is a small positive number that ensures the denominator is not zero, and $\max|D|$ is the largest absolute value in the difference sequence.
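Steps S21 and S22, together with the quantities that formula (3) draws on, can be sketched as follows. The baseline-correction algorithm is not specified in this text, so a sliding-window minimum is used purely as an assumed placeholder. The sketch stops at the statistics of the spectrum difference and does not combine them into a reward, because the exact form of formula (3) is not reproduced here. All function names are hypothetical.

```python
import statistics


def rolling_min_baseline(spectrum, window=11):
    """Crude baseline estimate: minimum over a sliding window (assumption)."""
    half = window // 2
    n = len(spectrum)
    return [min(spectrum[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def baseline_difference_stats(spectrum, window=11):
    """Return the std, mean and largest absolute value of the spectrum difference D."""
    baseline = rolling_min_baseline(spectrum, window)
    corrected = [s - b for s, b in zip(spectrum, baseline)]   # S21: corrected spectrum
    diff = [s - c for s, c in zip(spectrum, corrected)]       # S22: D = S - corrected
    return {
        "std": statistics.pstdev(diff),
        "mean": statistics.fmean(diff),
        "max_abs": max(abs(d) for d in diff),
    }


# Hypothetical spectra: a flat offset with one peak, and a tilted baseline.
flat = [2.0] * 30
flat[15] = 10.0
tilted = [0.1 * i for i in range(30)]
flat_stats = baseline_difference_stats(flat)
tilted_stats = baseline_difference_stats(tilted)
print(flat_stats, tilted_stats["std"] > flat_stats["std"])
```

A flat baseline gives a difference sequence with no spread, while a tilted baseline gives a larger spread, which is the property a baseline evaluation needs to distinguish.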

1503. An image reward for the second microscopic image is calculated.

In this step, a first metric value of contrast, a second metric value of image signal-to-noise ratio, a third metric value of sharpness and a fourth metric value of image distortion are first calculated from the second microscopic image; the image reward of the second microscopic image is then calculated from these four metric values.

By way of example, the image reward calculated from the second microscopic image is denoted IQE and is calculated as follows:

（4）

where Contrast represents the first metric value, SNR represents the second metric value, Definition represents the third metric value (sharpness), Distortion represents the fourth metric value, and $w_c$, $w_s$, $w_R$ and $w_D$ are the weights of the first, second, third and fourth metric values, respectively.
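The weighted combination of the four metric values can be sketched as follows. The exact form of formula (4) is not reproduced in this text, so a plain linear sum is assumed, and the sign with which the distortion metric enters is carried by its weight. The function name `image_reward` and all numeric values are hypothetical.

```python
def image_reward(contrast, snr, definition, distortion, weights):
    """Weighted combination of the four image metrics (assumed linear form)."""
    w_c, w_s, w_r, w_d = weights
    return w_c * contrast + w_s * snr + w_r * definition + w_d * distortion


# Hypothetical metric values and weights; a negative w_d makes distortion a cost.
iqe = image_reward(0.5, 0.25, 0.75, 0.5, weights=(1.0, 2.0, 1.0, -1.0))
print(iqe)
```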

The first metric value, the contrast of the microscopic image, is calculated as follows:

（5）

In formula (5), $I$ is the second microscopic image, $\mathrm{gray}(\cdot)$ denotes obtaining the grayscale map of an image, $\mathrm{std}(\cdot)$ denotes taking the standard deviation of the grayscale map, and $\tanh(\cdot)$ is the hyperbolic tangent function, which maps the score into a bounded range.
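The contrast metric described above (grayscale conversion, standard deviation, hyperbolic tangent) can be sketched as follows. Several details are assumptions made for illustration: the image is a list of rows of (r, g, b) tuples with values from 0 to 255, the grayscale conversion uses the common 0.299/0.587/0.114 luma weights, and the standard deviation is divided by a hypothetical `scale` before `tanh` is applied.

```python
import math
import statistics


def to_gray(image):
    """Convert rows of (r, g, b) tuples to a flat list of luma values."""
    return [0.299 * r + 0.587 * g + 0.114 * b for row in image for (r, g, b) in row]


def contrast_metric(image, scale=255.0):
    """tanh of the (scaled) standard deviation of the grayscale map."""
    gray = to_gray(image)
    return math.tanh(statistics.pstdev(gray) / scale)


# Hypothetical 4x4 test images: a uniform gray field and a black/white checkerboard.
uniform = [[(10, 10, 10)] * 4 for _ in range(4)]
checker = [[(255, 255, 255) if (x + y) % 2 else (0, 0, 0) for x in range(4)]
           for y in range(4)]
print(contrast_metric(uniform), contrast_metric(checker))
```

The uniform image scores zero and the checkerboard scores strictly between 0 and 1, since `tanh` saturates for large standard deviations.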

The second metric value, the image signal-to-noise ratio, is calculated as follows:

（6）

In formula (6), $\nabla^2$ is the Laplace operator.

The third metric value, the sharpness of the microscopic image, is calculated as follows:

（7）

In formula (7), $\mathrm{mean}(\cdot)$ is the mean of the image, $\mathrm{var}(\cdot)$ denotes taking the variance of the grayscale map, and $\epsilon$ is a small positive number that ensures the denominator is not zero.

The fourth metric value, the distortion of the microscopic image, is calculated as follows:

（8）

In formula (8), $N_{\text{lines}}$ is the number of lines detected in the second microscopic image $I$ by the Hough line transform, and $N_{\text{pixels}}$ is the total number of pixels of the second microscopic image.

1504. A total reward value is calculated as the target action value from the signal-to-noise ratio and the baseline reward of the second Raman spectrum, the image reward of the second microscopic image, and the penalty terms.

In an embodiment of the invention, a reward function is first set, comprising a reward term and a penalty term, wherein the reward term reflects the quality of the second microscopic image and the second raman spectrum, and the penalty term prevents damage to spectrometer hardware.

The total reward value $R(s, a)$ is calculated by the following formula (9):

In formula (9), $R_{\text{base}}$ is the basic reward for making a decision and is a configurable value. SNR is the signal-to-noise ratio of the second Raman spectrum. B represents BaselineEval, the baseline reward obtained by evaluating the baseline of the second Raman spectrum; an ideal Raman spectrum baseline is flat, free of fluctuation outside the Raman peaks, and as close to 0 as possible. IQE represents the image reward, which integrates multiple factors affecting microscopic image quality, including sharpness, detail contrast and background noise. $P_{\text{power}}$ is a penalty term that increases with laser power, so that good Raman spectrum signals tend to be obtained at low power. $P_{\text{time}}$ is a penalty term that increases with integration time, which avoids delaying the experimental result. $P_{\text{objective}}$ is a penalty for switching the objective lens, which avoids damage to the objective lens caused by frequent switching. $P_x$, $P_y$ and $P_z$ are penalties for running the drive motors in each direction; the larger the distance moved, the larger the penalty. $P_{\text{range}}$ is a very large penalty applied when the movement range of the objective lens is exceeded, because exceeding the range may damage the instrument. The remaining symbols are the corresponding weight coefficients.
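The structure of the reward function (reward terms for data quality, penalty terms that protect the hardware) can be sketched as follows. Formula (9) itself is not reproduced in this text, so the sketch assumes that weighted reward terms are added to the basic reward and weighted penalty terms are subtracted. The function names, the dictionary keys and the constant used for the out-of-range penalty are all hypothetical.

```python
def range_penalty(position, lower, upper, large=1000.0):
    """Very large penalty once the stage leaves its allowed travel (assumed constant)."""
    return 0.0 if lower <= position <= upper else large


def total_reward(base, snr, baseline, iqe, penalties, reward_weights, penalty_weights):
    """Assumed form: base + weighted reward terms - weighted penalty terms."""
    w_snr, w_b, w_iqe = reward_weights
    reward = base + w_snr * snr + w_b * baseline + w_iqe * iqe
    for name, value in penalties.items():
        reward -= penalty_weights[name] * value
    return reward


# Hypothetical values: same scan quality, stage inside or outside its travel range.
weights = {"power": 1.0, "time": 1.0, "range": 1.0}
safe = total_reward(
    1.0, 2.0, 0.5, 0.5,
    penalties={"power": 0.5, "time": 0.25, "range": range_penalty(5.0, 0.0, 10.0)},
    reward_weights=(1.0, 1.0, 1.0), penalty_weights=weights)
unsafe = total_reward(
    1.0, 2.0, 0.5, 0.5,
    penalties={"power": 0.5, "time": 0.25, "range": range_penalty(12.0, 0.0, 10.0)},
    reward_weights=(1.0, 1.0, 1.0), penalty_weights=weights)
print(safe, unsafe)
```

With identical scan quality, the action that drives the stage out of range receives a far lower total reward, so a model trained on such targets would be steered away from it.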

160. With the target action value as a label, the second microscopic image and the second Raman spectrum are input into the control model for iterative training, so as to optimize the network parameters of the control model.

During iterative training, the network parameters of the control model are updated by minimizing a mean square error loss function of the Q value, so that the estimated Q value progressively approaches the target Q value.
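The mean square error loss on the Q value, and its effect under gradient descent, can be sketched as follows. A one-parameter toy model `Q = weight * feature` replaces the deep Q network so that the example stays self-contained; the data, the learning rate and the function names are hypothetical.

```python
def mse_loss(q_estimates, q_targets):
    """Mean of squared differences between estimated and target Q values."""
    return sum((q - t) ** 2 for q, t in zip(q_estimates, q_targets)) / len(q_targets)


def sgd_step(weight, features, q_targets, lr=0.1):
    """One gradient-descent step on the MSE loss for the toy model Q = weight * feature."""
    n = len(features)
    grad = sum(2.0 * (weight * x - t) * x for x, t in zip(features, q_targets)) / n
    return weight - lr * grad


features = [1.0, 2.0, 3.0]
targets = [2.0, 4.0, 6.0]  # hypothetical target Q values, consistent with weight = 2
w = 0.0
losses = []
for _ in range(20):
    losses.append(mse_loss([w * x for x in features], targets))
    w = sgd_step(w, features, targets)
print(losses[0], losses[-1], w)
```

The loss shrinks across iterations as the estimated Q values move toward the targets, which is the behavior described for the control model.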

In summary, embodiments of the invention apply a reinforcement learning algorithm to the adaptive parameter optimization of the laser micro-Raman spectrometer. In this method, the agent stands in for an actual laser Raman spectrometer and adjusts the scanning parameters adaptively, based on observed experimental data. The agent uses the reinforcement learning algorithm to calculate a reward or penalty for each feasible action taken in the current state, in order to optimize the scanning parameters. Specifically, through repeated interaction with the actual state and execution of actions, the agent gradually learns the scanning parameter settings that give the best performance, so as to obtain more accurate and reliable Raman spectrum and microscopic image results. Thus, based on an analysis of the performance and key operating points of the laser micro-Raman spectrometer, DQN reinforcement learning is used to adjust its scanning parameters automatically. The method reduces the labor of manual parameter adjustment, reduces the uncertainty that manual adjustment introduces, and improves scanning accuracy and operating efficiency. Actual testing and result analysis indicate that the invention improves the operating workflow, and it has application prospects wherever laser micro-Raman spectrometers are used, including medicine, environmental science, chemistry and physics.

As shown in fig. 3, an embodiment of the invention discloses a control model training device of a laser micro-Raman spectrometer, which comprises a construction unit 301, a first scanning unit 302, a prediction unit 303, an adjustment unit 304, a second scanning unit 305, a target calculation unit 306 and a training unit 307, wherein:

The construction unit 301 is configured to construct a control model of the laser micro-Raman spectrometer based on a deep Q network;

The first scanning unit 302 is configured to control the laser micro-Raman spectrometer to scan the sample to obtain a first microscopic image and a corresponding first Raman spectrum;

The prediction unit 303 is configured to input the first microscopic image and the first Raman spectrum into the control model, so that the control model predicts a predicted action value for each candidate scanning parameter;

The adjustment unit 304 is configured to adjust the current scanning parameter of the laser micro-Raman spectrometer to the candidate scanning parameter with the highest predicted action value;

The second scanning unit 305 is configured to control the laser micro-Raman spectrometer to scan again to obtain a second microscopic image and a corresponding second Raman spectrum;

The target calculation unit 306 is configured to calculate a target action value from the second microscopic image and the second Raman spectrum;

The training unit 307 is configured to input the second microscopic image and the second Raman spectrum into the control model for iterative training, with the target action value as a label, so as to optimize the network parameters of the control model.
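The way units 302 to 307 cooperate in one training iteration can be sketched as follows. The callables passed in (`scan`, `predict_values`, `apply_parameters`, `compute_target`, `train_step`) are hypothetical placeholders for instrument control and the control model, not real APIs, and the lambdas at the bottom are stubs used only to exercise the flow.

```python
def training_iteration(scan, predict_values, apply_parameters, compute_target, train_step):
    """One pass through the scan, predict, adjust, rescan, target and train steps."""
    image_1, spectrum_1 = scan()                     # first scanning unit
    values = predict_values(image_1, spectrum_1)     # prediction unit: value per candidate
    best = max(values, key=values.get)               # candidate with highest predicted value
    apply_parameters(best)                           # adjustment unit
    image_2, spectrum_2 = scan()                     # second scanning unit
    target = compute_target(image_2, spectrum_2)     # target calculation unit
    train_step(image_2, spectrum_2, target)          # training unit: target used as label
    return best, target


# Stub components standing in for the spectrometer and the model (hypothetical).
applied = []
trained = []
best, target = training_iteration(
    scan=lambda: ("image", [1.0, 2.0, 3.0]),
    predict_values=lambda img, spec: {"low_power": 0.2, "high_power": 0.7},
    apply_parameters=applied.append,
    compute_target=lambda img, spec: sum(spec),
    train_step=lambda img, spec, tgt: trained.append(tgt),
)
print(best, target)
```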

As an alternative embodiment, the target calculation unit 306 includes the following subunits, which are not shown in the figures:

Optionally, the first calculation subunit is specifically configured to smooth the second Raman spectrum to obtain a smoothed third Raman spectrum; calculate the Raman spectrum noise from the second Raman spectrum and the third Raman spectrum; and calculate the signal-to-noise ratio of the second Raman spectrum from the mean of the third Raman spectrum and the standard deviation of the Raman spectrum noise.

Optionally, the second calculation subunit is specifically configured to perform baseline correction on the second Raman spectrum to obtain a baseline-corrected spectrum; obtain the spectrum difference between the second Raman spectrum and the corrected spectrum; and calculate the baseline reward of the second Raman spectrum from the spectrum difference.

Optionally, the third calculation subunit is specifically configured to calculate, from the second microscopic image, a first metric value of contrast, a second metric value of image signal-to-noise ratio, a third metric value of sharpness and a fourth metric value of image distortion of the second microscopic image; and calculate the image reward of the second microscopic image from the first metric value, the second metric value, the third metric value and the fourth metric value.

As shown in fig. 4, an embodiment of the present invention discloses an electronic device including a memory 401 storing executable program code and a processor 402 coupled with the memory 401;

The processor 402 invokes the executable program code stored in the memory 401 to execute the control model training method of the laser micro-Raman spectrometer described in the above embodiments.

The embodiment of the invention also discloses a computer readable storage medium storing a computer program, wherein the computer program causes a computer to execute the control model training method of the laser micro-Raman spectrometer described in the above embodiments.

The foregoing embodiments are provided to illustrate the technical solution of the present invention by example, to describe its technical solution, purpose and effects fully, and to help the public understand the disclosure more thoroughly and comprehensively; they are not intended to limit the protection scope of the present invention.

The above examples are also not an exhaustive list based on the invention, and there may be a number of other embodiments not listed. Any substitutions and modifications made without departing from the spirit of the invention are within the scope of the invention.

Claims

1. The control model training method of the laser micro-Raman spectrometer is characterized by comprising the following steps of:

inputting the second microscopic image and the second Raman spectrum into the control model for iterative training by taking the target action value as a label so as to optimize network parameters of the control model;

Wherein calculating the target action value from the second microscopic image and the second raman spectrum comprises:

calculating the signal-to-noise ratio of the second Raman spectrum;

Calculating a baseline reward of the second Raman spectrum;

calculating an image reward of the second microscopic image;

Calculating a total reward value as the target action value according to the signal-to-noise ratio and the baseline reward of the second Raman spectrum, the image reward of the second microscopic image and the penalty terms;

Wherein calculating the baseline reward of the second Raman spectrum comprises:

Calculating a baseline reward of the second Raman spectrum according to the spectrum difference value;

Wherein calculating the image rewards of the second microscopic image comprises:

2. The control model training method of a laser micro-Raman spectrometer according to claim 1, wherein calculating the signal-to-noise ratio of the second Raman spectrum comprises:

Smoothing the second Raman spectrum to obtain a smoothed third Raman spectrum;

3. A control model training device for a laser micro-Raman spectrometer, comprising:

The training unit is used for inputting the second microscopic image and the second Raman spectrum into the control model to perform iterative training by taking the target action value as a label so as to optimize network parameters of the control model;

wherein the target computing unit comprises the following subunits:

A fourth calculation subunit, configured to calculate, according to the signal-to-noise ratio and the baseline reward of the second raman spectrum, the image reward of the second microscopic image, and the penalty term, a total reward value as a target action value;

the second calculation subunit is specifically configured to perform baseline correction on the second raman spectrum, so as to obtain a corrected spectrum after baseline correction; obtaining a spectrum difference value between the second Raman spectrum and the correction spectrum; calculating a baseline reward of the second Raman spectrum according to the spectrum difference value;

The third calculation subunit is specifically configured to calculate, according to the second microscopic image, a first measurement value of contrast, a second measurement value of signal-to-noise ratio of the image, a third measurement value of sharpness, and a fourth measurement value of distortion of the image; and calculating to obtain the image rewards of the second microscopic image according to the first metric value, the second metric value, the third metric value and the fourth metric value.

4. The control model training device of a laser micro-Raman spectrometer according to claim 3, wherein the first calculation subunit is specifically configured to smooth the second Raman spectrum to obtain a smoothed third Raman spectrum; calculate the Raman spectrum noise from the second Raman spectrum and the third Raman spectrum; and calculate the signal-to-noise ratio of the second Raman spectrum from the mean of the third Raman spectrum and the standard deviation of the Raman spectrum noise.

5. An electronic device comprising a memory storing executable program code and a processor coupled to the memory; the processor invokes the executable program code stored in the memory to perform the control model training method of the laser micro-Raman spectrometer of claim 1 or 2.