CN116109982B - Biological sample collection validity checking method based on artificial intelligence

Biological sample collection validity checking method based on artificial intelligence

Info

Publication number: CN116109982B (application number CN202310121257.1A)
Authority: CN (China)
Prior art keywords: input, output, function, conv2d, algorithm
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN116109982A
Inventors: 刘志岩, 张东
Current Assignee / Original Assignee: Harbin Xingyun Intelligent Manufacturing Technology Co ltd
Application filed by Harbin Xingyun Intelligent Manufacturing Technology Co ltd; priority date and filing date: 2023-02-16
Publication of CN116109982A: 2023-05-12; publication of CN116109982B (grant): 2023-07-28


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/41 — Scenes; scene-specific elements in video content; higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V10/764 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning; using classification, e.g. of video objects
    • G06V10/82 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning; using neural networks
    • G06V10/955 — Hardware or software architectures specially adapted for image or video understanding; using specific electronic processors
    • G16H — HEALTHCARE INFORMATICS
    • G16H50/80 — ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • Y02A90/10 — Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention discloses an artificial intelligence-based biological sample collection validity test method, which comprises the following steps: S1, acquiring oral cavity images and pharyngeal swab collection process images of a user through a high-definition video camera; S2, preprocessing the images to obtain video frames; S3, calculating the polygon vertex coordinates of the pharyngeal collection area through a collection-site indication algorithm; S4, receiving an input video frame through a swab-head recognition algorithm, outputting the polygon vertex coordinates of the pharyngeal swab head in the video frame, and recognizing the swab-head portion; S5, comprehensively evaluating the contact degree and contact position, in combination with the original image, through a swab-head/collection-site contact detection algorithm, and giving a corresponding score; S6, detecting and judging a number of consecutive video frames of the video stream through an action completion evaluation algorithm. The invention solves the problem that the prior art lacks a method for recognizing, without a mechanical arm, whether the collection site and the swab head have made effective contact during pharyngeal swab collection and whether the test has been completed.

Description

Biological sample collection validity checking method based on artificial intelligence
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to an artificial intelligence-based biological sample collection validity checking method.
Background
At present, nucleic acid testing is an effective measure for responding to COVID-19 outbreaks, and the nucleic acid testing field generally comprises a collection end and a testing end. At the collection end, manual collection is still dominant, and each sampling booth must be staffed with a data entry clerk and a nurse to collect pharyngeal swabs. To save human resources, unattended automatic nucleic acid collection booths are one direction currently being researched by companies in this field.
Most current unattended automatic collection booths are implemented by having a robot use a mechanical arm and gripper to hold the swab and extend it into the mouth of the person being tested to collect the pharyngeal swab. This approach has three distinct disadvantages: 1) the mechanical arm puts considerable psychological pressure on the person being tested; 2) during the arm's travel, evasion and movement by the person being tested can cause collection failure, resulting in long collection times and low collection efficiency; and 3) the purchase and maintenance costs of the mechanical arm are high, which is unfavorable for wide deployment and upkeep.
The defect of the prior art is the lack of a method for recognizing whether the pharyngeal collection site and the swab head have made effective contact during pharyngeal swab collection and whether the test has been completed, without relying on a mechanical arm.
Disclosure of Invention
To address the problem that the prior art lacks a way to recognize, without a mechanical arm, whether the collection site and the swab head have made effective contact during pharyngeal swab collection and whether the test has been completed, the invention provides an artificial intelligence-based biological sample collection validity test method.
In order to achieve the technical purpose, the invention adopts the following technical scheme:
an artificial intelligence based biological sample collection validity test method comprises the following steps:
s1, acquiring an oral cavity image and a pharyngeal swab acquisition process image of a user through a high-definition video camera;
s2, preprocessing the image to obtain a video frame;
s3, calculating polygon vertex coordinates of a pharyngeal collecting area through a collecting part indication algorithm;
s4, receiving an input video frame through a swab head identification algorithm, outputting polygon vertex coordinate information of a pharyngeal swab head in the video frame, and identifying a pharyngeal swab head part;
s5, comprehensively evaluating the contact degree and the contact position by combining an original image through a swab head and detection position contact detection algorithm, and giving corresponding scores;
s6, detecting and judging a plurality of consecutive video frames of the video stream through an action completion evaluation algorithm, and, when the collection action recorded by the video stream reaches the condition specified by the algorithm, outputting a classification result that the collection action has been completed. The overall flow is illustrated by the sketch below.
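By way of illustration only, the flow of steps S1-S6 can be sketched in Python as follows; the callable parameters, function names and the 0.5 per-frame cutoff are assumptions introduced here for readability and are not part of the claimed method.

```python
from collections import deque
from typing import Any, Callable, Iterable

def check_collection_validity(frames: Iterable[Any],
                              preprocess: Callable[[Any], Any],
                              contact_probability: Callable[[Any], float],
                              fps: int,
                              score_threshold: int,
                              cp_threshold: float = 0.5) -> bool:
    # S1: `frames` are the video frames captured by the high-definition camera.
    # S2-S5 are wrapped inside the two callables passed in by the caller; the callables
    # and the 0.5 per-frame cutoff are assumptions for illustration.
    window = deque(maxlen=fps * 2)                    # decisions J for the frames of the last 2 seconds
    for raw in frames:
        frame = preprocess(raw)                       # S2: e.g. YUV -> BGR and center crop
        cp = contact_probability(frame)               # S3-S5: contact probability in (0, 1)
        window.append(1 if cp >= cp_threshold else 0) # S6: per-frame decision J
        if sum(window) >= score_threshold:            # threshold chosen from (0, fps*2 - 1)
            return True                               # collection action judged complete
    return False
```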
Further, the acquisition part indication algorithm is a UNet image segmentation algorithm, and the definition of the UNet network structure function of the UNet image segmentation algorithm comprises the following steps:
S301, defining a Conv2d two-dimensional image convolution function, the calculation formula being:
output(N_i, C_out_j) = bias(C_out_j) + Σ_{k=0}^{C_in−1} weight(C_out_j, k) ⋆ padding(input(N_i, k), P)
wherein:
the ⋆ symbol is the cross-correlation operator, implemented here using a matrix convolution operation;
N is the number of image frames input in one calculation, i.e., the batch-size value;
C_in is the number of channels of the input image and C_out is the number of channels of the output image; for the original RGB image C = 3, and for an intermediate-layer feature map it is the actual number of feature-map channels;
input represents the input image four-dimensional tensor, whose shape is a four-tuple (image width in pixels, image height in pixels, number of image channels, batch-size value).
output represents the output image four-dimensional tensor, whose shape is a four-tuple (image width in pixels, image height in pixels, number of image channels, batch-size value).
The padding function expands the input four-dimensional image tensor by P pixel values.
weight represents the convolution kernel weight parameter, a four-dimensional tensor whose shape is a four-tuple (convolution kernel width, convolution kernel height, number of output image channels, batch-size value). The Conv2d operation is specified to use a fixed 3×3 convolution kernel, and the stride S is fixed to 1.
bias represents the bias weight tensor of this Conv2d calculation, and its shape is consistent with weight.
The input-output image size of Conv2d is calculated by the following formula:
W_out = (W_in − ks + 2P) / S + 1
H_out = (H_in − ks + 2P) / S + 1
wherein:
W_in is the input image size;
W_out is the output image size;
ks is the convolution kernel size;
P and S are consistent with the Conv2d operation definition;
in the definition of the Conv2d operation we take P = 1, S = 1 and ks = 3, so the output image size of the Conv2d operation is the same as the input: W_out = W_in, H_out = H_in.
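A minimal sketch of this fixed 3×3, stride-1, padding-1 convolution, assuming PyTorch as the implementation framework (the patent does not name one):

```python
import torch
import torch.nn as nn

# With kernel size ks = 3, stride S = 1 and padding P = 1, the output spatial size equals
# the input: W_out = (W_in - 3 + 2*1) / 1 + 1 = W_in.
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1)
x = torch.randn(8, 3, 512, 512)   # one batch of N = 8 RGB frames, 512 x 512
y = conv(x)
print(y.shape)                    # torch.Size([8, 64, 512, 512]) - spatial size preserved
```

The Class2d classification function defined next would correspond to the same call with kernel_size=1.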
S302, defining a classification function, which uses a fixed 1×1 convolution kernel and is otherwise completely consistent with Conv2d; it is denoted Class2d;
s303, defining a Maxpool function:
the max operation applies a 2×2 kernel with stride 2 and takes the element-wise maximum, giving the following input-output size relation:
W_out = W_in / 2
H_out = H_in / 2
s304, defining a Relu function as an activation function of a conv2d function:
output(N, k, W, H) = relu(input(N, k, W, H)) = max(0, input(N, k, W, H))
S305, defining an upsampling function upsampled:
wherein weight is the upsampling weight tensor.
S306, defining a channel feature map stacking function:
wherein C_out = C_in1 + C_in2; the + operator simply stacks, along the channel dimension, two feature maps of the same spatial size.
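The Maxpool, Relu, upsampled and channel-cat operations of S303-S306 can be illustrated with the following sketch (PyTorch assumed; the transposed convolution is only one possible realization of the weighted upsampled function, which is not spelled out above):

```python
import torch
import torch.nn as nn

x = torch.randn(8, 64, 512, 512)                 # (N, C, H, W) feature map

pool = nn.MaxPool2d(kernel_size=2, stride=2)
p = pool(x)                                      # (8, 64, 256, 256): W_out = W_in / 2, H_out = H_in / 2

r = torch.relu(p)                                # element-wise max(0, x)

# "upsampled" uses a weight tensor; a learned transposed convolution that doubles the
# spatial size is an assumed realization.
up = nn.ConvTranspose2d(64, 64, kernel_size=2, stride=2)
u = up(r)                                        # (8, 64, 512, 512)

c = torch.cat([x, u], dim=1)                     # channel cat: C_out = 64 + 64 = 128
print(p.shape, u.shape, c.shape)
```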
S307, defining a feature map residual network action judging function resnet:
firstly defining the residual network feature extraction function of a certain layer l:
output_{l+1} = input_l + F(input_l, weight_l)
where input and output are the input and output tensors, whose size is determined by (N, C, W, H).
The F calculation consists of a single sequential combination of an arithmetic normalization of each element of the tensor, a Conv2d calculation and a Relu calculation.
Continuing to define the residual network feature extraction function for any layer L starting from layer 1:
output_L = input_1 + Σ_{l=1}^{L−1} F(input_l, weight_l)
s308 defines a full link layer fc and a softmax layer function fc_softmax,
the softmax function is:
softmax(x_i) = e^{x_i} / Σ_j e^{x_j}
the fc_softmax function is:
output_o = fc_softmax(input_i) = softmax(weight_{i,o} × input_i + bias_{i,o})
where weight is the linear weight matrix and bias is the bias matrix.
To pass through the fully connected layer, the tensor of size (N, C, W, H) is first flattened into a batch of one-dimensional tensors of size (N, C×W×H); these are then calculated with the weight matrix and the bias matrix to obtain the final probability values of meeting and of not meeting the collection-action completion criterion.
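A minimal sketch of the resnet feature-extraction step and the fc_softmax head (PyTorch assumed; BatchNorm is an assumed choice for the "arithmetic normalization", which is not specified further):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    # One residual step: output_{l+1} = input_l + F(input_l, weight_l), where F is
    # normalization -> Conv2d -> Relu.
    def __init__(self, channels: int):
        super().__init__()
        self.norm = nn.BatchNorm2d(channels)
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, stride=1, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + torch.relu(self.conv(self.norm(x)))

class FcSoftmax(nn.Module):
    # Flatten (N, C, W, H) to (N, C*W*H), apply the linear weight and bias, then softmax
    # over the two classes (meets / does not meet the completion criterion).
    def __init__(self, in_features: int, num_classes: int = 2):
        super().__init__()
        self.fc = nn.Linear(in_features, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.softmax(self.fc(torch.flatten(x, start_dim=1)), dim=1)
```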
Further, on the basis of the UNet image segmentation algorithm, the steps of constructing the acquisition part indication algorithm are as follows:
S3001, a 1920×1080 YUV video frame input_001 is input through the video acquisition device;
S3002, image preprocessing: the color channels of input_001 are converted to BGR channels, and the image is center-cropped to obtain a 512×512×3 tensor output_002, where the first two 512 values represent the image width and height and the third value 3 represents the blue, green and red color component channels;
S3003, setting the input video frame of each step as input_n = relu(output_{n−1}) or input_n = output_{n−1}, and performing Conv2d, Maxpool, upsampled and channel-cat calculations until the 64-channel feature map output_036 of the collection-site indication area is obtained;
S3004, repeating step S3003 to obtain the 64-channel feature map of the swab-head identification area, denoted output′_036;
S3005, setting input_037 = relu(output_036) and performing a Class2d operation:
output_037(N=8, C_out=2, W=512, H=512) = Class2d(input_037, N=8, C_in=64, C_out=2, W=512, H=512, P=1)
Through the above steps, the pixel classification information of the collection-site indication area and of the swab-head identification area is obtained respectively; one class is the background pixels and the other class is the collection-site or swab-head pixels.
The two classifications are randomly initialized when the model first runs, and the accuracy of the classification information is gradually improved through the training steps below.
Further, cross-entropy evaluation is carried out on the pixel classification information of step S3005 through the training loss function, and an inverse gradient calculation is performed; the training loss function is defined using the cross-entropy function, whose formula is:
H(p, q) = −Σ_x p(x) log q(x)
The two-class calculation result output_037 is evaluated with the cross entropy against the two-class target labels of the annotated input video, and the inverse gradient calculation is performed. The weight and bias values in the Conv2d, Class2d, upsampled and other functions are updated so that H(p, q) tends to a minimum, yielding the collection-site indication and swab-head recognition algorithm model. The collection-site and swab-head pixels indicated by the collection-site indication and swab-head recognition algorithms are blended with the original pixel values in different colors, so that the collection site and the swab head are marked in the video frame, guiding the person being tested through the collection operation.
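A hedged sketch of one such training step (PyTorch assumed; the optimizer is left to the caller, and nn.CrossEntropyLoss plays the role of the cross entropy H(p, q)):

```python
import torch
import torch.nn as nn

def train_step(model: nn.Module, frames: torch.Tensor, labels: torch.Tensor,
               optimizer: torch.optim.Optimizer) -> float:
    # `frames` is the preprocessed batch (N, 3, 512, 512); `labels` holds the annotated
    # per-pixel classes (N, 512, 512), 0 = background, 1 = collection site / swab head.
    criterion = nn.CrossEntropyLoss()       # cross entropy H(p, q) over the two classes
    logits = model(frames)                  # two-channel per-pixel result, cf. output_037
    loss = criterion(logits, labels)
    optimizer.zero_grad()
    loss.backward()                         # inverse (backward) gradient calculation
    optimizer.step()                        # update weight/bias so that H(p, q) decreases
    return loss.item()
```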
Further, the construction of the contact detection algorithm continues on the basis of the result output_036 of step S3003:
S5001, setting input_038 = relu(output_036) and input′_038 = relu(output′_036), and performing a channel cat operation;
S5002, setting input_039 = output_038 and performing a Conv2d operation:
output_039(N=8, C_out=64, W=512, H=512) = Conv2d(input_039, N=8, C_in=128, C_out=64, W=512, H=512, P=1)
S5003, setting input_040 = relu(output_039) and performing a Conv2d operation:
output_040(N=8, C_out=64, W=512, H=512) = Conv2d(input_040, N=8, C_in=64, C_out=64, W=512, H=512, P=1)
S5004, setting input_041 = relu(output_040) and performing a resnet_8 calculation with network depth 8:
output_041(N=8, C_out=64, W=512, H=512) = resnet_8(input_041, N=8, C_in=64, W=512, H=512, P=1)
S5005, setting input_042 = output_041 and performing the fc_softmax feature probability discrimination, the discrimination results being a contact class and a non-contact class:
output_042(N=8, o=2) = fc_softmax(input_042(N=8, i=64))
S5006, through steps S5001-S5005, the contact probability index of the pharyngeal swab with the pharyngeal collection site in the current video frame is obtained, with a value in the range (0, 1).
Further, after the contact detection algorithm provides the contact probability value, an action completion evaluation algorithm is finally defined to make the final decision on whether the sampling action has been completed:
defining a decision function:
defining a scoring function:
wherein fps is the current video-stream frame rate and cp is the confidence probability; the current collection-action completion score is obtained by accumulating the decision values J of the video frames within 2 seconds. A value in the score range (0, fps×2−1) may be set as the action-completion threshold, and collection may be considered complete when score is greater than or equal to this threshold.
Compared with the prior art, the invention has the following beneficial effects:
The algorithm modules provided by the invention can be used in an unattended self-service nucleic acid sampling booth and, together with a monocular high-definition camera, can determine whether the actions in the self-service sampling process of the person being tested have been completed. Compared with the mechanical-arm nucleic acid collection scheme, the psychological pressure on the person being tested is minimal, the one-pass collection success rate and collection efficiency are higher, and the construction cost is lower.
The invention integrates four algorithm modules (the collection-site indication algorithm, the swab-head recognition algorithm, the contact detection algorithm and the action completion evaluation algorithm) into one video processing flow, and this single flow completes, within one algorithm model, the three tasks required by the nucleic acid collection process: collection-site indication, pharyngeal swab-head recognition and indication, and action completion detection and evaluation.
Through hidden-layer sharing and result reuse within the model, the computing-platform resources required by the algorithm modules are saved to the greatest extent. The inference process of the algorithm modules can run on low-end CPUs such as the Intel i3-12100F; since there is no special CPU requirement, algorithm deployment cost is greatly reduced, and the modules can also run on mobile low-power platforms such as Android.
Compared with the common UNet model, by introducing a convolution kernel and padding of specific sizes, the Conv2d calculation does not change the size of the input feature map, which reduces the amount of computation and makes the model more general. It is suitable not only for nucleic acid testing scenarios but also for other self-service sample collection scenarios.
Drawings
FIG. 1 is a general flow chart of an artificial intelligence based biological specimen collection validity test method of the present invention;
FIG. 2 is a flowchart illustrating steps of an acquisition location indication algorithm according to an embodiment of the present invention;
FIG. 3 is a flow chart of a touch detection algorithm constructed in accordance with an embodiment of the present invention.
Detailed Description
The invention will be further described with reference to examples and drawings, to which reference is made, but which are not intended to limit the scope of the invention.
As shown in fig. 1, the present embodiment provides an artificial intelligence based biological sample collection validity test method, which includes the steps of: s1, acquiring an oral cavity image and a pharyngeal swab acquisition process image of a user through a high-definition video camera; s2, preprocessing the image to obtain a video frame; s3, calculating polygon vertex coordinates of a pharyngeal collecting area through a collecting part indication algorithm; s4, receiving an input video frame through a swab head identification algorithm, outputting polygon vertex coordinate information of a pharyngeal swab head in the video frame, and identifying a pharyngeal swab head part; s5, comprehensively evaluating the contact degree and the contact position by combining an original image through a swab head and detection position contact detection algorithm, and giving corresponding scores; s6, detecting and judging a plurality of continuous video frames of the video stream through an action completion evaluation algorithm. And when the acquisition action of the video stream record reaches the algorithm appointed condition, judging a classification result of the completion of the acquisition action.
The acquisition part indication algorithm is optimized on the basis of a UNet image segmentation algorithm, and the definition of the UNet network structure function of the UNet image segmentation algorithm comprises the following steps:
S301, defining a Conv2d two-dimensional image convolution function, the calculation formula being:
output(N_i, C_out_j) = bias(C_out_j) + Σ_{k=0}^{C_in−1} weight(C_out_j, k) ⋆ padding(input(N_i, k), P)
wherein:
the ⋆ symbol is the cross-correlation operator, implemented here using a matrix convolution operation;
N is the number of image frames input in one calculation, i.e., the batch-size value;
C_in is the number of channels of the input image and C_out is the number of channels of the output image; for the original RGB image C = 3, and for an intermediate-layer feature map it is the actual number of feature-map channels;
input represents the input image four-dimensional tensor, whose shape is a four-tuple (image width in pixels, image height in pixels, number of image channels, batch-size value).
output represents the output image four-dimensional tensor, whose shape is a four-tuple (image width in pixels, image height in pixels, number of image channels, batch-size value).
The padding function expands the input four-dimensional image tensor by P pixel values.
weight represents the convolution kernel weight parameter, a four-dimensional tensor whose shape is a four-tuple (convolution kernel width, convolution kernel height, number of output image channels, batch-size value). The Conv2d operation is specified to use a fixed 3×3 convolution kernel, and the stride S is fixed to 1.
bias represents the bias weight tensor of this Conv2d calculation, and its shape is consistent with weight.
The input-output image size of Conv2d is calculated by the following formula:
W_out = (W_in − ks + 2P) / S + 1
H_out = (H_in − ks + 2P) / S + 1
wherein:
W_in is the input image size;
W_out is the output image size;
ks is the convolution kernel size;
P and S are consistent with the Conv2d operation definition;
in the definition of the Conv2d operation we take P = 1, S = 1 and ks = 3, so the output image size of the Conv2d operation is the same as the input: W_out = W_in, H_out = H_in.
S302, defining a classification function, which uses a fixed 1×1 convolution kernel and is otherwise completely consistent with Conv2d; it is denoted Class2d;
s303, defining a Maxpool function:
the max operation applies a 2×2 kernel with stride 2 and takes the element-wise maximum, giving the following input-output size relation:
W_out = W_in / 2
H_out = H_in / 2
s304, defining a Relu function as an activation function of a conv2d function:
output(N, k, W, H) = relu(input(N, k, W, H)) = max(0, input(N, k, W, H))
S305, defining an upsampling function upsampled:
wherein weight is the upsampling weight tensor.
S306, defining a channel feature map stacking function:
wherein C_out = C_in1 + C_in2; the + operator simply stacks, along the channel dimension, two feature maps of the same spatial size.
S307, defining a feature map residual network action judging function resnet:
firstly, defining the residual network feature extraction function of a certain layer l:
output_{l+1} = input_l + F(input_l, weight_l)
where input and output are the input and output tensors, whose size is determined by (N, C, W, H).
The F calculation consists of a single sequential combination of an arithmetic normalization of each element of the tensor, a Conv2d calculation and a Relu calculation.
Continuing to define the residual network feature extraction function for any layer L starting from layer 1:
output_L = input_1 + Σ_{l=1}^{L−1} F(input_l, weight_l)
s308 defines a full link layer fc and a softmax layer function fc_softmax,
the softmax function is:
softmax(x_i) = e^{x_i} / Σ_j e^{x_j}
the fc_softmax function is:
output_o = fc_softmax(input_i) = softmax(weight_{i,o} × input_i + bias_{i,o})
where weight is the linear weight matrix and bias is the bias matrix.
To pass through the fully connected layer, the tensor of size (N, C, W, H) is first flattened into a batch of one-dimensional tensors of size (N, C×W×H); these are then calculated with the weight matrix and the bias matrix to obtain the final probability values of meeting and of not meeting the collection-action completion criterion.
As shown in fig. 2, on the basis of the UNet image segmentation algorithm, the steps of constructing an acquisition part indication algorithm include:
S3001, a 1920×1080 YUV video frame input_001 is input through the video acquisition device;
S3002, preprocessing the image: the color channels of input_001 are converted to BGR channels, and
the image is center-cropped to obtain a 512×512×3 tensor output_002, wherein the first two 512 values represent the width and height of the image, and the third value 3 represents the blue, green and red color component channels.
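By way of illustration, the preprocessing of S3001-S3002 could look as follows with OpenCV; the exact YUV layout (NV12 here) and the crop placement are assumptions not fixed by the description:

```python
import cv2
import numpy as np

def preprocess(yuv_frame: np.ndarray) -> np.ndarray:
    # Convert the 1920x1080 YUV frame (NV12 layout assumed) to BGR and center-crop
    # a 512x512 patch, giving the 512x512x3 tensor output_002.
    bgr = cv2.cvtColor(yuv_frame, cv2.COLOR_YUV2BGR_NV12)
    h, w = bgr.shape[:2]
    top, left = (h - 512) // 2, (w - 512) // 2
    return bgr[top:top + 512, left:left + 512]
```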
S3003, setting the input video frame of each step as input_n = relu(output_{n−1}) or input_n = output_{n−1}, and performing Conv2d, Maxpool, upsampled and channel-cat calculations until the 64-channel feature map output_036 of the collection-site indication area is obtained; the specific steps are as follows:
set input 003 =output 002 And perform Conv2d operation
output 003 (N=8,C out =64,W=512,H=512)=Conv2d(input 003 ,N=8,C in =3,C out =64,W=512,H=512,P=1)
Set input 004 =relu(output 003 ) And perform Conv2d operation
output 004 (N=8,C out =64,W=512,H=512)=Conv2d(input 004 ,N=8,C in =64,C out =64,W=512,H=512,P=1)
Set input 005 =relu(output 004 ) And performing Maxpool operation
output_005(N=8,C=64,W=256,H=256)=maxpool(input_005,N=8,C=64,W=512,H=512)
Set input 006 =output 005 And perform Conv2d operation
output 006 (N=8,C out =128,W=256,H=256)=Conv2d(input 006 ,N=8,C in =64,C out =128,W=256,H=256,P=1)
Set input 007 =relu(output 006 ) And perform Conv2d operation
output 007 (N=8,C out =128,W=256,H=256)=Conv2d(input 007 ,N=8,C in =128,C out =128,W=256,H=256,P=1)
Set input 008 =relu(output 007 ) And performing Maxpool operation
output 008 (N=8,C=128,W=128,H=128)=maxpool(input 008 ,N=8,C=128,W=256,H=256)
Set input 009 =output 008 And perform Conv2d operation
output 009 (N=8,C out =256,W=128,H=128)=Conv2d(input 009 ,N=8,C in =128,C out =256,W=128,H=128,P=1)
Set input 010 =relu(output 009 ) And perform Conv2d operation
output 010 (N=8,C out =256,W=128,H=128)=Conv2d(input 010 ,N=8,C in =256,C out =256,W=128,H=128,P=1)
Set input 011 =relu(output 010 ) And performing Maxpool operation
output 011 (N=8,C=256,W=64,H=64)=maxpool(input 011 ,N=8,C=256,W=128,H=128)
Set input 012 =output 011 And perform Conv2d operation
output 012 (N=8,C out =512,W=64,H=64)=Conv2d(input 012 ,N=8,C in =256,C out =512,W=64,H=64,P=1)
Set input 013 =relu(output 012 ) And perform Conv2d operation
output 013 (N=8,C out =512,W=64,H=64)=Conv2d(input 013 ,N=8,C in =512,C out =512,W=64,H=64,P=1)
Set input 014 =relu(output 013 ) And performing Maxpool operation
output 014 (N=8,C=512,W=32,H=32)=maxpool(input 014 ,N=8,C=512,W=64,H=64)
Set input 015 =output 014 And perform Conv2d operation
output 015 (N=8,C out =1024,W=32,H=32)=Conv2d(input 015 ,N=8,C in =512,C out =1024,W=32,H=32,P=1)
Set input 016 =relu(output 015 ) And perform Conv2d operation
output 016 (N=8,C out =1024,W=32,H=32)=Conv2d(input 016 ,N=8,C in =1024,C out =512,W=32,H=32,P=1)
Set input 017 =relu(output 016 ) And perform the upsampled operation
output 017 (N=8,C=1024,W=64,H=64)=upsampled(input 017 ,N=8,C=1024,W=32,H=32)
Set input 018 =output 017 And perform Conv2d operation
output 018 (N=8,C out =512,W=64,H=64)=Conv2d(input 018 ,N=8,C in =1024,C out =512,W=64,H=64,P=1)
Set input 019 =relu(output 018 ) And perform a channel cat operation
Set input 020 =output 019 And perform Conv2d operation
output 020 (N=8,C out =512,W=64,H=64)=Conv2d(input 020 ,N=8,C in =1024,C out =512,W=64,H=64,P=1)
Set input 021 =relu(output 020 ) And perform Conv2d operation
output 021 (N=8,C out =512,W=64,H=64)=Conv2d(input 021 ,N=8,C in =512,C out =512,W=64,H=64,P=1)
Set input 022 =relu(output 021 ) And perform the upsampled operation
output o22 (N=8,C=512,W=128,H=128)=upsampled(input 022 ,N=8,C=512,W=64,H=64)
Set input 023 =output 022 And perform Conv2d operation
output 023 (N=8,C out =256,W=128,H=128)=Conv2d(input 023 ,N=8,C in =512,C out =256,W=128,H=128,P=1)
Set input 024 =relu(output 023 ) And perform a channel cat operation
Set input 025 =output 024 And perform Conv2d operation
output 025 (N=8,C out =256,W=128,H=128)=Conv2d(input 025 ,N=8,C in =512,C out =256,W=128,H=128,P=1)
Set input 026 =relu(output 025 ) And perform Conv2d operation
output 026 (N=8,C out =256,W=128,H=128)=Conv2d(input 026 ,N=8,C in =256,C out =256,W=128,H=128,P=1)
Set input 027 =relu(output 026 ) And perform the upsampled operation
output 027 (N=8,C=256,W=256,H=256)=upsampled(input 027 ,N=8,C=256,W=128,H=128)
Set input 028 =output 027 And perform Conv2d operation
output 028 (N=8,C out =128,W=256,H=256)=Conv2d(input 028 ,N=8,C in =256,C out =128,W=256,H=256,P=1)
Set input 029 =relu(output 028 ) And perform a channel cat operation
Set input 030 =output 029 And perform Conv2d operation
output 030 (N=8,C out =128,W=256,H=256)=Conv2d(input 030 ,N=8,C in =256,C out =128,W=256,H=256,P=1)
Set input 031 =relu(output 030 ) And perform Conv2d operation
output 031 (N=8,C out =128,W=256,H=256)=Conv2d(input 031 ,N=8,C in =128,C out =128,W=256,H=256,P=1)
Set input 032 =relu(output 031 ) And perform the upsampled operation
output 032 (N=8,C=128,W=512,H=512)=upsampled(input 032 ,N=8,C=128,W=256,H=256)
Set input 033 =output 032 And perform Conv2d operation
output 033 (N=8,C out =64,W=512,H=512)=Conv2d(input 033 ,N=8,C in =128,C out =64,W=512,H=512,P=1)
Set input 034 =relu(output 033 ) And perform a channel cat operation
Set input 035 =output 034 And perform Conv2d operation
output 035 (N=8,C out =64,W=512,H=512)=Conv2d(input 035 ,N=8,C in =128,C out =64,W=512,H=512,P=1)
Setting upinput 036 =relu(output 035 ) And perform Conv2d operation
output 036 (N=8,C out =64,W=512,H=512)=Conv2d(input 036 ,N=8,C in =64,C out =64,W=512,H=512,P=1)
This time, output 036 Namely a 64-channel characteristic map of the acquisition part indication area.
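A compact sketch of the encoder-decoder branch that produces output_036, assuming PyTorch; the channel counts follow steps output_003-output_036 above, while the pairing of the channel-cat skip connections and the nearest-neighbour form of the upsampled operation are assumptions, since they are not reproduced explicitly in the step list:

```python
import torch
import torch.nn as nn

def double_conv(c_in: int, c_out: int) -> nn.Sequential:
    # Two fixed 3x3 / stride-1 / padding-1 convolutions with Relu (spatial size unchanged).
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, 1, 1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, 1, 1), nn.ReLU(inplace=True))

class UNetBranch(nn.Module):
    """Sketch of the branch turning a 512x512x3 frame into the 64-channel map output_036."""
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = double_conv(3, 64), double_conv(64, 128)
        self.enc3, self.enc4 = double_conv(128, 256), double_conv(256, 512)
        self.bottom = double_conv(512, 1024)
        self.pool = nn.MaxPool2d(2, 2)
        self.up = nn.Upsample(scale_factor=2, mode="nearest")   # assumed form of "upsampled"
        self.red4, self.dec4 = nn.Conv2d(1024, 512, 3, 1, 1), double_conv(1024, 512)
        self.red3, self.dec3 = nn.Conv2d(512, 256, 3, 1, 1), double_conv(512, 256)
        self.red2, self.dec2 = nn.Conv2d(256, 128, 3, 1, 1), double_conv(256, 128)
        self.red1, self.dec1 = nn.Conv2d(128, 64, 3, 1, 1), double_conv(128, 64)

    def forward(self, x: torch.Tensor) -> torch.Tensor:          # x: (N, 3, 512, 512)
        e1 = self.enc1(x)                                         # 64  x 512 x 512
        e2 = self.enc2(self.pool(e1))                             # 128 x 256 x 256
        e3 = self.enc3(self.pool(e2))                             # 256 x 128 x 128
        e4 = self.enc4(self.pool(e3))                             # 512 x 64  x 64
        b  = self.bottom(self.pool(e4))                           # 1024 x 32 x 32
        d4 = self.dec4(torch.cat([torch.relu(self.red4(self.up(b))),  e4], dim=1))
        d3 = self.dec3(torch.cat([torch.relu(self.red3(self.up(d4))), e3], dim=1))
        d2 = self.dec2(torch.cat([torch.relu(self.red2(self.up(d3))), e2], dim=1))
        return self.dec1(torch.cat([torch.relu(self.red1(self.up(d2))), e1], dim=1))  # cf. output_036
```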
S3004, repeating step S3003 to obtain the 64-channel feature map of the swab-head identification area, denoted output′_036;
S3005, setting input_037 = relu(output_036) and performing a Class2d operation:
output_037(N=8, C_out=2, W=512, H=512) = Class2d(input_037, N=8, C_in=64, C_out=2, W=512, H=512, P=1)
Through the above steps, the pixel classification information of the collection-site indication area and of the swab-head identification area is obtained respectively; one class is the background pixels and the other class is the collection-site or swab-head pixels. The two classifications are randomly initialized when the model first runs, and the accuracy of the classification information is gradually improved through the training steps below.
Cross-entropy evaluation is performed on the pixel classification information of step S3005 through the training loss function, and an inverse gradient calculation is performed; the training loss function is defined using the cross-entropy function, whose formula is:
H(p, q) = −Σ_x p(x) log q(x)
The two-class calculation result output_037 is evaluated with the cross entropy against the two-class target labels of the annotated input video, and the inverse gradient calculation is performed. The weight and bias values in the Conv2d, Class2d, upsampled and other functions are updated so that H(p, q) tends to a minimum, yielding the collection-site indication and swab-head recognition algorithm model. The collection-site and swab-head pixels indicated by the collection-site indication and swab-head recognition algorithms are blended with the original pixel values in different colors, so that the collection site and the swab head are marked in the video frame and the person being tested is guided through the collection operation.
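A hedged sketch of the color-blending used to mark the indicated pixels in the video frame (NumPy assumed; the color and blending factor are illustrative choices only):

```python
import numpy as np

def overlay_mask(frame_bgr: np.ndarray, mask: np.ndarray,
                 color: tuple = (0, 255, 0), alpha: float = 0.4) -> np.ndarray:
    # Blend `color` into the pixels where `mask` marks the collection site or swab head,
    # so the region is visibly highlighted in the video frame.
    out = frame_bgr.copy().astype(np.float32)
    sel = mask.astype(bool)                     # (H, W) per-pixel class taken from output_037
    out[sel] = (1 - alpha) * out[sel] + alpha * np.array(color, dtype=np.float32)
    return out.astype(np.uint8)
```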
As shown in fig. 3, the construction of the contact detection algorithm based on the result of step S3003 is continued, including the steps of:
S5001, set input_038 = relu(output_036) and input′_038 = relu(output′_036), and perform a channel cat operation;
S5002, set input_039 = output_038 and perform a Conv2d operation:
output_039(N=8, C_out=64, W=512, H=512) = Conv2d(input_039, N=8, C_in=128, C_out=64, W=512, H=512, P=1)
S5003, set input_040 = relu(output_039) and perform a Conv2d operation:
output_040(N=8, C_out=64, W=512, H=512) = Conv2d(input_040, N=8, C_in=64, C_out=64, W=512, H=512, P=1)
S5004, set input_041 = relu(output_040) and perform a resnet_8 calculation with network depth 8:
output_041(N=8, C_out=64, W=512, H=512) = resnet_8(input_041, N=8, C_in=64, W=512, H=512, P=1)
S5005, set input_042 = output_041 and perform the fc_softmax feature probability discrimination, the discrimination results being a contact class and a non-contact class:
output_042(N=8, o=2) = fc_softmax(input_042(N=8, i=64))
S5006, through steps S5001-S5005, the contact probability index of the pharyngeal swab with the pharyngeal collection site in the current video frame is obtained, with a value in the range (0, 1).
As in step S3005, training over multiple samples is performed using the cross-entropy function as the loss function. After step S5005 is completed, cross-entropy evaluation is performed between the two-class calculation result output_042 and the contact/non-contact two-class target labels of the annotated input pictures, and an inverse gradient calculation is performed to update the weight and bias tensors in resnet and fc, so that the final cross-entropy output tends to 0.
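Steps S5001-S5005 can be sketched as the following PyTorch module (assumed framework); the global average pooling that reduces the map to the 64 features i=64 before fc_softmax, and the BatchNorm inside the residual block, are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    # output_{l+1} = input_l + F(input_l, weight_l); normalization choice assumed.
    def __init__(self, c: int):
        super().__init__()
        self.norm = nn.BatchNorm2d(c)
        self.conv = nn.Conv2d(c, c, kernel_size=3, stride=1, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + torch.relu(self.conv(self.norm(x)))

class ContactHead(nn.Module):
    # Channel-cat the two 64-channel maps (S5001), two 3x3 convolutions (S5002-S5003),
    # eight residual blocks resnet_8 (S5004), then fc_softmax over contact / no contact (S5005).
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(128, 64, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1)
        self.res = nn.Sequential(*[ResidualBlock(64) for _ in range(8)])
        self.fc = nn.Linear(64, 2)

    def forward(self, site_feat: torch.Tensor, swab_feat: torch.Tensor) -> torch.Tensor:
        x = torch.cat([torch.relu(site_feat), torch.relu(swab_feat)], dim=1)  # S5001
        x = torch.relu(self.conv1(x))                                         # cf. output_039
        x = self.res(torch.relu(self.conv2(x)))                               # cf. output_040 and resnet_8
        x = F.adaptive_avg_pool2d(x, 1).flatten(1)                            # (N, 64), assumed pooling
        return F.softmax(self.fc(x), dim=1)                                   # contact probability
```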
After the contact detection algorithm provides the contact probability value, an action completion evaluation algorithm is finally defined to make the final determination of whether the sampling action has been completed:
defining a decision function:
defining a scoring function:
wherein fps is the current video-stream frame rate and cp is the confidence probability; the current collection-action completion score is obtained by accumulating the decision values J of the video frames within 2 seconds. A value in the score range (0, fps×2−1) may be set as the action-completion threshold, and collection may be considered complete when score is greater than or equal to this threshold.
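A hedged sketch of the decision value J and the accumulated score (the 0.5 per-frame confidence cutoff is an assumption, since the decision-function formula is not reproduced above):

```python
from typing import Sequence

def decision_j(cp: float, cp_threshold: float = 0.5) -> int:
    # Per-frame decision J derived from the contact probability cp.
    return 1 if cp >= cp_threshold else 0

def collection_completed(contact_probs: Sequence[float], fps: int, score_threshold: int) -> bool:
    # Accumulate J over the frames of the last 2 seconds and compare the score with a
    # threshold chosen from the range (0, fps*2 - 1).
    recent = list(contact_probs)[-fps * 2:]
    score = sum(decision_j(cp) for cp in recent)
    return score >= score_threshold
```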
Compared with the prior art, the invention has the following beneficial effects:
The algorithm modules provided by the invention can be used in an unattended self-service nucleic acid sampling booth and, together with a monocular high-definition camera, can determine whether the actions in the self-service sampling process of the person being tested have been completed. Compared with the mechanical-arm nucleic acid collection scheme, the psychological pressure on the person being tested is minimal, the one-pass collection success rate and collection efficiency are higher, and the construction cost is lower.
The invention integrates four algorithm modules (the collection-site indication algorithm, the swab-head recognition algorithm, the contact detection algorithm and the action completion evaluation algorithm) into one video processing flow, and this single flow completes, within one algorithm model, the three tasks required by the nucleic acid collection process: collection-site indication, pharyngeal swab-head recognition and indication, and action completion detection and evaluation.
Through hidden-layer sharing and result reuse within the model, the computing-platform resources required by the algorithm modules are saved to the greatest extent. The inference process of the algorithm modules can run on low-end CPUs such as the Intel i3-12100F; since there is no special CPU requirement, algorithm deployment cost is greatly reduced, and the modules can also run on mobile low-power platforms such as Android.
Compared with the common UNet model, by introducing a convolution kernel and padding of specific sizes, the Conv2d calculation does not change the size of the input feature map, which reduces the amount of computation and makes the model more general. It is suitable not only for nucleic acid testing scenarios but also for other self-service sample collection scenarios.
The artificial intelligence-based biological sample collection validity test method provided by the present application has been described in detail above. The description of the specific embodiments is only intended to facilitate an understanding of the method of the present application and its core ideas. It should be noted that various improvements and modifications can be made to the present application by those skilled in the art without departing from the principles of the present application, and such improvements and modifications fall within the scope of the claims of the present application.

Claims (4)

1. An artificial intelligence-based biological sample collection validity test method is characterized by comprising the following steps:
s1, acquiring an oral cavity image and a pharyngeal swab acquisition process image of a user through a high-definition video camera;
s2, preprocessing the image to obtain a video frame;
s3, calculating polygon vertex coordinates of a pharyngeal collecting area through a collecting part indication algorithm;
s4, receiving an input video frame through a swab head identification algorithm, outputting polygon vertex coordinate information of a pharyngeal swab head in the video frame, and identifying a pharyngeal swab head part;
s5, comprehensively evaluating the contact degree and the contact position by combining an original image through a swab head and detection position contact detection algorithm, and giving corresponding scores;
s6, detecting and judging a plurality of continuous video frames of the video stream through an action completion evaluation algorithm;
the acquisition part indication algorithm is a UNet image segmentation algorithm, and the definition of a Unet network structure function of the UNet image segmentation algorithm comprises the following steps:
s301, defining a Conv2d two-dimensional graph convolution function, wherein the calculation formula is as follows:
wherein: the ⋆ symbol is the cross-correlation operator, N is the number of image frames input in one calculation,
C_in is the number of channels of the input image, C_out is the number of channels of the output image,
input represents the input image four-dimensional tensor, whose shape is a four-tuple,
output represents the output image four-dimensional tensor,
the padding function expands the input four-dimensional image tensor by P pixel values,
weight represents the convolution kernel weight parameter, a four-dimensional tensor whose shape is a four-tuple,
bias represents the bias weight tensor of this Conv2d calculation, and its shape is consistent with weight;
s302, defining a classification function, which uses a fixed 1×1 convolution kernel and is otherwise completely consistent with Conv2d, namely Class2d;
s303, defining a Maxpool function:
the max operation applies a 2×2 kernel with stride 2 and takes the element-wise maximum, giving the following input-output size relation:
W_out = W_in / 2
H_out = H_in / 2
s304, defining a Relu function as an activation function of a conv2d function:
output(N,k,W,H)=relu(input(N,k,W,H))=max(0,input(N,k,W,H))
s305, defining an up sampling function of the sampled:
wherein weight is the upsampling weight tensor;
s306, defining a channel feature map stacking function:
wherein C_out = C_in1 + C_in2; the + operator simply stacks, along the channel dimension, two feature maps of the same spatial size;
s307, defining a feature map residual network action judging function resnet:
defining the residual network feature extraction function of a certain layer l:
output_{l+1} = input_l + F(input_l, weight_l)
where input and output are the input and output tensors, whose size is determined by (N, C, W, H),
the F calculation comprises a single sequential combination of an arithmetic normalization of each element of the tensor, a Conv2d calculation and a Relu calculation,
continuing to define the residual network feature extraction function of any layer L starting from layer 1:
s308 defines a full link layer fc and a softmax layer function fc_softmax,
the softmax function is:
the fc_softmax function is:
output_o = fc_softmax(input_i) = softmax(weight_{i,o} × input_i + bias_{i,o})
where weight is the linear weight matrix and bias is the bias matrix;
to pass through the fully connected layer, the tensor of size (N, C, W, H) is flattened into a batch of one-dimensional tensors of size (N, C×W×H), which are then calculated with the weight matrix and the bias matrix to obtain the final probability values of meeting and of not meeting the collection-action completion criterion;
the step of constructing the acquisition part indication algorithm comprises the following steps:
S3001, inputting a 1920×1080 YUV video frame input_001 through a video acquisition device;
S3002, performing image preprocessing: converting input_001 to BGR channels and center-cropping the image to obtain a 512×512×3 tensor output_002;
S3003, setting the input video frame of each step as input_n = relu(output_{n−1}) or input_n = output_{n−1}, and performing Conv2d, Maxpool, upsampled and channel-cat calculations until the 64-channel feature map output_036 of the collection-site indication area is obtained;
S3004, repeating step S3003 to obtain the 64-channel feature map of the swab-head identification area, denoted output′_036;
S3005, setting input_037 = relu(output_036) and performing a Class2d operation to obtain the pixel classification information of the collection-site indication area and of the swab-head identification area respectively, wherein one class is the background pixels and the other class is the collection-site or swab-head pixels.
2. The biological sample collection validity checking method based on artificial intelligence according to claim 1, wherein cross-entropy evaluation is performed on the pixel classification information of step S3005 through the training loss function and an inverse gradient calculation is performed;
the cross-entropy function formula is:
the two-class calculation result output_037 is evaluated with the cross entropy against the two-class target labels of the annotated input video, and the inverse gradient calculation is performed.
3. The artificial intelligence based biological specimen collection validity test method of claim 2, wherein the constructing of the contact detection algorithm based on the result of step S3003 includes the steps of:
S5001, setting input_038 = relu(output_036) and input′_038 = relu(output′_036), and carrying out a channel cat operation;
S5002, setting input_039 = output_038 and performing a Conv2d operation;
output_039(N=8, C_out=64, W=512, H=512) = Conv2d(input_039, N=8, C_in=128, C_out=64, W=512, H=512, P=1)
S5003, setting input_040 = relu(output_039) and performing a Conv2d operation;
output_040(N=8, C_out=64, W=512, H=512) = Conv2d(input_040, N=8, C_in=64, C_out=64, W=512, H=512, P=1)
S5004, setting input_041 = relu(output_040) and performing a resnet_8 calculation with network depth 8;
output_041(N=8, C_out=64, W=512, H=512) = resnet_8(input_041, N=8, C_in=64, W=512, H=512, P=1)
S5005, setting input_042 = output_041 and performing the fc_softmax feature probability discrimination, the discrimination result being a contact/non-contact classification result;
output_042(N=8, o=2) = fc_softmax(input_042(N=8, i=64))
S5006, obtaining, through steps S5001-S5005, the contact probability index of the pharyngeal swab with the pharyngeal collection site in the current video frame, with a value in the range (0, 1).
4. The artificial intelligence based biological sample collection validity test method of claim 3, wherein, after the contact detection algorithm provides the contact probability value, an action completion evaluation algorithm is finally defined to make the final determination of whether the sampling action has been completed;
defining a decision function:
defining a scoring function:
wherein fps is the current video-stream frame rate and cp is the confidence probability, and the current collection-action completion score is obtained by accumulating the decision values J of the video frames within 2 seconds.
CN202310121257.1A 2023-02-16 2023-02-16 Biological sample collection validity checking method based on artificial intelligence Active CN116109982B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310121257.1A CN116109982B (en) 2023-02-16 2023-02-16 Biological sample collection validity checking method based on artificial intelligence


Publications (2)

Publication Number Publication Date
CN116109982A CN116109982A (en) 2023-05-12
CN116109982B true CN116109982B (en) 2023-07-28

Family

ID=86261361

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310121257.1A Active CN116109982B (en) 2023-02-16 2023-02-16 Biological sample collection validity checking method based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN116109982B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113378804A (en) * 2021-08-12 2021-09-10 中国科学院深圳先进技术研究院 Self-service sampling detection method and device, terminal equipment and storage medium
WO2022123069A1 (en) * 2020-12-11 2022-06-16 Sensyne Health Group Limited Image classification of diagnostic tests
CN114841990A (en) * 2022-05-26 2022-08-02 长沙云江智科信息技术有限公司 Self-service nucleic acid collection method and device based on artificial intelligence

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160151052A1 (en) * 2014-11-26 2016-06-02 Theranos, Inc. Methods and systems for hybrid oversight of sample collection
WO2019145951A1 (en) * 2018-01-23 2019-08-01 Mobileodt Ltd. Automated monitoring of medical imaging procedures
WO2019239414A1 (en) * 2018-06-13 2019-12-19 Mobileodt, Ltd. Automated detection in cervical imaging
EP3837525A4 (en) * 2018-08-16 2023-03-08 Essenlix Corporation Image-based assay using intelligent monitoring structures
US20230260627A1 (en) * 2020-05-06 2023-08-17 Tyto Care Ltd. A remote medical examination system and method
CN111643125A (en) * 2020-06-23 2020-09-11 桂林医学院 System for automatically collecting nucleic acid sample and collection method thereof
CN111803139A (en) * 2020-08-26 2020-10-23 深圳智慧林网络科技有限公司 Self-service type pharyngeal test seed nucleic acid sampling device and self-service nucleic acid sampling method
CN113143342A (en) * 2021-03-25 2021-07-23 香港中文大学(深圳) Method for determining oral sampling site, sampling robot and computer storage medium
CN114998230A (en) * 2022-05-23 2022-09-02 肇庆学院 Pharynx swab oral cavity nucleic acid sampling area image identification method
CN115100388A (en) * 2022-06-16 2022-09-23 新石器慧通(北京)科技有限公司 Self-service nucleic acid sampling system and method
CN115089223A (en) * 2022-06-16 2022-09-23 国研软件股份有限公司 Throat swab collecting cotton swab and detection method
CN114926772B (en) * 2022-07-14 2022-10-21 河南科技学院 Method for tracking and predicting trajectory of throat swab head
CN114916964B (en) * 2022-07-14 2022-11-04 河南科技学院 Pharynx swab sampling effectiveness detection method and self-service pharynx swab sampling method
CN115337044B (en) * 2022-07-18 2023-06-09 深圳市安保数字感控科技有限公司 Nucleic acid sampling monitoring method, device, system and computer readable storage medium
CN115478004A (en) * 2022-07-25 2022-12-16 上海柯钒智能设备有限公司 Self-service nucleic acid detection device and self-service nucleic acid detection method
CN115229793A (en) * 2022-07-26 2022-10-25 闻泰通讯股份有限公司 Sampling method and device, equipment and storage medium
CN115424296A (en) * 2022-08-08 2022-12-02 山东浪潮超高清智能科技有限公司 Human tonsil region detection system based on target detection and side deployment method
CN115457422A (en) * 2022-08-12 2022-12-09 北京声智科技有限公司 Sampling process verification method and device and augmented reality function glasses
CN115439929A (en) * 2022-08-17 2022-12-06 每日互动股份有限公司 Nasal swab collection action determination method and storage medium in antigen detection process
CN115414072A (en) * 2022-08-31 2022-12-02 美的集团(上海)有限公司 Pharynx swab sampling method and device, sampling equipment and computer program product


Also Published As

Publication number Publication date
CN116109982A (en) 2023-05-12


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant