CN116777295B - Medicine traceability system and method based on data intelligence - Google Patents

Medicine traceability system and method based on data intelligence

Info

Publication number
CN116777295B
Authority
CN
China
Prior art keywords
medical
production process
data
medicine
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310815511.8A
Other languages
Chinese (zh)
Other versions
CN116777295A (en)
Inventor
董锴
孙娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhimi Pharmaceutical Technology Co ltd
Original Assignee
Shanghai Zhimi Pharmaceutical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Zhimi Pharmaceutical Technology Co ltd filed Critical Shanghai Zhimi Pharmaceutical Technology Co ltd
Priority to CN202310815511.8A priority Critical patent/CN116777295B/en
Publication of CN116777295A publication Critical patent/CN116777295A/en
Application granted granted Critical
Publication of CN116777295B publication Critical patent/CN116777295B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a medicine traceability system and method based on data intelligence. Medical data are acquired using a blockchain network; natural language processing based on a deep convolutional neural network is performed on the medical data to obtain a medical verification feature matrix; and whether a problem exists in the medicine production process is determined based on the medical verification feature matrix. In this way, deep-learning-based natural language processing and blockchain technology together enable comprehensive tracking and supervision of the medicine production process, improving the quality and safety of medicine production.

Description

Medicine traceability system and method based on data intelligence
Technical Field
The invention relates to the technical field of intelligent traceability, in particular to a medicine traceability system and a medicine traceability method based on data intelligence.
Background
Medicine production is a complex process that involves many links and multiple parties, and ensuring the quality and safety of medicines is an important problem.
Traditional medicine tracing methods rely on manual recording and checking and suffer from low efficiency, high error rates, and unreliable data. A better solution is therefore desired.
Disclosure of Invention
An embodiment of the invention provides a medicine traceability system and method based on data intelligence. Medical data are acquired using a blockchain network; natural language processing based on a deep convolutional neural network is performed on the medical data to obtain a medical verification feature matrix; and whether a problem exists in the medicine production process is determined based on the medical verification feature matrix. In this way, deep-learning-based natural language processing and blockchain technology together enable comprehensive tracking and supervision of the medicine production process, improving the quality and safety of medicine production.
An embodiment of the invention also provides a medicine traceability system based on data intelligence, which comprises:
a medical data acquisition module, configured to acquire medical data using a blockchain network;
a natural language processing module, configured to perform natural language processing based on a deep convolutional neural network on the medical data to obtain a medical verification feature matrix; and
a traceability result generation module, configured to determine, based on the medical verification feature matrix, whether a problem exists in the medicine production process.
An embodiment of the invention also provides a medicine tracing method based on data intelligence, which comprises the following steps:
acquiring medical data using a blockchain network;
performing natural language processing based on a deep convolutional neural network on the medical data to obtain a medical verification feature matrix; and
determining, based on the medical verification feature matrix, whether a problem exists in the medicine production process.
Drawings
To illustrate the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings needed to describe them are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art may obtain other drawings from them without inventive effort. In the drawings:
Fig. 1 is a block diagram of a medicine traceability system based on data intelligence according to an embodiment of the present invention.
Fig. 2 is a block diagram of the natural language processing module in the medical traceability system based on data intelligence according to the embodiment of the present invention.
Fig. 3 is a flowchart of a medicine tracing method based on data intelligence provided in an embodiment of the present invention.
Fig. 4 is a schematic diagram of a system architecture of a medicine tracing method based on data intelligence according to an embodiment of the present invention.
Fig. 5 is an application scenario diagram of a medical traceability system based on data intelligence provided in an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings. The exemplary embodiments of the present invention and their descriptions herein are for the purpose of explaining the present invention, but are not to be construed as limiting the invention.
Unless defined otherwise, all technical and scientific terms used in the embodiments of the application have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present application.
In describing embodiments of the present application, unless otherwise indicated and limited thereto, the term "connected" should be construed broadly, for example, it may be an electrical connection, or may be a communication between two elements, or may be a direct connection, or may be an indirect connection via an intermediate medium, and it will be understood by those skilled in the art that the specific meaning of the term may be interpreted according to circumstances.
It should be noted that the terms "first\second\third" in the embodiments of the present application merely distinguish similar objects and do not denote a specific order. Where permitted, the objects distinguished by "first\second\third" may be interchanged, so that the embodiments described herein can be practiced in sequences other than those illustrated or described.
In one embodiment of the present invention, fig. 1 is a block diagram of a medical traceability system based on data intelligence according to an embodiment of the present invention. As shown in fig. 1, a medical traceability system 100 based on data intelligence according to an embodiment of the present invention includes: a medical data acquisition module 110 for acquiring medical data using a blockchain network; the natural language processing module 120 is configured to perform natural language processing on the medical data based on the deep convolutional neural network to obtain a medical verification feature matrix; and a traceability result generation module 130, configured to determine whether a problem exists in the pharmaceutical production process based on the pharmaceutical verification feature matrix.
The medical data acquisition module 110 acquires data over a blockchain network. Through this module, information about a medicine can be obtained, such as its manufacturer, batch number, and date of production. Blockchain technology is decentralized, tamper-resistant, and traceable, which helps ensure the authenticity and credibility of the medical data. By accessing different blockchain networks, the module can acquire medical data from different sources, making data collection comprehensive and accurate.
The natural language processing module 120 applies natural language processing based on a deep convolutional neural network to the medical data and extracts a medical verification feature matrix. It identifies the key information in the medical data and encodes it in that matrix. The matrix is intended to be accurate and reliable enough to support the subsequent tracing and verification of the medicine.
The traceability result generation module 130 determines, based on the medical verification feature matrix, whether a problem exists in the medicine production process. It analyzes the matrix against the records of the production process to judge whether the process is abnormal. If an abnormality is found, the module generates a tracing result and issues a warning in time, helping to ensure the safety and reliability of the production process.
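The way the three modules hand data to one another can be sketched as follows. This is only a structural illustration under stated assumptions: the class and function names (`MedicalData`, `TraceabilitySystem`, `fake_acquire`, `fake_encode`, `fake_judge`) are hypothetical, and the toy stand-ins are not the logic described in this document.

```python
from dataclasses import dataclass
from typing import Callable, List

Matrix = List[List[float]]


@dataclass
class MedicalData:
    production_process: str        # free-text production record
    production_specification: str  # free-text standard procedure


@dataclass
class TraceabilitySystem:
    acquire: Callable[[str], MedicalData]    # stands in for module 110
    encode: Callable[[MedicalData], Matrix]  # stands in for module 120
    judge: Callable[[Matrix], bool]          # stands in for module 130

    def trace(self, batch_id: str) -> bool:
        """Return True when a problem is flagged for the given batch."""
        data = self.acquire(batch_id)
        matrix = self.encode(data)
        return self.judge(matrix)


# Toy stand-ins so the sketch runs end to end (hypothetical, for illustration).
def fake_acquire(batch_id: str) -> MedicalData:
    return MedicalData("mix heat fill", "mix heat cool fill")


def fake_encode(data: MedicalData) -> Matrix:
    process = data.production_process.split()
    spec = data.production_specification.split()
    # 1.0 where a recorded step matches a specified step, else 0.0
    return [[1.0 if p == s else 0.0 for s in spec] for p in process]


def fake_judge(matrix: Matrix) -> bool:
    # Flag a problem when some specified step is matched by no recorded step.
    columns = len(matrix[0])
    return any(all(row[j] == 0.0 for row in matrix) for j in range(columns))


system = TraceabilitySystem(fake_acquire, fake_encode, fake_judge)
problem_found = system.trace("BATCH-001")
```

In this toy run the specified step "cool" has no counterpart in the recorded process, so the stand-in judge flags the batch.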
The medicine traceability system based on data intelligence integrates blockchain, natural language processing, and traceability result generation technologies. It offers high data reliability and traceability accuracy and can effectively safeguard the safety and reliability of the medicine production process.
First, the system adopts blockchain technology, a decentralized distributed database technology that can record every link in the medicine production process and thereby enable tracing across the whole life cycle of a medicine. Because a blockchain is decentralized, tamper-resistant, secure, and reliable, it can effectively ensure the safety and reliability of the production process.
Second, the system adopts natural language processing technology, which analyzes and processes the text information generated during production and thereby improves the reliability and accuracy of the data. Natural language processing can convert that text into structured data, which simplifies subsequent analysis and processing.
Finally, the system adopts traceability result generation technology, which generates traceability results from the production data so that the state of the production process can be understood. This technology analyzes and processes the production data and produces a visual result that is easy to view and understand.
According to the application, the medicine traceability system based on data intelligence offers four benefits. First, it helps ensure the medication safety of patients: the system can trace the production, circulation, and sale of each medicine, so that problems are discovered in time and measures can be taken. Second, it helps maintain order in the medicine market: the system can monitor the market and keep counterfeit and inferior medicines from entering it. Third, it improves medicine management efficiency: the system records, traces, and manages medicine information automatically, which reduces the cost in manpower and material resources. Fourth, it strengthens the competitiveness of pharmaceutical enterprises by raising their management level and product quality.
In other words, the medicine traceability system based on data intelligence is a very important medicine supervision measure, can effectively guarantee the medication safety of patients, maintain the medicine market order, improve the medicine management efficiency, strengthen the competitiveness of medicine enterprises, and has important significance for promoting the healthy development of medicine industry and guaranteeing the physical health of people.
Specifically, the medical data acquisition module 110 is configured to acquire medical data using a blockchain network. Wherein the medical data comprises medical production process data and medical production standard procedures. Aiming at the technical problems, the technical conception of the application is as follows: the comprehensive tracking and supervision of the medicine production process are realized by using a natural language processing technology and a blockchain technology based on deep learning, and the medicine production quality and safety are improved.
Specifically, in the technical scheme of the application, firstly, the data of the medicine production process is downloaded from a blockchain network, and the medicine production standard flow is obtained. It should be appreciated that in a pharmaceutical manufacturing process, multiple parties may generate associated manufacturing data, such as from manufacturers, wholesalers to pharmacies, hospitals, etc., whose interactions involve multiple links. The traditional tracing method usually relies on manual recording and transmission, has the problems of incomplete data, repeated data, manual operation and the like, and is difficult to ensure the reliability and the effectiveness of the data. The block chain can ensure the consistency and the integrity of the data on each node by decentralizing storage and verification of the data, meanwhile, the possibility of manually tampering the data in an intermediate link is avoided, and the reliability of the data is ensured. In addition, the pharmaceutical production specification flow describes the standard and specification of pharmaceutical production, and is an important reference and guiding basis for pharmaceutical production. And acquiring a medical production standard flow, and utilizing semantic information of the standard flow, each link of the production process can be better monitored and checked.
Blockchain is a decentralized distributed database technology that records transaction data and other types of information in a tamper-resistant way. A blockchain network consists of multiple nodes, each of which can record and verify transaction data; a consensus algorithm ensures the security and integrity of the network.
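The tamper-resistance property mentioned above can be illustrated with a minimal hash chain. This is a single-process sketch under stated assumptions: it omits nodes, networking, and consensus entirely, and the function names and record fields (`append_block`, `is_valid`, `batch`, `step`) are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first block


def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a record that commits to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    })


def is_valid(chain):
    """Recompute every hash; any edited payload breaks the chain."""
    prev_hash = GENESIS_HASH
    for index, block in enumerate(chain):
        if block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


chain = []
append_block(chain, {"batch": "B001", "step": "raw material purchase"})
append_block(chain, {"batch": "B001", "step": "quality inspection"})
valid_before = is_valid(chain)

chain[0]["payload"]["step"] = "edited"  # simulate tampering with a stored record
valid_after = is_valid(chain)
```

Because each block stores the hash of its predecessor, changing an earlier record invalidates the stored hash of that block, and the check fails.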
Blockchain networks may be used in a variety of applications, such as finance, logistics, medicine, copyright protection, and the like. In the financial field, blockchains can be used to implement decentralised digital currency transactions, improving transaction efficiency and security. In the logistics field, the blockchain can be used for realizing transparency of cargo tracking and transaction records, improving logistics efficiency and reducing fraud. In the medical field, the blockchain can be used for realizing safe sharing and privacy protection of medical data and improving the quality and efficiency of medical service. In the field of copyright protection, a blockchain can be used for protecting and managing digital copyright, and rights and interests protection and income of copyrighters are improved.
It should be understood that blockchain technology can record and track each link in the medicine production process, including raw material purchasing, production and processing, and quality inspection and acceptance. This makes the production process transparent, improves its traceability and credibility, and helps guarantee the quality and safety of medicine production. Moreover, through blockchain technology, supervision departments can obtain production process data from each of these links in real time and can supervise medicine enterprises and control risk in real time, which helps protect patient medication safety and maintain order in the medicine market. Blockchain technology also allows the links of the production process to share data and cooperate, which can optimize the production flow and improve production efficiency, helping to reduce production cost and improve the competitiveness of enterprises.
Specifically, the natural language processing module 120 is configured to perform natural language processing based on a deep convolutional neural network on the medical data to obtain a medical verification feature matrix. Fig. 2 is a block diagram of the natural language processing module in the medicine traceability system based on data intelligence according to an embodiment of the present invention. As shown in Fig. 2, the natural language processing module 120 includes: a preprocessing unit 121, configured to perform data preprocessing on the medical data to obtain a sequence of production process descriptors and a sequence of medical production specification flow descriptors; a semantic analysis and understanding unit 122, configured to perform embedded encoding and semantic understanding on the sequence of production process descriptors and the sequence of pharmaceutical production specification flow descriptors to obtain a production process semantic understanding feature vector and a pharmaceutical production specification flow semantic understanding feature vector; and a semantic association unit 123, configured to perform association encoding on the production process semantic understanding feature vector and the pharmaceutical production specification flow semantic understanding feature vector to obtain the medical verification feature matrix.
First, the preprocessing unit 121 includes: a production process description processing subunit, configured to perform data cleaning and word segmentation processing on the pharmaceutical production process data to obtain a sequence of the production process description words; and the medical production specification flow description subunit is used for carrying out data cleaning and word segmentation processing on the medical production specification flow to obtain a sequence of the medical production specification flow description words.
Because raw medical production process data cannot be processed directly by a computer, the technical scheme of the application applies data cleaning and word segmentation to the medical production process data to obtain a sequence of production process description words, which a computer can identify and process automatically. Here, data cleaning means removing clutter and useless content from the data, including noise and repeated data; word segmentation means splitting a sentence or a piece of text into meaningful words for processing by a computer. The medical production specification flow likewise requires preprocessing before a computer can identify and process it. That is, the medical production specification flow is subjected to data cleaning and word segmentation to obtain a sequence of medical production specification flow descriptors. It should be noted that the medical production specification flow is usually a complete document in a standard format. During data cleaning, useless information in that document, such as headers and footers, is removed so that only the essential content of the specification flow is retained, which improves processing efficiency.
In one embodiment of the application, the data cleaning includes the following steps:
1. Removing special characters: special characters in the text, such as punctuation marks, numbers, and spaces, are removed.
2. Removing stop words: common words that carry no practical meaning for text analysis, such as "is" and "in", are removed.
3. Correcting spelling errors: spelling errors in the text are corrected, for example changing "hospitl" to "hospital".
4. Expanding abbreviations: abbreviations in the text are expanded, for example "USA" to "United States of America".
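The four cleaning steps can be sketched in a few lines. This is a minimal illustration, not the implementation used by the system: the word lists (`STOP_WORDS`, `SPELLING_FIXES`, `ABBREVIATIONS`) and the function name `clean_text` are hypothetical, and the steps run in a different order from the list so that abbreviations are expanded while capitalization is still intact.

```python
import re

# Hypothetical lookup tables; a real system would use far larger resources.
STOP_WORDS = {"the", "a", "an", "is", "in", "of", "and", "to"}
SPELLING_FIXES = {"hospitl": "hospital"}
ABBREVIATIONS = {"USA": "United States of America"}


def clean_text(text):
    # Step 4: expand abbreviations first, while case is still intact.
    for short, full in ABBREVIATIONS.items():
        text = re.sub(rf"\b{re.escape(short)}\b", full, text)
    # Step 1: replace everything except letters and whitespace, then lowercase.
    text = re.sub(r"[^A-Za-z\s]", " ", text).lower()
    words = text.split()
    # Step 3: correct known spelling errors.
    words = [SPELLING_FIXES.get(word, word) for word in words]
    # Step 2: drop stop words.
    words = [word for word in words if word not in STOP_WORDS]
    return " ".join(words)


cleaned = clean_text("Batch #42 is shipped to the hospitl in USA.")
```

Note that stripping numbers, as step 1 describes, also strips identifiers such as batch numbers, so a practical pipeline would likely extract those fields before cleaning.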
Further, the word segmentation process includes the following steps:
1. Word segmentation: the text is divided into individual words. Common word segmentation tools such as jieba or NLTK may be used.
2. Stemming: each segmented word is reduced to its stem, so that, for example, "running" and "run" are treated as the same word.
3. Part-of-speech tagging: each segmented word is tagged with its part of speech, for example "running" as a verb and "book" as a noun.
4. Removing low-frequency words: words that occur rarely in the text, such as words that occur only once or twice, are removed.
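A dependency-free sketch of segmentation, stemming, and low-frequency filtering follows. It is an illustration under stated assumptions: the regex tokenizer and the suffix-stripping `naive_stem` are deliberately crude stand-ins for tools such as jieba or NLTK, part-of-speech tagging is omitted because it needs a trained tagger, and all names and the `min_count` threshold are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Split English text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(word):
    """Strip a few common suffixes; a toy stand-in for a real stemmer."""
    for suffix in ("ning", "ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def segment(text, min_count=2):
    """Tokenize, stem, and drop stems that occur fewer than min_count times."""
    stems = [naive_stem(token) for token in tokenize(text)]
    counts = Counter(stems)
    return [stem for stem in stems if counts[stem] >= min_count]


tokens = segment("Running mixer. Mixer run twice. Mixers run hot.", min_count=2)
```

Here "running" and "run" collapse to the same stem, and the words that occur only once ("twice", "hot") are filtered out.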
It should be appreciated that performing data cleansing and word segmentation processing on pharmaceutical production process data may remove noise and redundant information from the raw data while converting the textual data into a sequence of machine-readable words. This helps to improve the quality and accuracy of the data and provides a basis for subsequent data analysis and mining. Through the processing, each link and each step in the medicine production process can be better understood, so that the whole production process can be better monitored and managed, and the production efficiency and the product quality are improved.
Similarly, performing data cleaning and word segmentation on the medical production specification flow converts the textual descriptions in the specification into a machine-readable word sequence and provides a foundation for subsequent analysis and mining of the specification flow. This helps to better understand the steps and requirements in the pharmaceutical production specification flow, and thereby to better control the production process and ensure product quality and compliance.
Next, the semantic analysis and understanding unit 122 includes: a production process semantic understanding subunit, configured to pass the sequence of production process descriptors through a first natural language processing model that includes a word embedding layer to obtain the production process semantic understanding feature vector; and a specification flow semantic understanding subunit, configured to pass the sequence of medical production specification flow descriptors through a second natural language processing model that includes a word embedding layer to obtain the medical production specification flow semantic understanding feature vector.
Further, the production process semantic understanding subunit is configured to: mapping each production process descriptor in the sequence of production process descriptors to a word vector using the word embedding layer of the first natural language processing model including the word embedding layer to obtain the production process semantic understanding feature vector.
The sequence of process descriptors is then passed through a first natural language processing model that includes a word embedding layer to obtain a process semantic understanding feature vector. The word embedding layer can perform embedded coding on the sequence of the production process descriptors to convert the text description into a vector form which can be processed by a computer, so that the computer can better understand the semantic content of the production process.
In an embodiment of the present application, the first natural language processing model is a context encoder. That is, the sequence of process descriptors is globally context-semantically understood with a context encoder, thereby capturing an overall semantic representation of the pharmaceutical production process. It is worth mentioning that the core of the context encoder is the BERT model, which can use all lexical information in the text at the same time when processing the text, instead of just accumulating the previous information as in a conventional recurrent neural network. That is, the BERT model can better capture the dependency relationship and semantic information among the words in the text, and the expressive power of the model is improved.
And similarly, passing the sequence of the medical production specification flow description words through a second natural language processing model containing a word embedding layer to obtain the medical production specification flow semantic understanding feature vector. The second natural language processing model also adopts a context encoder based on the BERT model to capture the long-distance semantic association representation in the text description of the pharmaceutical production standard flow by using the BERT model.
A natural language processing model that includes a word embedding layer refers to a model that converts text data into a machine-readable word vector representation in a natural language processing task. Word embedding refers to converting each word in the original text data into a high-dimensional vector such that the vectors can represent semantic and grammatical relationships between words. In this way, natural language processing tasks can be performed by computing the similarity between vectors.
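The idea of comparing words through the similarity of their vectors can be shown with cosine similarity. The three-dimensional embeddings below are made-up values for illustration only; real embeddings are learned and have far more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical embeddings chosen so that related words point the same way.
embeddings = {
    "tablet": [0.9, 0.1, 0.0],
    "capsule": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

sim_close = cosine_similarity(embeddings["tablet"], embeddings["capsule"])
sim_far = cosine_similarity(embeddings["tablet"], embeddings["invoice"])
```

With these values, "tablet" scores much closer to "capsule" than to "invoice", which is the behavior a trained embedding layer is expected to exhibit for semantically related words.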
In the medical traceability system, two natural language processing models comprising word embedding layers are used for processing the production process description and the medical production specification flow description respectively. These models can translate raw text data into machine-readable feature vectors, thereby enabling semantic understanding of the production process and specification flow. In this way, individual steps and specification requirements in the production process can be better understood and analyzed, thereby better managing and monitoring the overall production process.
Specifically, in one embodiment of the present application, the sequence of production process descriptors is passed through a first natural language processing model that includes a word embedding layer to obtain the production process semantic understanding feature vector for converting the description of the production process into a computer-understandable vector form for subsequent computation and processing. It comprises the following steps: the sequence of production process descriptors is subjected to word segmentation processing and is converted into individual words. Each word is converted into a vector using word embedding techniques, representing the position of the word in semantic space. And carrying out average pooling on vectors of all words to obtain a vector representation of the production process description, namely, the semantic understanding feature vector of the production process.
Further, the sequence of the medical production specification flow descriptor is processed through a second natural language processing model containing a word embedding layer to obtain the semantic understanding feature vector of the medical production specification flow, and the semantic understanding feature vector is used for converting the description of the medical production specification flow into a vector form which can be understood by a computer so as to facilitate subsequent calculation and processing. It comprises the following steps: the sequence of the medical production specification flow description words is subjected to word segmentation processing and is converted into individual words. Each word is converted into a vector using word embedding techniques, representing the position of the word in semantic space. And carrying out average pooling on vectors of all words to obtain a vector representation of the pharmaceutical production specification flow description, namely semantic understanding feature vectors of the pharmaceutical production specification flow.
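The embed-then-average procedure described in the two paragraphs above can be sketched as follows. The embedding table, its two-dimensional vectors, and the zero vector used for unknown words are assumptions made for illustration; in practice the table would be learned by the word embedding layer.

```python
def embed(tokens, table, dim):
    """Look up each token's vector; unknown tokens map to a zero vector."""
    unknown = [0.0] * dim
    return [table.get(token, unknown) for token in tokens]


def mean_pool(vectors):
    """Average a list of equal-length vectors component by component."""
    if not vectors:
        raise ValueError("cannot pool an empty sequence")
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


# Hypothetical 2-dimensional embedding table.
table = {"mix": [1.0, 0.0], "heat": [0.0, 1.0], "fill": [1.0, 1.0]}

process_vector = mean_pool(embed(["mix", "heat", "fill"], table, dim=2))
```

The same two functions would be applied to the specification flow descriptors to obtain the second feature vector, so both descriptions end up as vectors of the same dimension.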
Finally, the semantic association unit 123 includes: an association coding subunit, configured to perform association coding on the production process semantic understanding feature vector and the medical production specification flow semantic understanding feature vector to obtain an initial medical verification feature matrix; a multisource information fusion pre-verification distribution evaluation optimization subunit, configured to perform multisource information fusion pre-verification distribution evaluation optimization on each row feature vector in the initial medical verification feature matrix to obtain a plurality of optimized row feature vectors; and an arrangement subunit, configured to arrange the plurality of optimized row feature vectors into the medical verification feature matrix.
Association coding is the process of associating two or more feature vectors to obtain a new feature vector. Association coding methods include, but are not limited to:
Concatenation encoding: the two feature vectors are spliced together to obtain a new feature vector.
Weighted sum encoding: a weighted sum of the two feature vectors is taken to obtain a new feature vector.
Product encoding: the two feature vectors are multiplied element by element to obtain a new feature vector.
Dot product encoding: a dot product of the two feature vectors yields a scalar value that serves as an element of the new feature vector.
Neural network encoding: a neural network encodes the two feature vectors to obtain a new feature vector.
The production process semantic understanding feature vector and the pharmaceutical production specification flow semantic understanding feature vector are association-coded to obtain the medical verification feature matrix. That is, association coding constructs a combined semantic representation of the two feature vectors. In other words, the medical verification feature matrix reflects the degree of matching between the actual medicine production process and the standard production flow.
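Four of the association coding variants listed above reduce to one-line vector operations, sketched here on plain Python lists. The function names and the `alpha` weight are illustrative assumptions; neural network encoding is omitted because it requires a trained model.

```python
def concat_encoding(u, v):
    """Splice the two vectors end to end."""
    return list(u) + list(v)


def weighted_sum_encoding(u, v, alpha=0.5):
    """Blend the vectors; alpha is the weight given to u."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(u, v)]


def product_encoding(u, v):
    """Multiply the vectors element by element."""
    return [a * b for a, b in zip(u, v)]


def dot_product_encoding(u, v):
    """Reduce the two vectors to a single scalar."""
    return sum(a * b for a, b in zip(u, v))


u = [1.0, 2.0]
v = [3.0, 4.0]

concatenated = concat_encoding(u, v)     # [1.0, 2.0, 3.0, 4.0]
blended = weighted_sum_encoding(u, v)    # [2.0, 3.0]
elementwise = product_encoding(u, v)     # [3.0, 8.0]
scalar = dot_product_encoding(u, v)      # 11.0
```

Concatenation doubles the dimension and keeps both inputs intact, whereas the other three require equal-length vectors and mix the inputs into a result of the same or lower dimension.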
In another embodiment of the present application, a method for performing association coding on the production process semantic understanding feature vector and the medicine production specification flow semantic understanding feature vector includes: first, concatenating the production process semantic understanding feature vector and the medicine production specification flow semantic understanding feature vector into a new feature vector, which serves as the input for medicine verification; then, normalizing the concatenated feature vector so that each feature value lies between 0 and 1; then, taking the normalized feature vector as input and training a neural network model to obtain the medicine verification feature matrix; and finally, checking new medicine products on this basis, thereby ensuring the quality and compliance of the medicine products.
Association coding of the production process semantic understanding feature vector and the medicine production specification flow semantic understanding feature vector establishes the association between the two and yields the medicine verification feature matrix. Moreover, the medicine verification feature matrix reflects the information in the medicine production process more comprehensively and accurately, including information on the individual production steps, the standard flow, and quality control. This information can be used in the medicine traceability system so that regulators and consumers can better understand the production process and quality of medicines, thereby safeguarding people's health and safety.
In the technical solution of the present application, when the production process semantic understanding feature vector and the medicine production specification flow semantic understanding feature vector are association-coded to obtain the medicine verification feature matrix, the two vectors are associated position by position. Each row feature vector of the medicine verification feature matrix can therefore be regarded as the association feature vector between one feature value of the production process semantic understanding feature vector and the whole medicine production specification flow semantic understanding feature vector, and the medicine verification feature matrix is equivalent to a combined feature set of the local feature sets corresponding to the individual row feature vectors.
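One simple reading of this position-by-position association is an outer product, in which row i of the matrix is the i-th feature value of the production process vector multiplied by the whole specification flow vector. The sketch below assumes that reading; the application does not state the exact operation, and the function name and sample values are hypothetical.

```python
import numpy as np

def associate_position_wise(process_vec, spec_vec):
    # Assumed form of the association: an outer product, so that
    # row i equals process_vec[i] * spec_vec.
    return np.outer(process_vec, spec_vec)

# Hypothetical feature vectors; the two lengths may differ.
process_vec = np.array([1.0, 2.0, 3.0])
spec_vec = np.array([0.5, 0.1, 0.2, 0.4])

initial_matrix = associate_position_wise(process_vec, spec_vec)
second_row = initial_matrix[1]  # association of process_vec[1] with all of spec_vec
```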
Moreover, because the feature distributions of the row feature vectors are linked by the text semantic association feature distribution of the medicine production process data, as expressed by the production process semantic understanding feature vector, the row feature vectors have, in addition to their mutual neighborhood distribution relationship, a multi-source information association relationship corresponding to the text semantic distribution information of the medicine production process data.
Therefore, in order to improve how well the medicine verification feature matrix as a whole expresses the text semantic association features of the medicine production process data, the applicant of the present application performs multi-source information fusion pre-verification distribution evaluation optimization on each row feature vector, denoted for example as V_i, to obtain an optimized row feature vector V_i'. Specifically, multi-source information fusion pre-verification distribution evaluation optimization is performed on each row feature vector in the initial medicine verification feature matrix by using an optimization formula to obtain a plurality of optimized row feature vectors, wherein the optimization formula is:
wherein V_j is the j-th row feature vector in the initial medicine verification feature matrix, the mean feature vector and position-wise subtraction also appear in the formula, n is the neighborhood-setting hyperparameter, log denotes the logarithm to base 2, and V_i' is the i-th optimized row feature vector among the plurality of optimized row feature vectors.
Here, for a local feature set composed of a plurality of mutually associated neighborhood parts, the multi-source information fusion pre-verification distribution evaluation optimization uses quasi-maximum-likelihood estimation of feature distribution fusion robustness to effectively fold the pre-verification information of each feature vector onto the local synthesized distribution. By constructing the pre-verification distribution under multi-source conditions, it yields an optimization paradigm of standard expected fusion information that can evaluate both the associations within a set and the changes between sets, thereby improving the information expression of feature vector fusion based on multi-source information association. Arranging the optimized row feature vectors V_i' into the medicine verification feature matrix therefore improves how well the matrix as a whole expresses the text semantic association features of the medicine production process data at different scales.
Further, arranging the plurality of optimized row feature vectors into the medicine verification feature matrix includes: arranging the row feature vectors into a matrix in a given order, so that each row of the matrix corresponds to one row feature vector; normalizing each row so that its elements lie in the range [0, 1]; weighting the columns of the matrix, assigning different weights to different columns, to obtain a weighted matrix; performing dimension reduction on the weighted matrix to convert it into a low-dimensional feature matrix; performing cluster analysis on the feature matrix to divide it into categories, each category representing one medicine product; and checking the medicine of each category to verify whether its production process meets the standard requirements, thereby ensuring the quality and safety of the medicine products.
Therefore, various information in the medicine production process can be integrated to form a comprehensive feature matrix, so that medicine check and tracing can be more accurately performed. Meanwhile, the method can also improve the transparency and traceability of the medicine production process and ensure the health and safety of people.
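The first four arrangement steps (stacking, row normalization, column weighting, and dimension reduction) can be sketched as follows. This is a sketch under stated assumptions: min-max scaling is used for the row normalization, a principal-component projection computed by SVD stands in for the unspecified dimension reduction, and the column weights and sample rows are hypothetical. The clustering and checking steps are omitted.

```python
import numpy as np

def arrange_rows(rows):
    # Stack the row feature vectors in the given order.
    return np.vstack(rows)

def normalize_rows(matrix, eps=1e-12):
    # Min-max scale every row into [0, 1]; eps guards constant rows.
    lo = matrix.min(axis=1, keepdims=True)
    hi = matrix.max(axis=1, keepdims=True)
    return (matrix - lo) / np.maximum(hi - lo, eps)

def weight_columns(matrix, weights):
    # Multiply each column by its own weight.
    return matrix * np.asarray(weights)[np.newaxis, :]

def reduce_dims(matrix, k):
    # Project the centered rows onto the top-k principal directions
    # (an assumed choice; the text does not name a reduction method).
    centered = matrix - matrix.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

# Hypothetical optimized row feature vectors.
rows = [
    np.array([1.0, 2.0, 3.0, 5.0]),
    np.array([0.0, 4.0, 2.0, 2.0]),
    np.array([3.0, 3.0, 9.0, 6.0]),
]
matrix = arrange_rows(rows)
normalized = normalize_rows(matrix)
weighted = weight_columns(normalized, [1.0, 0.5, 2.0, 1.0])
reduced = reduce_dims(weighted, k=2)
```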
Specifically, the traceability result generation module 130 is configured to determine, based on the medicine verification feature matrix, whether a problem exists in the medicine production process. The traceability result generation module is configured to pass the medicine verification feature matrix through a classifier to obtain a classification result, where the classification result indicates whether a problem exists in the medicine production process.
The medicine verification feature matrix is then passed through the classifier to obtain the classification result, which indicates whether a problem exists in the medicine production process. Specifically, the classifier takes the medicine verification feature matrix as input and, according to its features, labels the input sample as either "the medicine production process has a problem" or "the medicine production process has no problem". In doing so, the classifier judges whether the input data indicates a problem in the medicine production process by learning and locating the important features in the medicine verification feature matrix. In practical applications, the classification result can guide the timely resolution of problems in the production process, thereby improving the quality and safety of medicine production.
The traceability result generation module 130 includes: a matrix unfolding unit, configured to unfold the medicine verification feature matrix into a classification feature vector by row vectors or column vectors; a fully connected coding unit, configured to perform fully connected coding on the classification feature vector by using a plurality of fully connected layers of the classifier to obtain a coded classification feature vector; and a classification unit, configured to pass the coded classification feature vector through the Softmax classification function of the classifier to obtain the classification result.
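The three units above (matrix unfolding, fully connected coding, Softmax classification) correspond to a short forward pass. The sketch below is illustrative only: the layer sizes, the ReLU activation between layers, the random untrained weights, and the label strings are all assumptions made for the example, so the predicted label carries no meaning.

```python
import numpy as np

def softmax(z):
    # Numerically stable Softmax over a 1-D score vector.
    shifted = z - z.max()
    exp = np.exp(shifted)
    return exp / exp.sum()

def classify(matrix, layers, labels):
    # Matrix unfolding unit: unfold the matrix row by row.
    x = matrix.reshape(-1)
    # Fully connected coding unit: apply each layer in turn, with an
    # assumed ReLU between layers (not after the last one).
    for index, (weight, bias) in enumerate(layers):
        x = weight @ x + bias
        if index < len(layers) - 1:
            x = np.maximum(x, 0.0)
    # Classification unit: Softmax over the two class scores.
    probs = softmax(x)
    return labels[int(np.argmax(probs))], probs

rng = np.random.default_rng(0)
matrix = rng.random((3, 4))  # hypothetical medicine verification feature matrix
layers = [
    (rng.standard_normal((8, 12)), np.zeros(8)),  # untrained, random weights
    (rng.standard_normal((2, 8)), np.zeros(2)),
]
labels = ("problem", "no problem")
label, probs = classify(matrix, layers, labels)
```

In a real system the layer weights would come from training on labeled production records rather than from a random generator.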
A classifier is a machine learning model that assigns input data to different classes. In the medicine traceability system, the classifier judges whether a problem exists in the medicine production process so that problems can be discovered and resolved in time. In another embodiment of the present application, the classification includes: combining the production process semantic understanding feature vector and the medicine production specification flow semantic understanding feature vector into a feature matrix according to a given rule; cleaning and preprocessing the feature matrix data to remove noise and improve data quality; and classifying the feature matrix with a trained classifier model to obtain a classification result. The classification result indicates whether a problem exists in the medicine production process. If the classification result is normal, there is no problem in the medicine production process; if the classification result is abnormal, there is a problem in the medicine production process that needs to be handled in time.
The classifier may employ various machine learning algorithms such as Support Vector Machines (SVMs), decision trees, random forests, and the like. The classifier can classify the medicine check feature matrix to judge whether the medicine production process has problems. Specifically, feature vectors of known pass and fail pharmaceutical production processes are input into a classifier, allowing the classifier to learn how to classify these feature vectors into two classes. Then, when a new pharmaceutical production process is encountered, the feature vector of the production process may be input into a classifier, which determines whether the production process is acceptable according to the learned rules. Therefore, whether the medicine production process has problems or not can be rapidly and accurately judged, and the health and safety of people are ensured.
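The train-then-predict workflow described above can be sketched with a small linear support vector machine trained by subgradient descent on the hinge loss. This is a minimal sketch, not the application's classifier: the training routine, its hyperparameters, the two-dimensional toy features, and the +1/-1 label convention (qualified/unqualified) are all assumptions for illustration.

```python
import numpy as np

def train_linear_svm(x, y, lr=0.1, lam=0.01, epochs=1000):
    # Full-batch subgradient descent on the L2-regularized hinge loss.
    n, d = x.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (x @ w + b)
        viol = margins < 1.0  # samples inside or beyond the margin
        grad_w = lam * w - (y[viol][:, None] * x[viol]).sum(axis=0) / n
        grad_b = -y[viol].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    # +1 means qualified, -1 means unqualified (assumed convention).
    return np.where(x @ w + b >= 0.0, 1, -1)

# Hypothetical feature vectors of known qualified (+1) and
# unqualified (-1) production processes.
x_train = np.array([[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]])
y_train = np.array([1.0, 1.0, -1.0, -1.0])

w, b = train_linear_svm(x_train, y_train)
new_processes = np.array([[0.85, 0.9], [0.15, 0.1]])
verdicts = predict(w, b, new_processes)
```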
In the present application, applying deep-learning-based natural language processing and blockchain technology to a medicine traceability system is of considerable significance. The medicine production process involves a large amount of text data, which must be cleaned and segmented into words before subsequent analysis and processing; deep-learning-based natural language processing can perform this data cleaning and word segmentation efficiently and accurately, improving the usability and interpretability of the data. Data in the medicine production process is also often unstructured and must be converted into structured feature vectors for subsequent analysis and processing; deep-learning-based natural language processing can likewise perform this feature extraction and coding efficiently and accurately. In addition, data in the medicine production process must be tamper-proof and secure so that comprehensive tracing and supervision are possible; a distributed ledger based on blockchain technology provides this tamper resistance and security, thereby improving the production quality and safety of medicines. Finally, the classifier classifies the medicine verification feature matrix to judge whether a problem exists in the medicine production process: feature vectors of medicine production processes known to be qualified or unqualified are input into the classifier so that it learns to divide them into two classes, and it then determines whether a given production process is qualified. In this way, the medicine production process can be comprehensively tracked and supervised, the quality and safety of medicine production are improved, and people's health and safety are safeguarded.
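The tamper resistance attributed to the blockchain ledger rests on hash chaining: each record stores the hash of its predecessor, so altering any earlier record invalidates every later link. The sketch below shows only that one idea with Python's standard library. It is not a blockchain network (there is no consensus or distribution), and the record fields and function names are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first record

def record_hash(payload, prev_hash):
    # Hash the payload together with the previous record's hash.
    body = json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append_record(chain, payload):
    # Link the new record to the current tail of the chain.
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({
        "payload": payload,
        "prev": prev_hash,
        "hash": record_hash(payload, prev_hash),
    })

def verify_chain(chain):
    # Recompute every hash; any edited record breaks the chain.
    prev_hash = GENESIS_HASH
    for block in chain:
        expected = record_hash(block["payload"], prev_hash)
        if block["prev"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True

# Hypothetical production records.
ledger = []
append_record(ledger, {"batch": "B-001", "step": "granulation"})
append_record(ledger, {"batch": "B-001", "step": "tableting"})
intact = verify_chain(ledger)
```

Editing the payload of the first record after the fact makes `verify_chain` return `False`, which is the property the traceability system relies on.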
In summary, the data-intelligence-based medicine traceability system 100 according to the embodiment of the present invention has been described; it uses deep-learning-based natural language processing and blockchain technology to trace and supervise the medicine production process comprehensively and to improve the quality and safety of medicine production.
As described above, the data-intelligence-based medicine tracing system 100 according to the embodiment of the present invention may be implemented in various terminal devices, for example, a server or the like for data-intelligence-based medicine tracing. In one example, the data intelligence based medical traceback system 100 according to an embodiment of the present invention may be integrated into the terminal device as one software module and/or hardware module. For example, the data intelligence based medical trace back system 100 may be a software module in the operating system of the terminal device or may be an application developed for the terminal device; of course, the data intelligence based medical traceability system 100 can also be one of a plurality of hardware modules of the terminal device.
Alternatively, in another example, the data-intelligence-based medicine traceability system 100 and the terminal device may be separate devices, and the data-intelligence-based medicine traceability system 100 may be connected to the terminal device via a wired and/or wireless network and exchange interaction information in an agreed data format.
Fig. 3 is a flowchart of a medicine tracing method based on data intelligence provided in an embodiment of the present invention. Fig. 4 is a schematic diagram of a system architecture of a medicine tracing method based on data intelligence according to an embodiment of the present invention. As shown in fig. 3 and 4, a medicine tracing method based on data intelligence includes: 210, acquiring medical data by using a blockchain network; 220, performing natural language processing based on a deep convolutional neural network on the medical data to obtain a medical verification feature matrix; and, 230, determining whether a problem exists in the medicine production process based on the medicine verification feature matrix.
It will be appreciated by those skilled in the art that the specific operation of the respective steps in the above-described data intelligence-based medicine tracing method has been described in detail in the above description of the data intelligence-based medicine tracing system with reference to fig. 1 to 2, and thus, repetitive descriptions thereof will be omitted.
Fig. 5 is an application scenario diagram of a medical traceability system based on data intelligence provided in an embodiment of the present invention. As shown in fig. 5, in this application scenario, first, medical data is acquired using a blockchain network (e.g., C as illustrated in fig. 5); the acquired medical data is then input into a server (e.g., S as illustrated in fig. 5) deployed with a data-intelligent-based medical traceability algorithm, wherein the server is capable of processing the medical data based on the data-intelligent medical traceability algorithm to determine whether a problem exists with the medical production process.
The foregoing description of the embodiments is provided to illustrate the general principles of the invention and is not intended to limit the scope of the invention to the particular embodiments described; any modifications, equivalents, improvements, and the like that fall within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (7)

1. A medicine traceability system based on data intelligence, characterized by comprising:
a medical data acquisition module, configured to acquire medical data by using a blockchain network;
a natural language processing module, configured to perform natural language processing based on a deep convolutional neural network on the medical data to obtain a medicine verification feature matrix; and
a traceability result generation module, configured to determine, based on the medicine verification feature matrix, whether a problem exists in the medicine production process;
wherein the natural language processing module comprises:
a preprocessing unit, configured to perform data preprocessing on the medical data to obtain a sequence of production process descriptors and a sequence of medicine production specification flow descriptors;
a semantic analysis and understanding unit, configured to perform embedding coding and semantic understanding on the sequence of production process descriptors and the sequence of medicine production specification flow descriptors, respectively, to obtain a production process semantic understanding feature vector and a medicine production specification flow semantic understanding feature vector; and
a semantic association unit, configured to perform association coding on the production process semantic understanding feature vector and the medicine production specification flow semantic understanding feature vector to obtain the medicine verification feature matrix;
wherein the semantic association unit comprises:
an association coding subunit, configured to perform association coding on the production process semantic understanding feature vector and the medicine production specification flow semantic understanding feature vector to obtain an initial medicine verification feature matrix;
a multi-source information fusion pre-verification distribution evaluation optimization subunit, configured to perform multi-source information fusion pre-verification distribution evaluation optimization on each row feature vector in the initial medicine verification feature matrix to obtain a plurality of optimized row feature vectors; and
an arrangement subunit, configured to arrange the plurality of optimized row feature vectors into the medicine verification feature matrix;
wherein the multi-source information fusion pre-verification distribution evaluation optimization subunit is configured to: perform multi-source information fusion pre-verification distribution evaluation optimization on each row feature vector in the initial medicine verification feature matrix by using an optimization formula to obtain the plurality of optimized row feature vectors;
wherein the optimization formula is:
wherein V_j is the j-th row feature vector in the initial medicine verification feature matrix, the mean feature vector and position-wise subtraction also appear in the formula, n is the neighborhood-setting hyperparameter, log denotes the logarithm to base 2, and V_i' is the i-th optimized row feature vector among the plurality of optimized row feature vectors.
2. The medicine traceability system based on data intelligence of claim 1, wherein the medical data includes medicine production process data and a medicine production specification flow.
3. The medicine traceability system based on data intelligence of claim 2, wherein the preprocessing unit comprises:
a production process description processing subunit, configured to perform data cleaning and word segmentation on the medicine production process data to obtain the sequence of production process descriptors; and
a medicine production specification flow description subunit, configured to perform data cleaning and word segmentation on the medicine production specification flow to obtain the sequence of medicine production specification flow descriptors.
4. The medicine traceability system based on data intelligence of claim 3, wherein the semantic analysis and understanding unit comprises:
a production process semantic understanding subunit, configured to pass the sequence of production process descriptors through a first natural language processing model that includes a word embedding layer to obtain the production process semantic understanding feature vector; and
a specification flow semantic understanding subunit, configured to pass the sequence of medicine production specification flow descriptors through a second natural language processing model that includes a word embedding layer to obtain the medicine production specification flow semantic understanding feature vector.
5. The data intelligence based medical traceback system of claim 4, wherein the production process semantic understanding subunit is configured to: mapping each production process descriptor in the sequence of production process descriptors to a word vector using the word embedding layer of the first natural language processing model including the word embedding layer to obtain the production process semantic understanding feature vector.
6. The medicine traceability system based on data intelligence of claim 5, wherein the traceability result generation module is configured to: pass the medicine verification feature matrix through a classifier to obtain a classification result, wherein the classification result indicates whether a problem exists in the medicine production process.
7. A data intelligence based medical tracing method for the data intelligence based medical tracing system of claim 1, comprising:
acquiring medical data by using a blockchain network;
performing natural language processing based on a deep convolutional neural network on the medical data to obtain a medicine verification feature matrix; and
determining, based on the medicine verification feature matrix, whether a problem exists in the medicine production process.
CN202310815511.8A 2023-07-04 2023-07-04 Medicine traceability system and method based on data intelligence Active CN116777295B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310815511.8A CN116777295B (en) 2023-07-04 2023-07-04 Medicine traceability system and method based on data intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310815511.8A CN116777295B (en) 2023-07-04 2023-07-04 Medicine traceability system and method based on data intelligence

Publications (2)

Publication Number Publication Date
CN116777295A CN116777295A (en) 2023-09-19
CN116777295B true CN116777295B (en) 2024-06-14

Family

ID=88013234

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310815511.8A Active CN116777295B (en) 2023-07-04 2023-07-04 Medicine traceability system and method based on data intelligence

Country Status (1)

Country Link
CN (1) CN116777295B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107220506A (en) * 2017-06-05 2017-09-29 东华大学 Breast cancer risk assessment analysis system based on deep convolutional neural network

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112883149B (en) * 2021-01-20 2024-03-26 华为技术有限公司 Natural language processing method and device
CN114579739B (en) * 2022-01-12 2023-05-30 中国电子科技集团公司第十研究所 Topic detection and tracking method for text data stream
CN115222566A (en) * 2022-08-02 2022-10-21 吴若涵 Learning method and system for international finance and finance metrology teaching

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
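Research on the Construction of a Drug Traceability System Based on Blockchain Technology (基于区块链技术的药品追溯体系构建研究); 曹允春; 李彤; 林浩楠; 科技管理研究 (Science and Technology Management Research); 2020-08-20 (No. 16); full text *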

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant