CN116246761A - Scheduling policy determination method, device and equipment for sampling resources and storage medium - Google Patents

Info

Publication number
CN116246761A
Authority
CN
China
Prior art keywords
entity
sampling
biological sampling
biological
record
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310124037.4A
Other languages
Chinese (zh)
Inventor
赫甲帅
左嘉琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing SoundAI Technology Co Ltd
Original Assignee
Beijing SoundAI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing SoundAI Technology Co Ltd filed Critical Beijing SoundAI Technology Co Ltd
Priority to CN202310124037.4A priority Critical patent/CN116246761A/en
Publication of CN116246761A publication Critical patent/CN116246761A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/20 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Epidemiology (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method, a device, equipment and a storage medium for determining a scheduling strategy of sampling resources, and relates to the technical field of artificial intelligence, wherein the method comprises the following steps: acquiring a first biological sampling record in a first historical time period, wherein the first biological sampling record comprises related information of a first biological sampling resource and a sampling object in the first historical time period; inputting the first biological sampling record into a data analysis scheduling model, and determining an entity relation triplet, wherein the data analysis scheduling model is used for carrying out entity identification, entity classification and relation extraction on the entities of the target class on the first biological sampling record; constructing a data map based on the entity relation triplet and the first biological sampling record; based on the data map, predicting a scheduling strategy of biological sampling resources in a preset time period in the future. The invention can provide data support for biological sampling scheduling, effectively reduce overflow waste and improve the utilization rate of resources.

Description

Scheduling policy determination method, device and equipment for sampling resources and storage medium
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, a device, and a storage medium for determining a scheduling policy of a sampling resource.
Background
When citizens need biological sampling, such as a throat swab, they must find the nearest medical institution. When that institution is crowded, queuing for sampling consumes a great deal of time, so most citizens prefer to collect the biological sample themselves (self-sampling) and then obtain the detection result.
At present, biological self-sampling resources are usually scheduled by providing test tubes according to the number of people in a community, with community staff dispatching them manually and self-testing carried out on a one-person-one-tube basis. Because the biological sampling resources are scheduled in an overflow manner that exceeds actual demand, a great deal of manpower and material is wasted.
Disclosure of Invention
The invention provides a method, a device, equipment and a storage medium for determining a scheduling strategy of sampling resources, to overcome the prior-art defect that scheduling in an overflow manner wastes a large amount of manpower and material resources, thereby providing data support for biological sampling scheduling, effectively reducing overflow waste and improving resource utilization.
The invention provides a scheduling policy determining method of sampling resources, which comprises the following steps:
acquiring a first biological sampling record in a first historical time period, wherein the first biological sampling record comprises related information of a first biological sampling resource and a sampling object in the first historical time period;
inputting the first biological sampling record into a data analysis scheduling model and determining an entity relation triplet, wherein the data analysis scheduling model is used for performing entity identification and entity classification on the first biological sampling record and relation extraction on the entities of a target class, and the entities represent the related information of the biological sampling resource and the sampling object;
constructing a data map based on the entity relationship triplet and the first biological sampling record;
and predicting a scheduling strategy of the biological sampling resources in a preset time period in the future based on the data map.
According to the method for determining the scheduling policy of the sampling resources provided by the invention, the scheduling policy of the biological sampling resources in a preset time period in the future is predicted based on the data map, and the method comprises the following steps:
determining first biological sampling records in different first historical time periods corresponding to non-target categories based on the data map;
comparing the first biological sampling records in different first historical time periods to determine first deviation between the first biological sampling records in different historical time periods;
and predicting a scheduling strategy of the biological sampling resources in a future preset time period based on the first deviation and the first biological sampling record.
According to the method for determining the scheduling policy of the sampling resources provided by the invention, the scheduling policy of the biological sampling resources in a preset time period in the future is predicted based on the data map, and the method comprises the following steps:
acquiring predicted biological sampling resources within at least two second historical time periods;
determining a second biological sample resource in a second biological sample record over each of the second historical time periods based on the data map;
comparing the predicted biosampling resource with a second biosampling resource to determine a deviation dataset;
determining a bias threshold based on the bias dataset;
and predicting a scheduling strategy of biological sampling resources in a preset future time period based on the deviation threshold value and the first biological sampling record.
According to the method for determining the scheduling policy of the sampling resource provided by the invention, the step of inputting the first biological sampling record into the data analysis scheduling model to determine the entity relationship triplet comprises the following steps:
inputting the first biological sampling record into an entity identification model in a data analysis scheduling model, and outputting an entity data set corresponding to the first biological sampling record;
inputting the entity data set into an entity classification model in the data analysis scheduling model, and determining a target entity data set corresponding to a target category and a target first biological sampling record corresponding to the target entity data set;
and inputting the target entity data set and the target first biological sampling record corresponding to the target entity data set into a relation extraction model in the data analysis scheduling model, and determining an entity relation triplet corresponding to the target first biological sampling record.
According to the method for determining the scheduling policy of the sampling resource provided by the invention, the determining the entity relationship triplet corresponding to the target first biological sampling record comprises the following steps:
determining at least one pair of target entity pairs without replacement based on the target entity dataset;
determining a target relationship between each pair of target entity pairs based on the target first biological sample record;
and determining each entity relation triplet corresponding to the target first biological sampling record based on each target entity pair and the target relation corresponding to the target entity pair.
According to the method for determining the scheduling policy of the sampling resource provided by the invention, before the non-replacement determination of at least one pair of target entity pairs, the method further comprises the following steps:
and performing deduplication operation on the target entities in the target entity data set.
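The deduplication and pairing steps above can be illustrated with a minimal Python sketch. It assumes one reading of "without replacement", namely that each unordered pair of distinct entities is produced exactly once; the function name `candidate_pairs` and the sample entities are hypothetical and are not taken from the patent.

```python
from itertools import combinations

def candidate_pairs(target_entities):
    """Deduplicate target entities, then enumerate each unordered pair once.

    Assumption: "without replacement" is read as "no entity is paired with
    itself and no pair is produced twice".
    """
    unique = list(dict.fromkeys(target_entities))  # order-preserving dedup
    return list(combinations(unique, 2))

# Hypothetical defined-domain entities from one sampling record.
pairs = candidate_pairs(["Test tube A", "Zhang San", "Test tube A", "Throat swab"])
print(pairs)
```

Deduplicating first keeps the number of candidate pairs at n(n-1)/2 for n distinct entities, which is the reduction in relation-extraction work that the deduplication step is aimed at.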
According to the scheduling policy determining method for sampling resources provided by the invention, the method further comprises the following steps:
determining an influence factor corresponding to the first deviation based on the data map;
and visually displaying the influence factors.
The invention also provides a scheduling policy determining device for sampling resources, comprising:
the acquisition module is used for acquiring a first biological sampling record in a first historical time period, wherein the first biological sampling record comprises related information of a first biological sampling resource and a sampling object in the first historical time period;
the determining module is used for inputting the first biological sampling record into a data analysis scheduling model and determining an entity relation triplet, wherein the data analysis scheduling model is used for performing entity identification and entity classification on the first biological sampling record and relation extraction on the entities of a target class, and the entities represent the related information of the biological sampling resource and the sampling object;
the construction module is used for constructing a data map based on the entity relation triplet and the first biological sampling record;
and the prediction module is used for predicting a scheduling strategy of the biological sampling resources in a preset time period in the future based on the data map.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the scheduling policy determining method of the sampling resource when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a scheduling policy determination method for sampling resources as described in any of the above.
According to the method, device, equipment and storage medium for determining a scheduling strategy of sampling resources provided by the invention, the data analysis scheduling model performs entity identification and entity classification on the first biological sampling record in the first historical time period, and the relations among entities are determined by relation extraction on the classified target-class entities. A data map is constructed from the entity relation triplets and the first biological sampling record, and the scheduling strategy of the biological sampling resources in a preset future time period is predicted from the data map, so that the biological sampling resources can be scheduled based on that strategy. This improves the accuracy and reliability of prediction and effectively reduces overflow waste of biological sampling resources; because the scheduling strategy is predicted from existing data, it also saves manpower and makes resource scheduling more intelligent.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for determining a scheduling policy of a sampling resource provided by the invention;
FIG. 2 is an exemplary line graph corresponding to a deviation dataset provided by the present invention;
FIG. 3 is an exemplary schematic diagram of a data map provided by the present invention;
FIG. 4 is a schematic structural diagram of a scheduling policy determining apparatus for sampling resources provided by the present invention;
FIG. 5 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Biological self-sampling resources are generally scheduled entirely by hand: test tubes are provided according to the number of people in the community, and self-testing is carried out uniformly on a one-person-one-tube basis. However, the number of people who self-sample in a community fluctuates, and the detection reagent has a limited validity period and is affected by factors such as air, temperature, light and impurities, so this scheduling mode wastes both sampling resources (through overflow) and human resources. In view of the above problems, an embodiment of the present invention provides a method for determining a scheduling policy of a sampling resource. FIG. 1 is a schematic flow chart of the method; as shown in FIG. 1, the method includes:
Step 110, a first biological sampling record in a first historical time period is obtained, wherein the first biological sampling record comprises related information of a first biological sampling resource and a sampling object in the first historical time period.
Optionally, the first biological sampling record may be structured or unstructured data stored in a database. Taking throat swab self-sampling as an example of biological sampling, the first biological sampling resource is the number of test tubes or the test tube identification, and the related information of the sampling object may be the names of the sampled persons, the number of persons, the region, the community, the sampling type and the like.
For example, take biological sampling to be throat swab self-sampling, structured data to be a table, and unstructured data to be text. When the data stored in the database is structured, the first biological sampling record may be obtained by summarizing the discrete biological sampling records produced by individual throat swab self-samplings. The structured discrete biological sampling records are shown in Table 1; each record includes, but is not limited to, the region, community, sampling type, number of test tubes, test tube identification, number of people and name. These discrete records are then summarized by region, community, sampling type and the like to obtain the structured first biological sampling record shown in Table 2.
Table 1 Structured discrete biological sampling records
| Region | Community | Sampling type | Number of test tubes | Test tube identification | Number of people | Name |
| --- | --- | --- | --- | --- | --- | --- |
| Area A | Community A | Throat swab | 1 | Test tube A | 1 | Zhang San |
| Area B | Community B | Throat swab | 1 | Test tube C | 1 | Zheng Shi |
| Area A | Community A | Throat swab | 1 | Test tube B | 1 | Zhou Jiu |
| Area A | Community A | Throat swab | 1 | Test tube D | 1 | Li Si |
| Area B | Community B | Throat swab | 1 | Test tube I | 1 | Feng Er |
| Area A | Community A | Throat swab | 1 | Test tube E | 1 | Zhao Liu |
| Area A | Community A | Throat swab | 1 | Test tube F | 1 | Radix seu herba Desmodii Styracifolii |
| Area A | Community A | Throat swab | 1 | Test tube G | 1 | Wang Wu |
| Area A | Community A | Throat swab | 1 | Test tube H | 1 | Sun Ba |
Table 2 Structured first biological sampling record
[Table 2 is provided as an image in the original publication (Figure BDA0004081265170000071).]
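The summarization of discrete records into a structured first biological sampling record can be sketched as follows. This is a minimal illustration, not the patent's implementation: the rows are a subset of Table 1, while the tuple layout, the output field names and the function `summarize` are assumptions introduced here.

```python
from collections import defaultdict

# (region, community, sampling type, tube id, name).
# Rows are a subset of Table 1; the tuple layout is an illustrative assumption.
DISCRETE_RECORDS = [
    ("Area A", "Community A", "Throat swab", "Test tube A", "Zhang San"),
    ("Area B", "Community B", "Throat swab", "Test tube C", "Zheng Shi"),
    ("Area A", "Community A", "Throat swab", "Test tube B", "Zhou Jiu"),
    ("Area A", "Community A", "Throat swab", "Test tube D", "Li Si"),
    ("Area B", "Community B", "Throat swab", "Test tube I", "Feng Er"),
]

def summarize(records):
    """Group discrete records by (region, community, sampling type)."""
    groups = defaultdict(lambda: {"tube_ids": [], "names": []})
    for region, community, sampling_type, tube_id, name in records:
        group = groups[(region, community, sampling_type)]
        group["tube_ids"].append(tube_id)
        group["names"].append(name)
    return {
        key: {
            "tube_count": len(group["tube_ids"]),
            "tube_ids": group["tube_ids"],
            "people_count": len(group["names"]),
        }
        for key, group in groups.items()
    }

summary = summarize(DISCRETE_RECORDS)
area_a = summary[("Area A", "Community A", "Throat swab")]
print(area_a["tube_count"], area_a["tube_ids"])
```

Each summarized entry carries the test tube count, the tube identifications and the number of people for one region, community and sampling type, which is the kind of information the text attributes to the first biological sampling record.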
When the data stored in the database is text, the first biological sampling record may read: Area A uses throat swab sampling test tubes; the number of test tubes actually sampled is 7; the test tubes used are test tube A, test tube B, test tube D, test tube G, test tube E, test tube F and test tube H; and each test tube holds the throat swab sample of one person. However, the number of throat swab test tubes distributed to Area A is far greater than 7, which, given the short validity period of the detection reagent, wastes a large amount of sampling resources.
Optionally, the first historical time period may be a historical time period at least one day before the future preset time period to be predicted. Taking the future preset time period as tomorrow, for example, the first historical time period may be today, yesterday, the day before yesterday, or the previous N days, where N is an integer greater than 0.
Step 120, inputting the first biological sampling record into a data analysis scheduling model and determining an entity relation triplet, wherein the data analysis scheduling model is used for performing entity identification and entity classification on the first biological sampling record and relation extraction on the entities of a target class, and the entities represent the related information of the biological sampling resource and the sampling object.
Specifically, after the first biological sampling record is determined, entity identification is performed on it through the data analysis scheduling model, yielding entities such as region, community, sampling type, number of test tubes, test tube identification, number of people and name. Because there are many entity types, extracting relations over all identified entities would involve a large amount of data and increase the time needed to construct the data map. Therefore, in the embodiment of the present invention, the identified entities are classified into open domain entities and defined domain entities. An open domain entity is a broad entity, such as a region or community, that can be further divided; for example, a region can be further divided into communities, house numbers and the like. A defined domain entity is an entity whose information is unique, such as sampling type, number of test tubes, test tube identification, number of people or name. The defined domain entities are taken as the entities of the target class for relation extraction, giving a plurality of entity relation triplets; reducing the number of entity pairs in this way improves the efficiency of constructing the data map. Each entity relation triplet comprises two defined domain entities and the relation between them. For example, if the two defined domain entities are a test tube identification and a person's name, and the extracted relation between them is a bearing relation, the corresponding entity relation triplet may be: <test tube identification, bearing relation, person name>.
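The classification into open domain and defined domain entities, and the resulting <test tube identification, bearing relation, person name> triplet, can be sketched as below. This is a rule-based stand-in under stated assumptions: the patent uses trained models for classification and relation extraction, whereas here the type sets, the `Triple` class and the positional pairing in `bearing_triples` are hypothetical simplifications.

```python
from typing import List, NamedTuple, Tuple

# Entity types per the description; the identifiers are illustrative names.
OPEN_DOMAIN_TYPES = {"region", "community"}
DEFINED_DOMAIN_TYPES = {"sampling_type", "tube_count", "tube_id", "people_count", "name"}

Entity = Tuple[str, str]  # (entity type, surface value)

class Triple(NamedTuple):
    head: str
    relation: str
    tail: str

def classify(entities: List[Entity]):
    """Split identified entities into open-domain and defined-domain lists."""
    open_domain = [e for e in entities if e[0] in OPEN_DOMAIN_TYPES]
    defined_domain = [e for e in entities if e[0] in DEFINED_DOMAIN_TYPES]
    return open_domain, defined_domain

def bearing_triples(defined_domain: List[Entity]) -> List[Triple]:
    """Assumption: pair tubes and names positionally instead of using a model."""
    tubes = [value for kind, value in defined_domain if kind == "tube_id"]
    names = [value for kind, value in defined_domain if kind == "name"]
    return [Triple(tube, "bearing", name) for tube, name in zip(tubes, names)]

entities = [
    ("region", "Area A"),
    ("community", "Community A"),
    ("tube_id", "Test tube A"),
    ("name", "Zhang San"),
]
open_domain, defined_domain = classify(entities)
triples = bearing_triples(defined_domain)
print(triples)
```

Only the defined domain entities reach the pairing step, which mirrors the stated motivation of keeping the number of entity pairs small.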
Step 130, constructing a data map based on the entity relation triplet and the first biological sampling record.
Specifically, after a plurality of entity relation triplets are determined, the entities in the triplets are used as nodes and the relations between entity pairs as edges between the nodes to construct a local data map; the open domain entities in the first biological sampling record are then added to the local data map to build the complete data map.
Optionally, the local data map and the complete data map may each have a mesh structure or a linear structure.
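A minimal sketch of this two-stage construction, using a plain adjacency dictionary, is given below. The description does not say how open domain entities are attached to the local map, so the `contains` edges, the function name `build_data_map` and the sample values are assumptions made purely for illustration.

```python
from collections import defaultdict

def build_data_map(triples, open_domain_links):
    """Build an adjacency-list graph: node -> list of (relation, neighbor).

    triples: (head, relation, tail) tuples forming the local data map.
    open_domain_links: (open_entity, relation, node) tuples; how open-domain
    entities attach to the local map is an assumption of this sketch.
    """
    graph = defaultdict(list)
    for head, relation, tail in list(triples) + list(open_domain_links):
        graph[head].append((relation, tail))
        graph.setdefault(tail, [])  # make sure leaf nodes are present too
    return dict(graph)

local_triples = [("Test tube A", "bearing", "Zhang San")]
open_links = [
    ("Area A", "contains", "Community A"),       # hypothetical relation name
    ("Community A", "contains", "Test tube A"),  # hypothetical relation name
]
data_map = build_data_map(local_triples, open_links)
print(sorted(data_map))
```

Storing edges as (relation, neighbor) pairs keeps the triplet structure recoverable from the graph, so later reasoning steps can walk from a region down to individual tubes and persons.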
Step 140, predicting a scheduling strategy of the biological sampling resources in a preset time period in the future based on the data map.
Specifically, after the data map is determined, knowledge reasoning can be performed on it and the scheduling strategy adjusted on the basis of the first biological sampling record, thereby predicting the scheduling strategy of biological sampling resources in a preset future time period. Scheduling according to the predicted strategy, with prediction accuracy and reliability ensured, effectively reduces overflow waste, improves the utilization of biological sampling resources, saves manpower and makes resource scheduling more intelligent.
Optionally, the predicting, based on the data map, a scheduling policy of the biological sampling resource in a preset future period of time includes:
determining first biological sampling records in different first historical time periods corresponding to non-target categories based on the data map;
comparing the first biological sampling records in different first historical time periods to determine first deviation between the first biological sampling records in different historical time periods;
and predicting a scheduling strategy of the biological sampling resources in a future preset time period based on the first deviation and the first biological sampling record.
Specifically, after the data map is determined, knowledge reasoning is performed on the tuples in the data map to determine the data in the first biological sampling records for different first historical time periods under the non-target categories. The data for the different first historical time periods are compared to determine the first deviation between them, the first biological sampling record is adjusted based on the first deviation, and the scheduling strategy of biological sampling resources in the future preset time period is thereby predicted.
For example, if the first historical time period is today and the scheduling policy for tomorrow needs to be predicted, today's first biological sampling record is compared with yesterday's to determine a first deviation over an interval of one day, and the scheduling policy of biological sampling resources for tomorrow is then predicted from this first deviation and today's first biological sampling record.
If the first historical time period is today and the scheduling strategy for the day after tomorrow is to be predicted, the biological sampling resources in today's first biological sampling record can be compared with those in the record of the day before yesterday to determine a first deviation over an interval of two days, and the scheduling strategy for the day after tomorrow is then predicted from this first deviation and today's first biological sampling record. Alternatively, the biological sampling resources in today's record can be compared with those in yesterday's record, and those in yesterday's record with those in the record of the day before yesterday; after these two initial deviations are determined, their average is taken as the first deviation. It is also possible to compare any two of the biological sampling resources in the records of today, yesterday and the day before yesterday to determine a plurality of initial deviations, and to take the average of these initial deviations as the first deviation.
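The deviation variants described above can be sketched numerically. The tube counts below are invented for illustration, and adding the first deviation to today's count is an assumption about how "predicted from the first deviation and today's record" is carried out; the description does not give a formula.

```python
from statistics import mean

# Hypothetical tube counts, oldest first:
# day before yesterday, yesterday, today.
counts = [5, 6, 7]
day_before_yesterday, yesterday, today = counts

# One-day interval: today vs. yesterday, used to predict tomorrow.
one_day_deviation = today - yesterday
tomorrow_estimate = today + one_day_deviation  # assumed additive adjustment

# Two-day interval: today vs. the day before yesterday,
# used to predict the day after tomorrow.
two_day_deviation = today - day_before_yesterday
day_after_tomorrow_estimate = today + two_day_deviation  # assumed additive

# Averaged variant: mean of the consecutive-day initial deviations.
initial_deviations = [later - earlier for earlier, later in zip(counts, counts[1:])]
averaged_first_deviation = mean(initial_deviations)

print(tomorrow_estimate, day_after_tomorrow_estimate, averaged_first_deviation)
```

Averaging several initial deviations smooths out a single unusual day, at the cost of reacting more slowly to a genuine change in demand.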
Optionally, the scheduling policy for predicting the biological sampling resource in the future preset time period based on the data map includes:
acquiring predicted biological sampling resources within at least two second historical time periods;
determining a second biological sampling resource in a second biological sampling record over each of the second historical time periods based on the data map;
comparing the predicted biosampling resource with a second biosampling resource to determine a deviation dataset;
determining a bias threshold based on the bias dataset;
and predicting a scheduling strategy of biological sampling resources in a preset future time period based on the deviation threshold value and the first biological sampling record.
Specifically, in order to improve the prediction accuracy of the scheduling policy and reduce the error between the predicted scheduling policy and the actual sampling, in the embodiment of the present invention the predicted biological sampling resources in the scheduling policies for at least two second historical time periods may be obtained, and the second biological sampling resources in the second biological sampling records for those periods are obtained by performing knowledge reasoning on the data map. The predicted biological sampling resources are compared with the second biological sampling resources for the same time period to obtain the prediction deviations for the different time periods; a deviation data set is constructed from these prediction deviations, and a deviation threshold is determined from the deviation data set. When the scheduling strategy is predicted, an initial scheduling strategy for the future preset time period is first determined from the first biological sampling record and then adjusted by the deviation threshold, giving a more accurate prediction.
Optionally, when the deviation threshold is determined based on the deviation data set, the average of the prediction deviations may be taken as the deviation threshold. FIG. 2 is a line graph corresponding to the deviation data set provided by the present invention. As shown in FIG. 2, across the second historical time periods plotted, the maximum fluctuation of the prediction deviation is 2, which is less than the error threshold, so the prediction deviation is relatively stable; its average is determined to be 8, a positive number, meaning that the predicted biological sampling resource exceeds the second biological sampling resource by 8 on average. The deviation threshold is therefore set to 8, and after the initial scheduling strategy is predicted, 8 is subtracted from it to obtain the final predicted scheduling strategy of the biological sampling resources. If the fluctuation between adjacent prediction deviations is larger than the error threshold, that prediction deviation can be removed when the deviation threshold is determined, and the deviation threshold is determined only from the average of the remaining prediction deviations.
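The threshold calculation can be sketched as follows. The averaging, the fluctuation of 2, the resulting value of 8 and the subtraction follow the FIG. 2 example; the individual predicted and actual counts, the error threshold of 3, the initial prediction of 110 and the rule of comparing each deviation with the last retained one are assumptions chosen only to make the example runnable.

```python
from statistics import mean

def deviation_threshold(predicted, actual, error_threshold):
    """Average the prediction deviations, skipping unstable ones.

    Assumption: a deviation is dropped when it differs from the last
    retained deviation by more than error_threshold.
    """
    deviations = [p - a for p, a in zip(predicted, actual)]
    kept = [deviations[0]]
    for current in deviations[1:]:
        if abs(current - kept[-1]) <= error_threshold:
            kept.append(current)
    return mean(kept)

# Hypothetical predicted vs. actually used tube counts for four past periods.
predicted = [108, 107, 109, 108]
actual = [100, 100, 100, 100]

threshold = deviation_threshold(predicted, actual, error_threshold=3)
initial_prediction = 110  # hypothetical initial scheduling quantity
final_prediction = initial_prediction - threshold
print(threshold, final_prediction)
```

With these numbers the deviations are 8, 7, 9 and 8, so the largest adjacent fluctuation is 2 and none is dropped; a single spike such as 20 would be excluded from the average.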
In addition, the trend of the prediction deviation over the second historical time periods can be modeled to predict the prediction deviation for the future preset time period directly; after the initial scheduling strategy is predicted, the final predicted scheduling strategy of the biological sampling resources is then determined using this predicted deviation.
The unit of the second historical time period may be days, and the second historical time period may be the same as or different from the first historical time period.
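One simple way to model such a trend is an ordinary least-squares line through the past prediction deviations, extrapolated one period ahead. The description does not name a fitting technique, so the linear fit, the function `extrapolate_deviation` and the sample values below are assumptions for illustration only.

```python
def extrapolate_deviation(deviations):
    """Fit y = intercept + slope * x by least squares and predict the next x.

    Assumption: periods are equally spaced and indexed 0, 1, 2, ...
    Requires at least two deviations.
    """
    n = len(deviations)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(deviations) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, deviations))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept + slope * n

# Hypothetical prediction deviations for four past periods, oldest first.
past_deviations = [6, 7, 8, 9]
next_deviation = extrapolate_deviation(past_deviations)

initial_prediction = 110  # hypothetical initial scheduling quantity
final_prediction = initial_prediction - next_deviation
print(next_deviation, final_prediction)
```

Compared with a fixed average, extrapolating the trend lets the correction follow a deviation that is steadily growing or shrinking.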
Optionally, the inputting the first biological sampling record into a data analysis scheduling model, determining an entity relationship triplet includes:
inputting the first biological sampling record into an entity identification model in a data analysis scheduling model, and outputting an entity data set corresponding to the first biological sampling record;
inputting the entity data set into an entity classification model in the data analysis scheduling model, and determining a target entity data set corresponding to a target category and a target first biological sampling record corresponding to the target entity data set;
and inputting the target entity data set and the target first biological sampling record corresponding to the target entity data set into a relation extraction model in the data analysis scheduling model, and determining an entity relation triplet corresponding to the target first biological sampling record.
Specifically, after the first biological sampling record is determined, named entity recognition is applied: the entity recognition model converts the words in the first biological sampling record into word vectors and obtains the entities in the record by dynamically concatenating context word vectors. Further, because relation extraction is binary (it operates on pairs of entities), the embodiment of the invention reduces the number of entity pairs and improves extraction efficiency by classifying the recognized entities into open domain entities and limited domain entities. After the limited domain entities are determined, the target first biological sampling record corresponding to them is determined from the first biological sampling record, and relation extraction is performed on the limited domain entities and that target record to obtain the entity relationship triplets corresponding to the limited domain entities.
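The three stages described above (recognition, classification, relation extraction) can be sketched as a small pipeline. The sketch is structural only: the three callables stand in for the trained models, and the toy rules, the entity "community X", and all function names are hypothetical.

```python
from typing import Callable, Optional

Triple = tuple[str, str, str]


def extract_triples(
    record: str,
    recognize: Callable[[str], list[str]],
    is_limited_domain: Callable[[str], bool],
    extract_relation: Callable[[str, str, str], Optional[str]],
) -> list[Triple]:
    """Run recognition, keep limited domain entities, then extract relations."""
    entities = recognize(record)
    targets = [e for e in entities if is_limited_domain(e)]
    triples = []
    for i, head in enumerate(targets):
        for tail in targets[i + 1:]:
            relation = extract_relation(record, head, tail)
            if relation is not None:
                triples.append((head, relation, tail))
    return triples


# Toy stand-ins for the trained models (assumptions, not the patent's models).
LIMITED = {"test tube A", "Zhang San", "throat swab"}
RELATIONS = {
    ("test tube A", "Zhang San"): "bearing relationship",
    ("test tube A", "throat swab"): "test tube sampling type",
}


def toy_recognize(record: str) -> list[str]:
    return [part.strip() for part in record.split(",")]


def toy_relation(record: str, head: str, tail: str) -> Optional[str]:
    return RELATIONS.get((head, tail))


triples = extract_triples(
    "test tube A, Zhang San, throat swab, community X",
    toy_recognize,
    LIMITED.__contains__,
    toy_relation,
)
```

Filtering to limited domain entities before pairing is what keeps the number of candidate pairs small.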
Optionally, in the data analysis scheduling model, the entity identification model may be a BERT-BiLSTM-CRF model, where:
the BERT (Bidirectional Encoder Representations from Transformers) model, after the word vectors of the first biological sampling record are tagged in the BIO labeling scheme, uses the position information of the word vectors to capture deep language features of the context. BiLSTM (Bi-directional Long Short-Term Memory) is a bidirectional long short-term memory network composed of a forward LSTM network and a backward LSTM network; by encoding the target first biological sampling record both from front to back and from back to front, it makes full use of the context and captures long-distance context information better. CRF (Conditional Random Field) is a conditional random field connected after the BiLSTM model; it predicts and outputs the optimal label sequence, from which the entities corresponding to the first biological sampling record are obtained.
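To make the BIO labeling scheme concrete, the following sketch turns a predicted tag sequence back into entity spans, which is the step that follows the CRF output. It does not implement BERT, BiLSTM, or the CRF; the tag names `TUBE` and `TYPE`, the tokens, and the function name are assumptions.

```python
def decode_bio(tokens, tags):
    """Collect (entity text, label) pairs from a BIO-tagged token sequence."""
    entities = []
    current, label = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; flush the one in progress, if any.
            if current:
                entities.append((" ".join(current), label))
            current, label = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            current.append(token)
        else:
            # "O" or an inconsistent "I-" tag ends the current entity.
            if current:
                entities.append((" ".join(current), label))
            current, label = [], None
    if current:
        entities.append((" ".join(current), label))
    return entities


tokens = ["test", "tube", "A", "holds", "a", "throat", "swab"]
tags = ["B-TUBE", "I-TUBE", "I-TUBE", "O", "O", "B-TYPE", "I-TYPE"]
print(decode_bio(tokens, tags))
# [('test tube A', 'TUBE'), ('throat swab', 'TYPE')]
```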
In addition, in the embodiment of the invention the entity categories are open domain entities and limited domain entities, so the entity classification model may use a softmax classifier as the last-layer activation function of the entity recognition model to perform binary classification on the recognized entities.
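A minimal sketch of the two-class softmax step follows, assuming the preceding layers have already produced two raw scores (logits) per entity. The category names, their order, and the sample logits are hypothetical.

```python
import math

CATEGORIES = ("open_domain", "limited_domain")


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return the most probable category and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CATEGORIES[best], probs[best]


category, probability = classify([0.2, 1.7])
```

Subtracting the maximum before exponentiating does not change the result but avoids overflow for large scores.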
Alternatively, where the first biological sampling record is structured data, the values of the key-value pairs in the structured data may be mapped directly to entities in the first biological sampling record.
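For structured records, the mapping could look like the sketch below, which flattens each key-value pair into a (field, entity) tuple. The field names and the record layout are invented for illustration; the source does not specify a schema.

```python
def entities_from_structured(record):
    """Map the values of a structured record to (field, entity) pairs."""
    entities = []
    for key, value in record.items():
        # A field may hold one value or a list of values.
        values = value if isinstance(value, (list, tuple)) else [value]
        entities.extend((key, str(v)) for v in values)
    return entities


# Hypothetical structured sampling record.
record = {
    "tube": "test tube A",
    "sampled_person": ["Zhang San"],
    "personnel_count": 7,
}
entities = entities_from_structured(record)
```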
Optionally, the determining the entity relationship triplet corresponding to the target first biological sampling record includes:
determining at least one pair of target entity pairs without replacement based on the target entity dataset;
determining a target relationship between each pair of target entity pairs based on the target first biological sample record;
and determining each entity relation triplet corresponding to the target first biological sampling record based on each target entity pair and the target relation corresponding to the target entity pair.
Specifically, after the entities are classified, target entity pairs are drawn without replacement from the target entity data set obtained by classification. This ensures that the two entities in any one entity relationship triplet are different and reduces the number of subsequent corrections to the entity relationship triplets. After a target entity pair is determined, the character strings corresponding to the pair in the target first biological sampling record are spliced together, relation extraction is performed on the spliced string, and the extracted relation is combined with the corresponding target entity pair to obtain the entity relationship triplet.
Optionally, a relationship extraction model may be constructed based on an attention mechanism and a convolutional neural network (CNN) to extract the relationship between the two entities in the spliced string.
Optionally, before the determining of the at least one target entity pair without replacement, the method further comprises:
and performing deduplication operation on the target entities in the target entity data set.
Specifically, to reduce the number of incorrect target entity pairs, for example pairs whose two entities are identical, the embodiment of the present invention performs a deduplication operation on the target entities when the target entity data set is constructed. Target entity pairs drawn without replacement from the deduplicated set are then guaranteed to consist of two different entities.
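Deduplication followed by drawing pairs without replacement can be expressed directly with the standard library. This is a sketch with a hypothetical function name and sample entities; `itertools.combinations` never pairs an element with itself, so after deduplication every pair holds two different entities.

```python
from itertools import combinations


def candidate_pairs(entities):
    """Deduplicate entities (keeping first-seen order) and pair them up."""
    unique = list(dict.fromkeys(entities))
    return list(combinations(unique, 2))


pairs = candidate_pairs(["test tube A", "Zhang San", "test tube A", "throat swab"])
print(pairs)
# [('test tube A', 'Zhang San'), ('test tube A', 'throat swab'), ('Zhang San', 'throat swab')]
```

Without the deduplication step, the repeated "test tube A" would produce a pair of two identical entities.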
Illustratively, FIG. 3 is an exemplary schematic diagram of a data map provided by the present invention. Taking as an example the case where the limited domain entities determined in the first biological sampling record include test tube A, Zhang San, personnel count 7, test tube count 7, and throat swab, examples of the limited domain entity pairs and relationships determined by the relationship extraction model are shown in Table 3. Based on the limited domain entity pairs and relationships shown in Table 3, entity relationship triplets may be further determined. For example, when the limited domain entity pair is test tube A and throat swab, the extracted relationship is the test tube sampling type, and the determined entity relationship triplet is <test tube A, test tube sampling type, throat swab>. As another example, when the limited domain entity pair is test tube A and Zhang San, the extracted relationship is a bearing relationship, that is, test tube A holds Zhang San's self-sampled throat swab sample, and the determined entity relationship triplet is <test tube A, bearing relationship, Zhang San>.
In addition, a local data map is constructed based on the limited domain entity pairs and relationships, and after the entities of the non-target category are added to the local data map based on the first biological sampling record, the complete data map shown in FIG. 3 is obtained, in which solid-line boxes represent limited domain entities and dashed-line boxes represent open domain entities.
It should be noted that the number of test tubes should equal the number of people. If the two are not equal, the first biological sampling record is abnormal, and the possible causes are an abnormal summarization process or an abnormal sampling record.
Table 3 Examples of limited domain entity pairs and relationships
Limited domain entity pair | Relationship
Test tube A, personnel count 7 | Containment relationship
Test tube A, Zhang San | Bearing relationship (test tube A holds Zhang San's self-sampled throat swab sample)
Test tube A, throat swab | Test tube sampling type
Personnel count 7, test tube count 7 | Comparison relationship: equal to
Zhang San, personnel count 7 | Containment relationship
Test tube A, test tube count 7 | Containment relationship
Zhang San, test tube count 7 | Containment relationship
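As an illustration of how the triplets derived from Table 3 could be assembled into a local data map, the sketch below stores them as an adjacency dictionary and answers a simple lookup. The storage layout and the function names are assumptions; the source does not specify how the data map is stored.

```python
from collections import defaultdict


def build_data_map(triples):
    """Store (head, relation, tail) triplets as head -> [(relation, tail)]."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
        graph.setdefault(tail, [])  # make sure tail-only entities are nodes too
    return dict(graph)


def related(graph, head, relation):
    """Return every tail linked to head by the given relation."""
    return [tail for rel, tail in graph.get(head, []) if rel == relation]


# Three of the Table 3 rows written as triplets.
table3_triples = [
    ("test tube A", "containment relationship", "personnel count 7"),
    ("test tube A", "bearing relationship", "Zhang San"),
    ("test tube A", "test tube sampling type", "throat swab"),
]
data_map = build_data_map(table3_triples)
print(related(data_map, "test tube A", "bearing relationship"))  # ['Zhang San']
```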
Optionally, the method further comprises:
determining an influence factor corresponding to the first deviation based on the data map;
and visually displaying the influence factors.
Specifically, after the first deviation is determined, the influence factors causing the first deviation can be further determined based on the data map and displayed visually, so that a worker who sees the influence factors can notify the corresponding region or community by telephone, mail, short message, or the like, and the scheduling strategy can be adjusted accordingly.
According to the method for determining the scheduling policy of sampling resources provided by the invention, the data analysis scheduling model performs entity identification and entity classification on the first biological sampling record in the first historical time period and performs relationship extraction on the entities of the target category to determine the relationships among them. A data map is constructed from the entity relationship triplets and the first biological sampling record, and the scheduling policy of the biological sampling resources in a future preset time period is predicted from the data map, so that the biological sampling resources are scheduled based on that policy. This improves the accuracy and reliability of prediction, effectively reduces surplus waste of biological sampling resources, makes full use of data resources to predict the scheduling policy, saves human resources, and raises the degree of intelligence of resource scheduling.
The scheduling policy determining device for sampling resources provided by the invention is described below; the device described below and the scheduling policy determining method for sampling resources described above may be cross-referenced.
The embodiment of the present invention further provides a device for determining a scheduling policy of a sampling resource, and fig. 4 is a schematic structural diagram of the device for determining a scheduling policy of a sampling resource provided by the present invention, as shown in fig. 4, where the device 400 for determining a scheduling policy of a sampling resource includes: an acquisition module 401, a determination module 402, a construction module 403, and a prediction module 404, wherein:
an acquisition module 401, configured to obtain a first biological sampling record in a first historical period, where the first biological sampling record includes related information of a first biological sampling resource and a sampling object in the first historical period;
a determining module 402, configured to input the first biological sampling record into a data analysis scheduling model and determine an entity relationship triplet, where the data analysis scheduling model is configured to perform entity identification and entity classification on the first biological sampling record and relationship extraction on the entities of a target category, and the entities characterize the biological sampling resource and the related information of the sampling object;
a construction module 403, configured to construct a data map based on the entity relationship triplet and the first biological sample record;
a prediction module 404, configured to predict a scheduling policy of the biological sampling resource in a preset time period in the future based on the data map.
According to the scheduling policy determining device for sampling resources provided by the invention, the data analysis scheduling model performs entity identification and entity classification on the first biological sampling record in the first historical time period and performs relationship extraction on the entities of the target category to determine the relationships among them. A data map is constructed from the entity relationship triplets and the first biological sampling record, and the scheduling policy of the biological sampling resources in a future preset time period is predicted from the data map, so that the biological sampling resources are scheduled based on that policy. This improves the accuracy and reliability of prediction, effectively reduces surplus waste of biological sampling resources, makes full use of data resources to predict the scheduling policy, saves human resources, and raises the degree of intelligence of resource scheduling.
Optionally, the prediction module 404 is specifically configured to:
determining first biological sampling records in different first historical time periods corresponding to non-target categories based on the data map;
comparing the first biological sampling records in different first historical time periods to determine first deviation between the first biological sampling records in different historical time periods;
and predicting a scheduling strategy of the biological sampling resources in a future preset time period based on the first deviation and the first biological sampling record.
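The comparison step can be pictured as a per-key difference between the records of two first historical time periods. The sketch below is a guess at one simple form: the grouping by community, the resource counts, and the carry-forward projection are all assumptions, and the source does not state how the first deviation is turned into a prediction.

```python
def first_deviation(earlier, later):
    """Per-key change in resource use between two historical periods."""
    keys = sorted(set(earlier) | set(later))
    return {k: later.get(k, 0) - earlier.get(k, 0) for k in keys}


def project_next(later, deviation):
    """Naive projection (assumption): carry the latest change forward once."""
    return {k: later.get(k, 0) + change for k, change in deviation.items()}


# Hypothetical test tube usage per community in two periods.
period_1 = {"community A": 120, "community B": 80}
period_2 = {"community A": 135, "community B": 78}

deviation = first_deviation(period_1, period_2)
forecast = project_next(period_2, deviation)
```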
Optionally, the prediction module 404 is specifically configured to:
acquiring predicted biological sampling resources within at least two second historical time periods;
determining a second biological sample resource in a second biological sample record over each of the second historical time periods based on the data map;
comparing the predicted biosampling resource with a second biosampling resource to determine a deviation dataset;
determining a bias threshold based on the bias dataset;
and predicting a scheduling strategy of biological sampling resources in a preset future time period based on the deviation threshold value and the first biological sampling record.
Optionally, the determining module 402 is specifically configured to:
inputting the first biological sampling record into an entity identification model in a data analysis scheduling model, and outputting an entity data set corresponding to the first biological sampling record;
inputting the entity data set into an entity classification model in the data analysis scheduling model, and determining a target entity data set corresponding to a target category and a target first biological sampling record corresponding to the target entity data set;
and inputting the target entity data set and the target first biological sampling record corresponding to the target entity data set into a relation extraction model in the data analysis scheduling model, and determining an entity relation triplet corresponding to the target first biological sampling record.
Optionally, the determining module 402 is specifically configured to:
determining at least one pair of target entity pairs without replacement based on the target entity dataset;
determining a target relationship between each pair of target entity pairs based on the target first biological sample record;
and determining each entity relation triplet corresponding to the target first biological sampling record based on each target entity pair and the target relation corresponding to the target entity pair.
Optionally, the determining module 402 is specifically configured to:
before determining the at least one target entity pair without replacement, perform a deduplication operation on the target entities in the target entity data set.
Optionally, the scheduling policy determining apparatus for sampling resources further includes a visualization module, which is specifically configured to:
determining an influence factor corresponding to the first deviation based on the data map;
and visually displaying the influence factors.
FIG. 5 is a schematic structural diagram of an electronic device provided by the present invention. As shown in FIG. 5, the electronic device may include: a processor 510, a communication interface (Communications Interface) 520, a memory 530, and a communication bus 540, wherein the processor 510, the communication interface 520, and the memory 530 communicate with each other through the communication bus 540. The processor 510 may invoke logic instructions in the memory 530 to perform a scheduling policy determination method for sampling resources, the method comprising:
acquiring a first biological sampling record in a first historical time period, wherein the first biological sampling record comprises related information of a first biological sampling resource and a sampling object in the first historical time period;
inputting the first biological sampling record into a data analysis scheduling model, and determining an entity relation triplet, wherein the data analysis scheduling model is used for carrying out entity identification, entity classification and relation extraction on the first biological sampling record and the entity of a target class, and the entity is used for representing the biological sampling resource and the related information of the sampling object;
constructing a data map based on the entity relationship triplet and the first biological sampling record;
and predicting a scheduling strategy of the biological sampling resources in a preset time period in the future based on the data map.
Further, the logic instructions in the memory 530 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In another aspect, the present invention also provides a computer program product comprising a computer program that can be stored on a non-transitory computer readable storage medium; when the computer program is executed by a processor, the computer can execute the method for determining a scheduling policy of sampling resources provided by the above methods, the method comprising:
acquiring a first biological sampling record in a first historical time period, wherein the first biological sampling record comprises related information of a first biological sampling resource and a sampling object in the first historical time period;
inputting the first biological sampling record into a data analysis scheduling model, and determining an entity relation triplet, wherein the data analysis scheduling model is used for carrying out entity identification, entity classification and relation extraction on the first biological sampling record and the entity of a target class, and the entity is used for representing the biological sampling resource and the related information of the sampling object;
constructing a data map based on the entity relationship triplet and the first biological sampling record;
and predicting a scheduling strategy of the biological sampling resources in a preset time period in the future based on the data map.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the method for determining a scheduling policy of sampling resources provided by the above methods, the method comprising:
acquiring a first biological sampling record in a first historical time period, wherein the first biological sampling record comprises related information of a first biological sampling resource and a sampling object in the first historical time period;
inputting the first biological sampling record into a data analysis scheduling model, and determining an entity relation triplet, wherein the data analysis scheduling model is used for carrying out entity identification, entity classification and relation extraction on the first biological sampling record and the entity of a target class, and the entity is used for representing the biological sampling resource and the related information of the sampling object;
constructing a data map based on the entity relationship triplet and the first biological sampling record;
and predicting a scheduling strategy of the biological sampling resources in a preset time period in the future based on the data map.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for determining a scheduling policy for a sampling resource, comprising:
acquiring a first biological sampling record in a first historical time period, wherein the first biological sampling record comprises related information of a first biological sampling resource and a sampling object in the first historical time period;
inputting the first biological sampling record into a data analysis scheduling model, and determining an entity relation triplet, wherein the data analysis scheduling model is used for carrying out entity identification, entity classification and relation extraction on the first biological sampling record and the entity of a target class, and the entity is used for representing the biological sampling resource and the related information of the sampling object;
constructing a data map based on the entity relationship triplet and the first biological sampling record;
and predicting a scheduling strategy of the biological sampling resources in a preset time period in the future based on the data map.
2. The method for determining a scheduling policy for sampling resources according to claim 1, wherein predicting the scheduling policy for the biological sampling resources within a preset time period in the future based on the data map comprises:
determining first biological sampling records in different first historical time periods corresponding to non-target categories based on the data map;
comparing the first biological sampling records in different first historical time periods to determine first deviation between the first biological sampling records in different historical time periods;
and predicting a scheduling strategy of the biological sampling resources in a future preset time period based on the first deviation and the first biological sampling record.
3. The method according to claim 1, wherein predicting the scheduling policy of the biological sampling resource in the future preset time period based on the data map comprises:
acquiring predicted biological sampling resources within at least two second historical time periods;
determining a second biological sample resource in a second biological sample record over each of the second historical time periods based on the data map;
comparing the predicted biosampling resource with a second biosampling resource to determine a deviation dataset;
determining a bias threshold based on the bias dataset;
and predicting a scheduling strategy of biological sampling resources in a preset future time period based on the deviation threshold value and the first biological sampling record.
4. A method of determining a scheduling policy for a sampling resource according to any one of claims 1 to 3, wherein said inputting the first biosampled record into a data analysis scheduling model determines an entity relationship triplet, comprising:
inputting the first biological sampling record into an entity identification model in a data analysis scheduling model, and outputting an entity data set corresponding to the first biological sampling record;
inputting the entity data set into an entity classification model in the data analysis scheduling model, and determining a target entity data set corresponding to a target category and a target first biological sampling record corresponding to the target entity data set;
and inputting the target entity data set and the target first biological sampling record corresponding to the target entity data set into a relation extraction model in the data analysis scheduling model, and determining an entity relation triplet corresponding to the target first biological sampling record.
5. The method for determining a scheduling policy for a sampling resource according to claim 4, wherein determining the entity relationship triplet corresponding to the target first biological sampling record includes:
determining at least one pair of target entity pairs without replacement based on the target entity dataset;
determining a target relationship between each pair of target entity pairs based on the target first biological sample record;
and determining each entity relation triplet corresponding to the target first biological sampling record based on each target entity pair and the target relation corresponding to the target entity pair.
6. The method of scheduling policy determination for a sampling resource according to claim 5, wherein prior to said determining at least one target entity pair without replacement, said method further comprises:
and performing deduplication operation on the target entities in the target entity data set.
7. The scheduling policy determination method of sampling resources according to claim 2, wherein the method further comprises:
determining an influence factor corresponding to the first deviation based on the data map;
and visually displaying the influence factors.
8. A scheduling policy determining apparatus for sampling resources, comprising:
the acquisition module is used for acquiring a first biological sampling record in a first historical time period, wherein the first biological sampling record comprises related information of a first biological sampling resource and a sampling object in the first historical time period;
the determining module is used for inputting the first biological sampling record into a data analysis scheduling model, determining an entity relation triplet, wherein the data analysis scheduling model is used for carrying out entity identification, entity classification and relation extraction on the first biological sampling record and the entity of a target class, and the entity is used for representing the biological sampling resource and the related information of the sampling object;
the construction module is used for constructing a data map based on the entity relation triplet and the first biological sampling record;
and the prediction module is used for predicting a scheduling strategy of the biological sampling resources in a preset time period in the future based on the data map.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the scheduling policy determination method of sampling resources according to any one of claims 1 to 7 when the program is executed by the processor.
10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements a scheduling policy determination method of sampling resources according to any of claims 1 to 7.
CN202310124037.4A 2023-02-14 2023-02-14 Scheduling policy determination method, device and equipment for sampling resources and storage medium Pending CN116246761A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310124037.4A CN116246761A (en) 2023-02-14 2023-02-14 Scheduling policy determination method, device and equipment for sampling resources and storage medium

Publications (1)

Publication Number Publication Date
CN116246761A true CN116246761A (en) 2023-06-09

Family

ID=86632612

Country Status (1)

Country Link
CN (1) CN116246761A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination