CN113782135A - Method and system for analyzing and constructing case report table - Google Patents

Publication number
CN113782135A
Authority
CN
China
Prior art keywords
information
file
format
target
case report
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110931246.0A
Other languages
Chinese (zh)
Inventor
文天才
张兴平
王斌
吕晓颖
王鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute Of Information On Traditional Chinese Medicine Cacms
Original Assignee
Institute Of Information On Traditional Chinese Medicine Cacms
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute Of Information On Traditional Chinese Medicine Cacms
Priority to CN202110931246.0A
Publication of CN113782135A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H15/00 ICT specially adapted for medical reports, e.g. generation or transmission thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/177 Editing, e.g. inserting or deleting of tables; using ruled lines
    • G06F40/18 Editing, e.g. inserting or deleting of tables; using ruled lines of spreadsheets
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/20 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for electronic clinical trials or questionnaires
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records

Landscapes

  • Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the technical field of medical big data, and in particular to a method and a system for analyzing and constructing a case report table. The method comprises the following steps: after receiving a document file in a target format, converting the document file into a file in zip format, and decompressing the generated zip file to obtain a plurality of files in xml format, the xml files comprising a document.xml file, a header.xml file and a footer.xml file; and extracting first target information from the document.xml file, the header.xml file and the footer.xml file respectively to form an unformatted text mark file and an unformatted header and footer mark file. The invention overcomes the inconvenience of analyzing and constructing a case report table in the prior art.

Description

Method and system for analyzing and constructing case report table
Technical Field
The invention relates to the technical field of medical big data, and in particular to a method and a system for analyzing and constructing a case report table.
Background
A case report form in a clinical study is usually divided into several modules. Each module covers one topic, and each topic consists of several questions. The modules are combined in the time sequence specified by the clinical study, finally forming a set of tabular files in which subject information can be recorded.
A clinical case report form is usually first drafted by researchers, who lay out the questions, indicators and candidate answers to be collected during the study in a Word file. The builder of the clinical research database then re-creates the items in the Word file as an electronic questionnaire in a computer system. This is typically done by a combination of manual copying and manual editing, which is time-consuming and prone to introducing new errors.
Disclosure of Invention
Therefore, the invention provides a method and a system for analyzing and constructing a case report table, aiming to overcome the inconvenience of analyzing and constructing a case report table in the prior art.
According to a first aspect of the present invention, there is provided a method of analyzing and constructing a case report table, comprising the following steps: after receiving a document file in a target format, converting the document file into a file in zip format, and decompressing the generated zip file to obtain a plurality of files in xml format, the xml files comprising a document.xml file, a header.xml file and a footer.xml file; extracting first target information from the document.xml file, the header.xml file and the footer.xml file respectively to form an unformatted text mark file and an unformatted header and footer mark file; extracting second target information from the document.xml file, and determining question stems and answer items in the case report table according to the first target information and the second target information; and generating a case report sheet in a preset format according to the unformatted text mark file, the unformatted header and footer mark file, the question stems and the answer items; wherein the first target information is at least one of the following: paragraph information, table information, row information, column information, cell information and picture information, and the second target information is box information, underline information and vertical line information.
Optionally, the extracting second target information from the document.xml file, and determining question stems and answer items in the case report table according to the first target information and the second target information, includes: determining first target mark information as an answer item, wherein the first target mark information is one of the following: a box, an underline or a vertical line; and determining a sentence that begins with second target mark information and ends with third target mark information as a question stem, wherein the second target mark information is numerical mark information or character information, and the third target mark information is one of a colon, a question mark or a period.
Optionally, the method further includes: setting a variable name and variable format information for each determined question stem; and setting a display attribute for each determined answer item, wherein the display attribute is, for example, a text box, a radio button, a multi-selection box or a drop-down list.
Optionally, the generating of the case report sheet in the preset format according to the unformatted text mark file, the unformatted header and footer mark file, the question stems and the answer items includes: determining each identified question stem as a text paragraph, and setting the display mode of each identified answer item to a text box, a radio button, a multi-selection box or a drop-down list.
Optionally, the method further includes: displaying the generated case report sheet.
Optionally, the method further includes: if the displayed case report sheet does not meet a preset requirement, receiving adjustment information, and adjusting the question stems, the answer items, the positions of the question stems or the positions of the answer items in the case report sheet according to the adjustment information.
According to a second aspect of the present invention, there is provided a system for analyzing and constructing a case report table, comprising: a receiving module, configured to, after receiving a document file in a target format, convert the document file into a file in zip format and decompress the generated zip file to obtain a plurality of files in xml format, the xml files comprising a document.xml file, a header.xml file and a footer.xml file; a first extraction module, configured to extract first target information from the document.xml file, the header.xml file and the footer.xml file respectively to form an unformatted text mark file and an unformatted header and footer mark file; a second extraction module, configured to extract second target information from the document.xml file and determine question stems and answer items in the case report table according to the first target information and the second target information; and a generating module, configured to generate a case report sheet in a preset format according to the unformatted text mark file, the unformatted header and footer mark file, the question stems and the answer items; wherein the first target information is at least one of the following: paragraph information, table information, row information, column information, cell information and picture information, and the second target information is box information, underline information and vertical line information.
According to a third aspect of the present invention, there is provided a computer device comprising: a memory and a processor, the memory and the processor being communicatively coupled to each other, the memory having stored therein computer instructions, the processor executing the computer instructions to perform the method of parsing a case report table.
According to a fourth aspect of the present invention, there is provided a computer-readable storage medium having stored thereon computer instructions for causing the computer to execute the method of parsing a constructed case report table.
The technical scheme of the invention has the following advantages:
1. The invention provides a method for analyzing and constructing a case report table, which comprises the following steps: after receiving a document file in a target format, converting the document file into a file in zip format, and decompressing the generated zip file to obtain a plurality of files in xml format, the xml files comprising a document.xml file, a header.xml file and a footer.xml file; extracting first target information from the document.xml file, the header.xml file and the footer.xml file respectively to form an unformatted text mark file and an unformatted header and footer mark file; extracting second target information from the document.xml file, and determining question stems and answer items in the case report table according to the first target information and the second target information; and generating a case report sheet in a preset format according to the unformatted text mark file, the unformatted header and footer mark file, the question stems and the answer items; wherein the first target information is at least one of the following: paragraph information, table information, row information, column information, cell information and picture information, and the second target information is box information, underline information and vertical line information.
With this arrangement, when an operator needs to analyze and construct a case report table, the document file in the target format is received and converted into a zip file, and the zip file is decompressed to obtain a plurality of xml files. First target information is extracted from the xml files to form the unformatted text mark file and the unformatted header and footer mark file, second target information is extracted from the document.xml file, and the question stems and answer items in the case report table are determined according to the first and second target information. A case report sheet in the preset format is then generated according to the unformatted text mark file, the unformatted header and footer mark file, the question stems and the answer items, so that the operator can parse and construct the case report table conveniently and quickly.
2. In the method for analyzing and constructing a case report table provided by the invention, extracting the second target information from the document.xml file and determining the question stems and answer items in the case report table according to the first target information and the second target information comprises: determining first target mark information as an answer item, wherein the first target mark information is one of the following: a box, an underline or a vertical line; and determining a sentence that begins with second target mark information and ends with third target mark information as a question stem, wherein the second target mark information is numerical mark information or character information, and the third target mark information is one of a colon, a question mark or a period. With this arrangement, the answer items and question stems are determined from the first, second and third target mark information, which is convenient, fast, accurate and not error-prone, so that the case report table can be analyzed and constructed well.
3. The invention provides a method for analyzing and constructing a case report table, which further comprises the following steps: setting a variable name and variable format information for each determined question stem; and setting display attributes for each determined answer, wherein the display attributes are information such as a text box, a radio button, a multi-selection box, a pull-down list and the like. Through the setting, variable names and variable format information are set for each question stem, and display attributes are set for each answer, so that a case report table constructed through analysis is more visual, higher in readability and convenient for people to understand.
4. In the method for analyzing and constructing a case report table provided by the invention, generating the case report sheet in the preset format according to the unformatted text mark file, the unformatted header and footer mark file, the question stems and the answer items comprises: determining each identified question stem as a text paragraph, and setting the display mode of each identified answer item to a text box, a radio button, a multi-selection box or a drop-down list. With this arrangement, the question stem is determined as a text paragraph and the display mode of the answer item is set to a text box, a radio button, a multi-selection box or a drop-down list, so that the case report table can be analyzed and constructed more conveniently.
5. The invention provides a method for analyzing and constructing a case report table, which further comprises the following steps: the generated case report is displayed. Through the arrangement, the case report sheet can be displayed, so that the case report sheet is convenient for operators to read.
6. The method for analyzing and constructing a case report table provided by the invention further comprises: if the displayed case report sheet does not meet a preset requirement, receiving adjustment information, and adjusting the question stems, the answer items, the positions of the question stems or the positions of the answer items in the case report sheet according to the adjustment information. With this arrangement, when the displayed case report sheet does not meet the preset requirement, the operator can adjust it until the requirement is met, which is more reasonable and convenient.
7. The invention provides a system for analyzing and constructing a case report table, which comprises: a receiving module, configured to, after receiving a document file in a target format, convert the document file into a file in zip format and decompress the generated zip file to obtain a plurality of files in xml format, the xml files comprising a document.xml file, a header.xml file and a footer.xml file; a first extraction module, configured to extract first target information from the document.xml file, the header.xml file and the footer.xml file respectively to form an unformatted text mark file and an unformatted header and footer mark file; a second extraction module, configured to extract second target information from the document.xml file and determine question stems and answer items in the case report table according to the first target information and the second target information; and a generating module, configured to generate a case report sheet in a preset format according to the unformatted text mark file, the unformatted header and footer mark file, the question stems and the answer items; wherein the first target information is at least one of the following: paragraph information, table information, row information, column information, cell information and picture information, and the second target information is box information, underline information and vertical line information.
With this arrangement, when an operator needs to analyze and construct a case report table, the document file in the target format is received and converted into a zip file, and the zip file is decompressed to obtain a plurality of xml files. First target information is extracted from the xml files to form the unformatted text mark file and the unformatted header and footer mark file, second target information is extracted from the document.xml file, and the question stems and answer items in the case report table are determined according to the first and second target information. A case report sheet in the preset format is then generated according to the unformatted text mark file, the unformatted header and footer mark file, the question stems and the answer items, so that the operator can parse and construct the case report table conveniently and quickly.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart of a method of parsing a constructed case report table according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating step S150 of a method for parsing a case report table according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating steps S161 and S162 of a method for parsing a case report table according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating steps S180 and S190 of a method for parsing a case report table according to an embodiment of the present application;
FIG. 5 is a flowchart of a method for parsing a case report table according to an embodiment of the present application;
FIG. 6a is a flowchart of a method for parsing a case report table according to an embodiment of the present application;
FIG. 6b is a flowchart of a method for parsing a case report table according to an embodiment of the present application;
FIG. 7 is a flowchart of a method for parsing a case report table according to an embodiment of the present application;
FIG. 8 is a block diagram of a system for parsing a build case report table according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a computer apparatus according to embodiment 3 of the present application;
Description of reference numerals: 510. a receiving module; 520. a first extraction module; 530. a second extraction module; 540. a generation module; 610. a processor; 620. a memory; 630. a bus.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc., indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
In addition, the technical features involved in the different embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Example 1
Referring to fig. 1-2, the invention provides a method for analyzing and constructing a case report table, which is mainly used for analyzing the structure of an original Word file of the case report table and directly constructing the original Word file into an electronic questionnaire, and comprises the following steps:
S110, after receiving a document file in a target format, converting the document file into a file in zip format, and decompressing the generated zip file to obtain a plurality of files in xml format; the xml files comprise a document.xml file, a header.xml file and a footer.xml file.
S130, extracting first target information from the document.xml file, the header.xml file and the footer.xml file respectively to form an unformatted text mark file and an unformatted header and footer mark file.
The first target information is at least one of the following: paragraph information, table information, row information, column information, cell information and picture information.
S150, extracting second target information from the document.xml file, and determining question stems and answer items in the case report table according to the first target information and the second target information.
The second target information is box information, underline information and vertical line information.
S170, generating a case report sheet in a preset format according to the unformatted text mark file, the unformatted header and footer mark file, the question stems and the answer items.
With this arrangement, when an operator needs to analyze and construct a case report table, the document file in the target format is received and converted into a zip file, and the zip file is decompressed to obtain a plurality of xml files. First target information is extracted from the xml files to form the unformatted text mark file and the unformatted header and footer mark file, second target information is extracted from the document.xml file, and the question stems and answer items in the case report table are determined according to the first and second target information. A case report sheet in the preset format is then generated according to the unformatted text mark file, the unformatted header and footer mark file, the question stems and the answer items, so that the operator can parse and construct the case report table conveniently and quickly.
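The unpacking in step S110 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of the invention: the target format is assumed to be a Word .docx file, which is itself a zip archive, and the function names and the stand-in archive are hypothetical. In typical Word output the parts are stored under a word/ folder and headers and footers are often numbered (for example header1.xml), so the sketch matches on the file name prefix rather than on exact names.

```python
import io
import zipfile


def extract_xml_parts(docx_bytes: bytes) -> dict[str, str]:
    """Open a .docx file as a zip archive and return its body, header and footer XML parts."""
    wanted_prefixes = ("document", "header", "footer")
    parts: dict[str, str] = {}
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for name in archive.namelist():
            base = name.rsplit("/", 1)[-1]  # file name without its folder
            if base.endswith(".xml") and base.startswith(wanted_prefixes):
                parts[name] = archive.read(name).decode("utf-8")
    return parts


def make_sample_docx() -> bytes:
    """Build a tiny stand-in archive (hypothetical content) so the sketch runs without a real Word file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", "<doc>body</doc>")
        archive.writestr("word/header1.xml", "<hdr>header</hdr>")
        archive.writestr("word/footer1.xml", "<ftr>footer</ftr>")
        archive.writestr("word/styles.xml", "<styles/>")
    return buffer.getvalue()


parts = extract_xml_parts(make_sample_docx())
print(sorted(parts))
```

Reading the archive directly avoids renaming the file to .zip and writing the extracted parts to disk; whether the described system does either is not stated in the source.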
Further, step S150 includes the steps of:
S1501, determining first target mark information as an answer item, where the first target mark information is one of the following: a box, an underline or a vertical line.
S1502, determining a sentence that begins with second target mark information and ends with third target mark information as a question stem.
The second target mark information is numerical mark information or character information, and the third target mark information is one of a colon, a question mark or a period.
A variable name and variable format information are set for each determined question stem.
In addition, for several consecutive paragraphs, all question stems closest in front of an answer item are merged and marked in the system; for a table, the analysis module takes the header row or header column of the cell in which each answer item is located as the question stem, and a data manager can choose between the row header and the column header; for a picture, the analysis module analyzes the number and the format of the picture.
With this arrangement, the answer items and question stems are determined from the first, second and third target mark information, which is convenient, fast, accurate and not error-prone, so that the case report table can be analyzed and constructed well. In addition, a variable name and variable format information are set for each question stem and a display attribute is set for each answer item, so that the constructed case report table is more intuitive, more readable and easier to understand.
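The recognition rule of steps S1501 and S1502 can be illustrated with a minimal Python sketch. The regular expression, the function name and the sample lines are assumptions made for illustration; the source does not say how the marks are detected in practice. Here a run of boxes, a run of at least two underscores, or a vertical bar counts as an answer mark, and the text in front of the first mark counts as a question stem if it starts with a digit or a letter and ends with a colon, question mark or period (ASCII or full-width).

```python
import re
from typing import Optional

# Box, run of underscores, or vertical bar: the three answer marks named in S1501.
ANSWER_MARK = re.compile(r"□+|_{2,}|\|")
# Colon, question mark or period closes a question stem in S1502.
STEM_END = (":", "：", "?", "？", ".", "。")


def split_stem_and_answers(line: str) -> tuple[Optional[str], list[str]]:
    """Return (question stem or None, list of answer marks) for one line of text."""
    marks = ANSWER_MARK.findall(line)
    first = ANSWER_MARK.search(line)
    # Candidate stem: everything before the first answer mark, or the whole line.
    candidate = (line[: first.start()] if first else line).strip()
    starts_ok = bool(candidate) and (candidate[0].isdigit() or candidate[0].isalpha())
    is_stem = starts_ok and candidate.endswith(STEM_END)
    return (candidate if is_stem else None), marks


for sample in ("Center number: □□", "1. Gender: □ male □ female", "General data"):
    print(split_stem_and_answers(sample))
```

A line such as "General data" yields no stem and no marks, so it would be kept as plain text.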
Furthermore, in one embodiment, referring to fig. 3, the following steps are also performed after step S150:
S161, setting a variable name and variable format information for each determined question stem.
S162, setting a display attribute for each determined answer item, where the display attribute is, for example, a text box, a radio button, a multi-selection box or a drop-down list.
Through the arrangement, the question stem is determined as a text paragraph, and the display mode of the answer item is set as a text box, a single-choice button, a multi-choice box or a drop-down list, so that the case report table can be analyzed and constructed more conveniently.
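One way to hold the result of steps S161 and S162 is a small record per variable. The sketch below is an assumption about a convenient data shape, not a structure described in the source: the class and field names are hypothetical, and the "male"/"female" options of the sample variable are invented for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class DisplayType(Enum):
    """The four display attributes listed in step S162."""
    TEXT_BOX = "text box"
    RADIO_BUTTON = "radio button"
    CHECK_BOX = "multi-selection box"
    DROP_DOWN = "drop-down list"


@dataclass
class AnswerItem:
    display: DisplayType
    options: list[str] = field(default_factory=list)  # empty for free-text answers


@dataclass
class Variable:
    stem: str           # question stem text recognised in step S150
    name: str           # variable name set in step S161
    value_format: str   # variable format information set in step S161
    answer: AnswerItem  # display attribute set in step S162


# Hypothetical example; the option values are assumptions.
gender = Variable(
    stem="Gender:",
    name="gender",
    value_format="single choice",
    answer=AnswerItem(DisplayType.RADIO_BUTTON, ["male", "female"]),
)
print(gender)
```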
Furthermore, in one embodiment, referring to fig. 4, the following steps are also performed after step S170:
S180, displaying the generated case report sheet.
S190, if the displayed case report sheet does not meet a preset requirement, receiving adjustment information, and adjusting the question stems, the answer items, the positions of the question stems or the positions of the answer items in the case report sheet according to the adjustment information.
Through the arrangement, the case report sheet can be displayed, so that the case report sheet is convenient for operators to read. In addition, when the displayed case report sheet does not meet the preset requirements, the operator can adjust the case report sheet until the preset requirements are met, and the method is more reasonable and convenient.
It should be noted that the data manager can adjust the results of the system's analysis, including the question stems and answer items, and whether the question stems of a table lie in its first column or its first row.
The data manager sets a variable name and variable format information for each question stem.
The data manager sets a display attribute for each answer item, the display attribute being, for example, a text box, a radio button, a multi-selection box or a drop-down list.
The data manager submits the final confirmed result, and the electronic case report table is reconstructed.
One example is provided below:
(1) Header file (refer to FIG. 5)
As this example shows, the analysis module extracts the following tags from the body description information of the XML file:
Table tag: start tag <w:tbl>, end tag </w:tbl>
Table row tag: start tag <w:tr>, end tag </w:tr>
Table cell (column) tag: start tag <w:tc>, end tag </w:tc>
Paragraph tag: start tag <w:p>, end tag </w:p>
Text content tag: start tag <w:t>, end tag </w:t>
Picture tag: start tag <w:drawing>, end tag </w:drawing>
By searching for these key tags in a loop, the analysis module can parse out the tables of the original case report table and the content within them.
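The tag search described above can be sketched with Python's standard xml.etree.ElementTree module. The sample XML below is a hypothetical fragment written for this illustration, not the content of FIG. 5, and the function names are assumptions; the namespace URI is the standard WordprocessingML main namespace that the w: prefix refers to.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Hypothetical fragment: one body paragraph and a one-row, two-cell table.
SAMPLE = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>General data</w:t></w:r></w:p>"
    "<w:tbl><w:tr>"
    "<w:tc><w:p><w:r><w:t>Center number</w:t></w:r></w:p></w:tc>"
    "<w:tc><w:p><w:r><w:t>□□</w:t></w:r></w:p></w:tc>"
    "</w:tr></w:tbl>"
    "</w:body></w:document>"
)


def paragraph_text(paragraph: ET.Element) -> str:
    """Join every <w:t> text run inside one <w:p> paragraph."""
    return "".join(t.text or "" for t in paragraph.iterfind(".//w:t", NS))


def read_tables(xml_text: str) -> list[list[list[str]]]:
    """Return each <w:tbl> as a list of rows, each row a list of cell texts."""
    root = ET.fromstring(xml_text)
    tables = []
    for tbl in root.iterfind(".//w:tbl", NS):
        rows = []
        for tr in tbl.iterfind("w:tr", NS):
            cells = []
            for tc in tr.iterfind("w:tc", NS):
                cells.append("\n".join(paragraph_text(p) for p in tc.iterfind("w:p", NS)))
            rows.append(cells)
        tables.append(rows)
    return tables


def read_body_paragraphs(xml_text: str) -> list[str]:
    """Return the text of paragraphs that sit directly in the body, outside any table."""
    root = ET.fromstring(xml_text)
    return [paragraph_text(p) for p in root.iterfind("w:body/w:p", NS)]


print(read_body_paragraphs(SAMPLE))
print(read_tables(SAMPLE))
```

Pictures could be located in the same way by searching for w:drawing elements; that part is left out to keep the sketch short.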
From the above content, the analysis module can find that:
The texts "topic number" and "21001001" are in table cells and are both label text, so the original text is returned.
The texts "center number" and "□ □" are also in table cells, but the latter is a fill-in placeholder commonly used in clinical study case report tables, indicating that two characters, typically two digits, are to be filled in. The analysis module therefore records the relationship between "center number" and "□ □" and expresses the pair as one variable: the former is the label of the variable and the latter is its format, i.e. two digits must be entered.
The following variable analysis results were formed by the above analysis:
[Image: variable analysis results]
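The pairing of a label cell with a box placeholder, as in the "center number" and "□ □" example, can be sketched as follows. The function name and the returned dictionary keys are hypothetical, and treating every box as one digit follows the "typically two digits" reading given in the text rather than a rule stated for all cases.

```python
from typing import Optional


def variable_from_cells(label: str, placeholder: str) -> Optional[dict]:
    """Turn a (label cell, placeholder cell) pair into a variable description.

    Returns None when the second cell is ordinary text rather than a run of
    boxes, in which case the original text is simply kept.
    """
    boxes = placeholder.count("□")
    # Anything other than boxes and spaces means the cell is plain label text.
    if boxes == 0 or placeholder.strip("□ ") != "":
        return None
    return {"label": label, "format": f"{boxes} digit(s)", "length": boxes}


print(variable_from_cells("center number", "□ □"))
print(variable_from_cells("topic number", "21001001"))
```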
(2) content file
The content file is similar to the header file, so only part of the content of document.xml in the content file is given (refer to FIGS. 6a and 6b).
Similarly, the analysis module parses the paragraph, table, row, column, text and picture tags from the text description file and, following the original text, forms the variable definitions and variable analysis results below:
[Images: variable analysis results]
Further, the analysis module recommends variable names for the variables by using the existing variable library:
Figure BDA0003210911680000151
Figure BDA0003210911680000161
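The source does not say how the variable library is queried. One plausible way to recommend a name from a library of known labels is an exact lookup followed by fuzzy matching, sketched below with Python's `difflib`. The library contents, the variable names, and the function `recommend_name` are all hypothetical and are not taken from the figures above.

```python
import difflib

# Hypothetical variable library: known labels mapped to variable names.
VARIABLE_LIBRARY = {
    "center number": "CENTER_NO",
    "gender": "SEX",
    "date of birth": "BIRTH_DATE",
}


def recommend_name(label, library=VARIABLE_LIBRARY, cutoff=0.6):
    """Suggest a variable name for a label, or None if nothing is close."""
    key = label.strip().lower()
    if key in library:                      # exact match first
        return library[key]
    # Otherwise fall back to the closest known label above the cutoff.
    close = difflib.get_close_matches(key, list(library), n=1, cutoff=cutoff)
    return library[close[0]] if close else None


print(recommend_name("Gender"))
print(recommend_name("centre number"))
```

A suggestion produced this way is only a recommendation; as the text describes, the data manager still confirms or corrects it.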
After the data manager confirms the result, the construction module converts it back into a case report table layout description XML file. This file describes in detail the format information of the on-screen control associated with each variable, including its name, control type, width, height, left offset in pixels, top offset in pixels, visibility, options, and variable name. The "general data" paragraph text and the "gender" variable on the current case report table are given as examples below (refer to FIG. 7):
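To make the listed attributes concrete, the sketch below builds a small layout description with Python's `xml.etree.ElementTree`. The element and attribute names (`layout`, `control`, `option`, and so on) are assumptions chosen for illustration; the actual schema is the one shown in FIG. 7, which is not reproduced here.

```python
import xml.etree.ElementTree as ET


def control_element(name, control_type, width, height, left, top,
                    visible=True, options=(), variable=""):
    """Build one <control> element; tag and attribute names are illustrative."""
    element = ET.Element("control", {
        "name": name,
        "type": control_type,
        "width": str(width),
        "height": str(height),
        "left": str(left),        # left offset in pixels
        "top": str(top),          # top offset in pixels
        "visible": "true" if visible else "false",
        "variable": variable,
    })
    for option in options:
        ET.SubElement(element, "option").text = option
    return element


layout = ET.Element("layout")
layout.append(control_element("lbl_general", "label", 200, 24, 20, 10))
layout.append(control_element("rad_gender", "radio", 200, 24, 20, 40,
                              options=("Male", "Female"), variable="SEX"))
xml_text = ET.tostring(layout, encoding="unicode")
print(xml_text)
```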
Finally, the construction module converts the case report table layout description XML file into an HTML file and displays it as a web page. The whole construction process of the case report table is thereby automated.
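The final XML-to-HTML step can be sketched as a small renderer. The layout schema used here (`layout`, `control`, `option`, with `type`, `left`, `top`, `text` and `variable` attributes) is hypothetical, as are the function names; the sketch only shows the general idea of mapping each control to an absolutely positioned HTML element.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical layout description; the real schema is not given in the text.
LAYOUT = """<layout>
  <control type="label" left="20" top="10" text="General data"/>
  <control type="radio" left="20" top="40" variable="SEX">
    <option>Male</option>
    <option>Female</option>
  </control>
</layout>"""


def render_control(ctrl):
    """Render one control as an absolutely positioned <div>."""
    left = int(ctrl.get("left", "0"))
    top = int(ctrl.get("top", "0"))
    var = html.escape(ctrl.get("variable", ""), quote=True)
    kind = ctrl.get("type")
    if kind == "label":
        inner = html.escape(ctrl.get("text", ""))
    elif kind == "radio":
        pieces = []
        for opt in ctrl.findall("option"):
            value = html.escape(opt.text or "", quote=True)
            pieces.append(
                f'<label><input type="radio" name="{var}" value="{value}">'
                f'{value}</label>')
        inner = "".join(pieces)
    else:
        # Any other control type falls back to a plain text box.
        inner = f'<input type="text" name="{var}">'
    return (f'<div style="position:absolute;left:{left}px;top:{top}px">'
            f'{inner}</div>')


def layout_to_html(xml_text):
    """Convert a layout description into a complete HTML page."""
    root = ET.fromstring(xml_text)
    body = "\n".join(render_control(c) for c in root.findall("control"))
    return f"<!DOCTYPE html>\n<html><body>\n{body}\n</body></html>"


page = layout_to_html(LAYOUT)
print(page)
```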
Example 2
Referring to FIG. 3, the present invention provides a system for analyzing and constructing a case report table, which is mainly used to parse the structure of the original Word file of a case report table and directly construct an electronic questionnaire. The system comprises:
the receiving module 810 is configured to, after receiving a document file in a target format, convert the document file in the target format into a file in a zip format, and decompress the generated file in the zip format to obtain a plurality of files in an xml format; the files in the xml format comprise document.
a first extracting module 820, configured to extract first target information from the document.xml file, the header.xml file, and the footer.xml file, respectively, to form an unformatted text mark file and an unformatted header and footer mark file;
a second extracting module 830, configured to extract second target information from the document.xml file, and determine a question stem and an answer item in a case report table according to the first target information and the second target information;
the generating module 840 is used for generating a case report sheet in a preset format according to the unformatted text mark file, the unformatted header and footer mark file, the question stem and the answer item;
wherein the first target information is at least one of the following: paragraph information, table information, line information, column information, cell information and picture information, wherein the second target information is square frame information, underline information and vertical line information.
With this arrangement, when an operator needs to parse and construct a case report table, the system receives a document file in the target format, converts it into a zip-format file, and decompresses the zip-format file to obtain a plurality of xml-format files. First target information is extracted from the xml-format files to form an unformatted text mark file and an unformatted header and footer mark file; second target information is extracted from the document.xml file; and the question stems and answer items in the case report table are determined from the first and second target information. A case report sheet in a preset format is then generated from the unformatted text mark file, the unformatted header and footer mark file, the question stems and the answer items, so that the operator can parse and construct the case report table conveniently and quickly.
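The receiving step can be sketched with Python's standard `zipfile` module. A .docx file is already a zip container, so in this sketch no separate rename or conversion is needed before opening it. Word normally stores the parts under `word/` with numbered header and footer files (for example `word/header1.xml`), so the sketch matches on name prefixes. The function names and the demo archive are assumptions for illustration.

```python
import io
import zipfile


def read_docx_parts(docx_bytes):
    """Open a .docx as a zip archive and return its body, header and footer XML."""
    parts = {}
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for name in archive.namelist():
            base = name.rsplit("/", 1)[-1]
            is_body = base == "document.xml"
            is_header_footer = (base.startswith(("header", "footer"))
                                and base.endswith(".xml"))
            if is_body or is_header_footer:
                parts[name] = archive.read(name).decode("utf-8")
    return parts


def make_demo_docx():
    """Build a tiny in-memory archive that mimics a .docx layout (demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", "<doc/>")
        archive.writestr("word/header1.xml", "<hdr/>")
        archive.writestr("word/footer1.xml", "<ftr/>")
        archive.writestr("word/styles.xml", "<styles/>")
    return buffer.getvalue()


parts = read_docx_parts(make_demo_docx())
print(sorted(parts))
```

For a real file, the bytes would come from reading the uploaded .docx in binary mode; the extracted XML strings are then what the two extracting modules operate on.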
Example 3
Referring to fig. 4, an embodiment of the present invention further provides a computer device, as shown in the figure, the device includes a processor 910 and a memory 920, where the processor 910 and the memory 920 may be connected by a bus 930 or in other manners, and the connection by the bus 930 is taken as an example in the figure.
The processor 910 may be a central processing unit (CPU). The processor 910 may also be another general-purpose processor, a digital signal processor (DSP), a graphics processing unit (GPU), an embedded neural-network processing unit (NPU) or other dedicated deep learning coprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like, or a combination thereof.
The memory 920, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules. The processor 910 executes its various functional applications and data processing by running the non-transitory software programs, instructions, and modules stored in the memory 920, thereby implementing the method for analyzing and constructing a case report table in the above method embodiments.
The memory 920 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created by the processor 910 and the like. Additionally, the memory 920 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 920 may optionally include memory located remotely from the processor 910, and such remote memory may be connected to the processor 910 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The details of the computer device can be understood by referring to the corresponding descriptions and effects in the embodiments shown in the figures, and are not described herein again.
An embodiment of the present invention further provides a non-transitory computer storage medium, where the computer storage medium stores computer-executable instructions, and the computer-executable instructions may execute the method for analyzing and constructing a case report table in any of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk drive (HDD), a solid-state drive (SSD), or the like; the storage medium may also include a combination of memories of the kinds described above.
It should be understood that the above examples are given only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications in different forms will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here. Obvious variations or modifications derived therefrom remain within the scope of the invention.

Claims (9)

1. A method for analyzing and constructing a case report table is characterized by comprising the following steps:
after receiving a document file in a target format, converting the document file in the target format into a file in a zip format, and decompressing the generated file in the zip format to obtain a plurality of files in an xml format; the files in the xml format comprise document.
Extracting first target information from the document.xml file, the header.xml file and the footer.xml file respectively to form a non-format text mark file and a non-format header and footer mark file;
extracting second target information from the document.xml file, and determining question stems and answer items in a case report table according to the first target information and the second target information;
generating a case report sheet in a preset format according to the unformatted text mark file, the unformatted header and footer mark file, the question stem and the answer item;
wherein the first target information is at least one of the following: paragraph information, table information, line information, column information, cell information and picture information, wherein the second target information is square frame information, underline information and vertical line information.
2. The method according to claim 1, wherein the extracting second target information from the document.xml file, and determining the question stem and answer items in the case report table according to the first target information and the second target information comprises:
determining first target mark information as an answer item, wherein the first target mark information is one of the following: a box, an underline, or a vertical line;
determining a sentence which starts with the second target mark information and ends with the third target mark information as a question stem; the second target mark information is numerical value mark information or character information, and the third target mark information is one of a colon mark, a question mark or a period mark.
3. The method of claim 2, further comprising:
setting a variable name and variable format information for each determined question stem;
and setting display attributes for each determined answer, wherein the display attributes are information such as a text box, a radio button, a multi-selection box, a pull-down list and the like.
4. The method according to claim 2 or 3,
the generating of the case report sheet with the preset format according to the unformatted text mark file, the unformatted header and footer mark file, the question stem and the answer item comprises the following steps:
and determining the identified stem as a text paragraph, and setting the display mode of the identified answer item as a text box, a single-choice button, a multi-choice box or a drop-down list.
5. The method of claim 1, further comprising:
displaying the generated case report sheet.
6. The method of claim 5, further comprising:
and if the displayed case report sheet does not meet the preset requirement, receiving adjustment information, and adjusting the question stems, the answer items, the positions of the question stems or the positions of the answer items in the case report sheet according to the adjustment information.
7. A system for analyzing and constructing a case report table, comprising:
the receiving module (810) is used for converting the document file in the target format into a file in a zip format after receiving the document file in the target format, and decompressing the generated file in the zip format to obtain a plurality of files in an xml format; the files in the xml format comprise document.
A first extraction module (820) for extracting first target information from the document.xml file, the header.xml file and the footer.xml file, respectively, to form a plain text markup file and a plain header and footer markup file;
a second extraction module (830) for extracting second target information from the document.xml file, and determining a question stem and an answer item in a case report table according to the first target information and the second target information;
a generating module (840) for generating a case report sheet with a preset format according to the unformatted text mark file, the unformatted header and footer mark file, the question stem and the answer item;
wherein the first target information is at least one of the following: paragraph information, table information, line information, column information, cell information and picture information, wherein the second target information is square frame information, underline information and vertical line information.
8. A computer device, comprising: a memory (920) and a processor (910), the memory (920) and the processor (910) being communicatively connected to each other, the memory (920) having stored therein computer instructions, the processor (910) executing the computer instructions to perform the method for analyzing and constructing a case report table according to any one of claims 1-6.
9. A computer-readable storage medium storing computer instructions for causing a computer to perform the method for analyzing and constructing a case report table according to any one of claims 1-6.
CN202110931246.0A 2021-08-13 2021-08-13 Method and system for analyzing and constructing case report table Pending CN113782135A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110931246.0A CN113782135A (en) 2021-08-13 2021-08-13 Method and system for analyzing and constructing case report table

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110931246.0A CN113782135A (en) 2021-08-13 2021-08-13 Method and system for analyzing and constructing case report table

Publications (1)

Publication Number Publication Date
CN113782135A true CN113782135A (en) 2021-12-10

Family

ID=78837624

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110931246.0A Pending CN113782135A (en) 2021-08-13 2021-08-13 Method and system for analyzing and constructing case report table

Country Status (1)

Country Link
CN (1) CN113782135A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104464422A (en) * 2013-09-12 2015-03-25 郑州学生宝电子科技有限公司 Interactive teaching method based on information engineering and system thereof
CN104679726A (en) * 2013-12-03 2015-06-03 北大方正集团有限公司 Type setting method and device of word files
WO2016015564A1 (en) * 2014-07-31 2016-02-04 广州金山网络科技有限公司 Method and apparatus for displaying document
CN106354740A (en) * 2016-05-04 2017-01-25 上海秦镜网络科技有限公司 Electronic examination paper inputting method
CN111027286A (en) * 2019-12-20 2020-04-17 暨南大学 Electronic questionnaire generation method
CN111062187A (en) * 2019-11-27 2020-04-24 北京计算机技术及应用研究所 Structured parsing method and system for docx format document
CN111104557A (en) * 2019-11-22 2020-05-05 黄琴 Heterogeneous document processing system and method based on standard document markup language specification
CN112507666A (en) * 2020-12-21 2021-03-16 北京百度网讯科技有限公司 Document conversion method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination