CN112434508A - Research report automatic generation method based on deep learning - Google Patents
Research report automatic generation method based on deep learning
- Publication number
- CN112434508A CN112434508A CN202011441359.4A CN202011441359A CN112434508A CN 112434508 A CN112434508 A CN 112434508A CN 202011441359 A CN202011441359 A CN 202011441359A CN 112434508 A CN112434508 A CN 112434508A
- Authority
- CN
- China
- Prior art keywords
- data
- filled
- research
- data item
- name
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/186—Templates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Multimedia (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Character Input (AREA)
Abstract
The invention provides a research report automatic generation method based on deep learning, which comprises the following steps: S1, acquiring a template of a research report to be generated and the corresponding research data; S2, acquiring an item to be filled in the template; S3, acquiring a corresponding deep learning neural network model and acquiring corresponding calculation data from the research data based on the item to be filled; S4, inputting the calculation data into the deep learning neural network model to obtain a calculation result; S5, filling the calculation result into the item to be filled; and S6, repeating steps S2-S5 until all the items to be filled are filled, thereby obtaining a research report. Compared with the prior art, the research report of the application is generated faster and with correspondingly higher accuracy: whereas manual calculation is prone to input errors, such errors are largely avoided by obtaining the data directly from the database.
Description
Technical Field
The invention relates to the field of report generation, in particular to a research report automatic generation method based on deep learning.
Background
At present, research reports are used in various industries, and within the same industry their frameworks are largely similar; the differences lie mainly in the data and in the results calculated from the data. Demand for various types of research reports keeps increasing, and the required research reports cannot be obtained quickly when the research data must be analyzed and written up manually.
Disclosure of Invention
In view of the above problems, an object of the present invention is to provide an automatic generation method of a study report based on deep learning.
The invention provides a research report automatic generation method based on deep learning, which comprises the following steps:
s1, acquiring a template of a research report to be generated and corresponding research data;
s2, acquiring the items to be filled in the template;
s3, acquiring a corresponding deep learning neural network model and acquiring corresponding calculation data from the research data based on the item to be filled;
s4, inputting the calculation data into the deep learning neural network model to obtain a calculation result;
s5, filling the calculation result into the item to be filled;
and S6, repeating the steps S2-S5 until all the items to be filled are filled, thereby obtaining a research report.
Preferably, the research data includes the name of a data item and the specific numerical value of the data item, and the research data is stored in the database via scan input, which specifically includes:
scanning a paper file recording the research data to obtain a scanned image;
performing character recognition on the scanned image to obtain the name of the data item recorded on the paper file and the specific numerical value of the data item;
and transmitting the name of the data item and the specific numerical value of the data item to the database for storage.
Preferably, the obtaining of the corresponding calculation data from the research data comprises:
the items to be filled comprise names and filling areas of the items to be filled;
and matching the name of the item to be filled with the name of the data item, and taking the specific numerical value of the data item corresponding to the name of the successfully matched data item as calculation data.
Preferably, the filling the calculation result into the item to be filled includes: and filling the calculation result into the filling area.
Preferably, the performing character recognition on the scanned image to obtain the name of the data item recorded on the paper document and the specific numerical value of the data item includes:
carrying out graying processing on the scanned image to obtain a grayscale image;
carrying out noise reduction processing on the gray level image to obtain a noise reduction image;
carrying out segmentation processing on the noise reduction image to obtain a foreground image only containing a character part;
and performing character recognition on the foreground image by adopting OCR (optical character recognition) technology, so as to obtain the name of the data item recorded on the paper file and the specific numerical value of the data item.
Compared with the prior art, the invention has the advantages that:
compared with the manual writing of research reports, the research report of the application is generated faster and with correspondingly higher accuracy: manual calculation is prone to input errors, which are largely avoided by obtaining the data directly from the database. Moreover, manual calculation is costly and slow, and cannot deliver the required research report quickly.
Drawings
The invention is further illustrated by means of the attached drawings, but the embodiments in the drawings do not constitute any limitation to the invention, and for a person skilled in the art, other drawings can be obtained on the basis of the following drawings without inventive effort.
Fig. 1 is a diagram illustrating an exemplary embodiment of a method for automatically generating a research report based on deep learning according to the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
Referring to fig. 1, the present invention provides an automatic generation method of a research report based on deep learning, which includes:
s1, acquiring a template of a research report to be generated and corresponding research data;
s2, acquiring the items to be filled in the template;
s3, acquiring a corresponding deep learning neural network model and acquiring corresponding calculation data from the research data based on the item to be filled;
s4, inputting the calculation data into the deep learning neural network model to obtain a calculation result;
s5, filling the calculation result into the item to be filled;
and S6, repeating the steps S2-S5 until all the items to be filled are filled, thereby obtaining a research report.
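The loop of steps S1-S6 above can be sketched as follows. This is a minimal illustration, not the patented implementation: the dictionary-based template, the function names, and the plain Python callables standing in for trained neural network models are all assumptions.

```python
# Sketch of the S1-S6 loop. A template is assumed to be a list of item names,
# research data a dict of item name -> calculation data, and each "model" a
# callable (here plain functions stand in for trained neural networks).
def generate_report(template_items, research_data, models):
    """Fill every item in the template with its model's calculation result."""
    report = {}
    for item_name in template_items:          # S2: next item to be filled
        model = models[item_name]             # S3: select the matching model
        calc_data = research_data[item_name]  # S3: matching calculation data
        result = model(calc_data)             # S4: run the model
        report[item_name] = result            # S5: fill the item
    return report                             # S6: all items filled

# Toy usage with a hypothetical "mean_income" item.
models = {"mean_income": lambda xs: sum(xs) / len(xs)}
data = {"mean_income": [100, 200, 300]}
print(generate_report(["mean_income"], data, models))
```

The loop terminates exactly when every template item has been filled, mirroring the repeat-until condition of step S6.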
The template of the research report adopts a template generic to the industry; that is, the structure of the template is basically consistent, and only the numerical values to be filled into the different items of the template differ.
Research data are collected by researchers, and the researchers can carry out preliminary screening on the research data and remove error data which obviously exceed a normal range so as to enhance the accuracy of generation of subsequent research reports.
Preferably, the research data includes the name of a data item and the specific numerical value of the data item, and the research data is stored in the database via scan input, which specifically includes:
scanning a paper file recording the research data to obtain a scanned image;
performing character recognition on the scanned image to obtain the name of the data item recorded on the paper file and the specific numerical value of the data item;
and transmitting the name of the data item and the specific numerical value of the data item to the database for storage.
Besides scan input, manual input may also be used; this mode is for paper documents whose handwriting is too sloppy to be recognized by character recognition technology. Of course, if the research data is electronic data, the researcher can enter it into the database directly, which is much faster.
Preferably, the obtaining of the corresponding calculation data from the research data comprises:
the items to be filled comprise names and filling areas of the items to be filled;
and matching the name of the item to be filled with the name of the data item, and taking the specific numerical value of the data item corresponding to the name of the successfully matched data item as calculation data.
In addition, an automatic matching module can be provided, whose input is the name of the item to be filled and whose output is the name of one or more data items. Researchers can adjust this module according to actual needs, improving the adaptability of automatic research report generation.
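The source does not specify the matching algorithm of this module, so the sketch below is an assumption: it maps the name of an item to be filled to one or more data-item names by string similarity using the standard library's `difflib`.

```python
from difflib import get_close_matches

def auto_match(item_name, data_item_names, cutoff=0.6):
    """Hypothetical automatic matching module: return up to three data-item
    names ranked by string similarity to the item-to-be-filled name."""
    return get_close_matches(item_name, data_item_names, n=3, cutoff=cutoff)

# Toy usage: a slightly different spelling still matches the right data item.
names = ["average income", "median income", "population"]
print(auto_match("average incomes", names))
```

Exact-match lookup (as in the base embodiment) is the special case `cutoff=1.0` with identical strings; lowering the cutoff trades precision for robustness against naming variation.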
Preferably, the filling the calculation result into the item to be filled includes: and filling the calculation result into the filling area.
Preferably, the performing character recognition on the scanned image to obtain the name of the data item recorded on the paper document and the specific numerical value of the data item includes:
carrying out graying processing on the scanned image to obtain a grayscale image;
carrying out noise reduction processing on the gray level image to obtain a noise reduction image;
carrying out segmentation processing on the noise reduction image to obtain a foreground image only containing a character part;
and performing character recognition on the foreground image by adopting OCR (optical character recognition) technology, so as to obtain the name of the data item recorded on the paper file and the specific numerical value of the data item.
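The four recognition stages above (graying, noise reduction, segmentation, OCR) form a linear pipeline, which can be expressed with pluggable stages. The stage stubs below are illustrative assumptions; a real system would plug in image routines and an OCR engine such as Tesseract.

```python
def recognize(scanned_image, to_gray, denoise, segment, ocr):
    """Run the four recognition stages in order and return the OCR output."""
    gray = to_gray(scanned_image)   # stage 1: graying processing
    clean = denoise(gray)           # stage 2: noise reduction
    foreground = segment(clean)     # stage 3: foreground (text-only) image
    return ocr(foreground)          # stage 4: character recognition

# Identity stubs just demonstrate the data flow through the pipeline.
ident = lambda img: img
result = recognize("scan.png", ident, ident, ident,
                   lambda img: {"data item name": "hypothetical", "value": 123})
print(result)
```

Keeping each stage as a separate callable makes it easy to swap, say, one denoising method for another without touching the rest of the pipeline.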
Preferably, the graying the scanned image to obtain a grayscale image includes:
acquiring a brightness component L of the scanned image in a Lab color model,
processing the luminance component as follows:
where (x, y) denotes a coordinate, L(x, y) denotes the luminance component value of the pixel at coordinate (x, y), aL(x, y) denotes the processed luminance component value of that pixel, and aL denotes the processed luminance component; a1 + a2 = 1, where a1 and a2 denote preset weight parameters; aveL(x, y) denotes the average luminance component of the pixels in the k × k neighborhood of the pixel at (x, y); maL(x, y) denotes the minimum luminance component in that neighborhood; neiL denotes the average luminance component of all pixels of the scanned image in the Lab color model; and delta denotes a control coefficient for controlling L(x, y) to be within a reasonable value range,
converting the aL back to an RGB color model, thereby obtaining an adjusted scanned image;
and carrying out graying processing on the adjusted scanning image to obtain a grayscale image.
Uneven brightness easily arises during scanning. According to the embodiment of the application, the brightness of each pixel point is adjusted adaptively, so that an accurate brightness adjustment can be made for the currently processed pixel point according to the specific conditions of the surrounding pixel points, and dim parts are enhanced. This helps the subsequent grayscale image retain more detail information and improves the accuracy of character recognition.
Preferably, the performing noise reduction processing on the grayscale image to obtain a noise-reduced image includes:
carrying out noise point detection on pixel points in the gray level image to determine noise points;
recording the noise point as c, and calculating the standard deviation fc(c) of the gradients of the pixel points in the e × e neighborhood nei(c) of the noise point c; if fc(c) is smaller than a preset noise reduction threshold, performing noise reduction processing on the noise point c in the following manner:
where ano(c) denotes the processed pixel value of the noise point c, f(g) denotes the pixel value of pixel g in nei(c), and numofnei denotes the total number of pixels in nei(c);
if fc (c) is greater than or equal to a preset noise reduction threshold value, performing noise reduction processing on the noise point c by using the following formula:
where no(c) denotes the pixel value of the noise point c, ano(c) denotes its processed pixel value, nei(c) denotes the set of pixels in the e × e neighborhood of c, gs(d) = mod ∗ d, in which mod denotes the template used for Gaussian filtering of the grayscale image and ∗ denotes convolution, d denotes an element of nei(c), aveno(c) denotes the average pixel value of all elements of nei(c), td(c) denotes the standard deviation of the gradient magnitudes of all pixels in nei(c), fc(c) denotes the standard deviation of the gradients of the pixels in nei(c), di denotes the spatial distance between d and c, and f(d) denotes the pixel value of d.
The degree of variation among the pixel points in nei(c) can be judged preliminarily from the standard deviation of the gradients: if the variation is small, the pixel value of c is replaced by the average pixel value in nei(c), which achieves the noise reduction. When the variation among the pixel points in nei(c) is larger, the surroundings of the noise point are more complex; the relations between the noise point and its surrounding pixel points in gradient magnitude, spatial distance, and other aspects are therefore fully considered, and different pixel points in nei(c) are given different weights so that accurate noise reduction is achieved. Using a Gaussian filter template as the weight gives a better noise reduction effect on Gaussian noise points.
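The two-branch rule can be sketched per noise point as follows. The weighted formula for the complex branch appears only as an image in the source, so the Gaussian distance weighting below is an assumption standing in for it; the smooth branch (plain neighborhood average) follows the text directly.

```python
import math

def denoise_pixel(nei_vals, nei_dists, grad_std, threshold, sigma=1.0):
    """Two-branch noise reduction for one noise point c.
    nei_vals  -- pixel values f(d) of the neighborhood nei(c)
    nei_dists -- spatial distances di between each d and c
    grad_std  -- fc(c), std deviation of gradients in nei(c)"""
    if grad_std < threshold:
        # Branch 1 (smooth neighborhood): replace c with the plain average.
        return sum(nei_vals) / len(nei_vals)
    # Branch 2 (complex neighborhood): hypothetical Gaussian distance
    # weights, so nearer pixels contribute more to the replacement value.
    w = [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in nei_dists]
    return sum(wi * v for wi, v in zip(w, nei_vals)) / sum(w)
```

In a real implementation the weights would also incorporate the gradient-magnitude terms the description mentions; distance alone is used here to keep the sketch short.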
Preferably, performing noise point detection on the pixel points in the grayscale image to determine the noise points includes:
marking the pixel point currently being processed as i, and denoting by Ui the set of pixel points in the r × r neighborhood of pixel point i;
calculating, for each pixel point in Ui, the absolute value of the difference between its gradient magnitude and that of pixel point i:
abv(i, j) = |td(i) - td(j)|, where j denotes a pixel point in Ui, abv(i, j) denotes the absolute value of the gradient-magnitude difference of i and j, and td(i) and td(j) denote the gradient magnitudes of pixel points i and j respectively;
and selecting the largest ma absolute values and summing them to obtain a judgment parameter:
where ad(i) denotes the judgment parameter of i, Uma denotes the set of pixel points corresponding to the largest ma absolute values, b denotes a pixel point in Uma, and td(b) denotes the gradient magnitude of pixel point b;
comparing ad(i) with a preset noise judgment threshold: if ad(i) is greater than the noise judgment threshold, taking pixel point i as a noise point; otherwise, not taking it as a noise point.
In conventional noise point detection, the average gray value in the neighborhood is calculated and compared with the currently processed pixel point to decide whether it is a noise point. With that approach, when several noise points exist in the neighborhood, the average value becomes large and noise points are easily missed; detection based on gradient magnitudes avoids this problem well.
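The detection rule can be sketched directly from the description, assuming the gradient magnitudes td of pixel i and of its neighborhood Ui are precomputed.

```python
def is_noise(td_i, td_neighbors, ma, threshold):
    """Noise test for pixel i: sum the largest `ma` absolute differences
    abv(i, j) = |td(i) - td(j)| over the neighborhood Ui to get the
    judgment parameter ad(i), and flag i as noise if ad(i) > threshold."""
    diffs = sorted((abs(td_i - t) for t in td_neighbors), reverse=True)
    ad = sum(diffs[:ma])
    return ad > threshold

# Toy usage: a pixel whose gradient magnitude differs sharply from most
# neighbors is flagged; a consistent one is not.
print(is_noise(10.0, [1.0, 2.0, 9.0, 10.0], ma=2, threshold=15.0))
```

Because only the ma largest differences are summed, a few other noise points in the neighborhood cannot mask pixel i the way they inflate a plain neighborhood average.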
While embodiments of the invention have been shown and described, it will be understood by those skilled in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
Claims (5)
1. A research report automatic generation method based on deep learning is characterized by comprising the following steps:
s1, acquiring a template of a research report to be generated and corresponding research data;
s2, acquiring the items to be filled in the template;
s3, acquiring a corresponding deep learning neural network model and acquiring corresponding calculation data from the research data based on the item to be filled;
s4, inputting the calculation data into the deep learning neural network model to obtain a calculation result;
s5, filling the calculation result into the item to be filled;
and S6, repeating the steps S2-S5 until all the items to be filled are filled, thereby obtaining a research report.
2. The method according to claim 1, wherein the research data includes the name of a data item and the specific numerical value of the data item, and the research data is stored in the database via scan input, which specifically includes:
scanning a paper file recording the research data to obtain a scanned image;
performing character recognition on the scanned image to obtain the name of the data item recorded on the paper file and the specific numerical value of the data item;
and transmitting the name of the data item and the specific numerical value of the data item to the database for storage.
3. The method for automatically generating a research report based on deep learning according to claim 2, wherein the obtaining of corresponding calculation data from the research data comprises:
the items to be filled comprise names and filling areas of the items to be filled;
and matching the name of the item to be filled with the name of the data item, and taking the specific numerical value of the data item corresponding to the name of the successfully matched data item as calculation data.
4. The method as claimed in claim 3, wherein the filling of the calculation result into the item to be filled comprises: and filling the calculation result into the filling area.
5. The method of claim 3, wherein the performing text recognition on the scanned image to obtain the name of the data item recorded on the paper document and the specific value of the data item comprises:
carrying out graying processing on the scanned image to obtain a grayscale image;
carrying out noise reduction processing on the gray level image to obtain a noise reduction image;
carrying out segmentation processing on the noise reduction image to obtain a foreground image only containing a character part;
and performing character recognition on the foreground image by adopting OCR (optical character recognition) technology, so as to obtain the name of the data item recorded on the paper file and the specific numerical value of the data item.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011441359.4A CN112434508B (en) | 2020-12-10 | 2020-12-10 | Research report automatic generation method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011441359.4A CN112434508B (en) | 2020-12-10 | 2020-12-10 | Research report automatic generation method based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112434508A true CN112434508A (en) | 2021-03-02 |
CN112434508B CN112434508B (en) | 2022-02-01 |
Family
ID=74691542
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011441359.4A Active CN112434508B (en) | 2020-12-10 | 2020-12-10 | Research report automatic generation method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112434508B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108121966A (en) * | 2017-12-21 | 2018-06-05 | 欧浦智网股份有限公司 | A kind of list method for automatically inputting, electronic equipment and storage medium based on OCR technique |
CN109446345A (en) * | 2018-09-26 | 2019-03-08 | 深圳中广核工程设计有限公司 | Nuclear power file verification processing method and system |
CN109800397A (en) * | 2017-11-16 | 2019-05-24 | 北大方正集团有限公司 | Data analysis report automatic generation method, device, computer equipment and medium |
US20190189263A1 (en) * | 2017-12-15 | 2019-06-20 | International Business Machines Corporation | Automated report generation based on cognitive classification of medical images |
CN110503403A (en) * | 2019-08-27 | 2019-11-26 | 陕西蓝图司法鉴定中心 | Analysis and identification intelligent automation system and method for degree of injury |
CN111986182A (en) * | 2020-08-25 | 2020-11-24 | 卫宁健康科技集团股份有限公司 | Auxiliary diagnosis method, system, electronic device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN112434508B (en) | 2022-02-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||