CN112434508A - Research report automatic generation method based on deep learning - Google Patents

Research report automatic generation method based on deep learning Download PDF

Info

Publication number
CN112434508A
CN112434508A (application CN202011441359.4A)
Authority
CN
China
Prior art keywords
data
filled
research
data item
name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011441359.4A
Other languages
Chinese (zh)
Other versions
CN112434508B (en)
Inventor
黄冬虹
刘谢慧
赵彤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingyan Lingzhi Information Consulting Beijing Co ltd
Original Assignee
Qingyan Lingzhi Information Consulting Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingyan Lingzhi Information Consulting Beijing Co ltd filed Critical Qingyan Lingzhi Information Consulting Beijing Co ltd
Priority to CN202011441359.4A priority Critical patent/CN112434508B/en
Publication of CN112434508A publication Critical patent/CN112434508A/en
Application granted granted Critical
Publication of CN112434508B publication Critical patent/CN112434508B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/10 - Text processing
    • G06F40/166 - Editing, e.g. inserting or deleting
    • G06F40/186 - Templates
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/70 - Denoising; Smoothing
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/14 - Image acquisition
    • G06V30/148 - Segmentation of character regions
    • G06V30/153 - Segmentation of character regions using recognition of characters or words
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Character Input (AREA)

Abstract

The invention provides a research report automatic generation method based on deep learning, which comprises the following steps: S1, acquiring a template of a research report to be generated and the corresponding research data; S2, acquiring an item to be filled in the template; S3, based on the item to be filled, acquiring the corresponding deep learning neural network model and acquiring the corresponding calculation data from the research data; S4, inputting the calculation data into the deep learning neural network model to obtain a calculation result; S5, filling the calculation result into the item to be filled; and S6, repeating steps S2-S5 until all items to be filled are filled, thereby obtaining the research report. Compared with the prior art, the research report of the application is generated faster and with correspondingly higher accuracy: manual entry and calculation are prone to input errors, and obtaining the data directly from the database avoids such errors to a great extent.

Description

Research report automatic generation method based on deep learning
Technical Field
The invention relates to the field of report generation, in particular to a research report automatic generation method based on deep learning.
Background
At present, research reports are used in many industries, and within the same industry the framework of a research report is largely the same; the differences lie mainly in the data and in the results calculated from the data. Demand for the various types of research reports keeps increasing, and relying on manually analyzing the research data and then writing the report cannot produce the required reports quickly.
Disclosure of Invention
In view of the above problems, an object of the present invention is to provide an automatic generation method of a study report based on deep learning.
The invention provides a research report automatic generation method based on deep learning, which comprises the following steps:
s1, acquiring a template of a research report to be generated and corresponding research data;
s2, acquiring the items to be filled in the template;
s3, acquiring a corresponding deep learning neural network model and acquiring corresponding calculation data from the research data based on the project to be filled;
s4, inputting the calculation data into the deep learning neural network model to obtain a calculation result;
s5, filling the calculation result into the item to be filled;
and S6, repeating the steps S2-S5 until all the items to be filled are filled, thereby obtaining a research report.
Preferably, the research data include the name of a data item and the specific numerical value of the data item, and the research data are entered into the database by scanned input, which specifically includes:
scanning a paper file recording the research data to obtain a scanned image;
performing character recognition on the scanned image to obtain the name of the data item recorded on the paper file and the specific numerical value of the data item;
and transmitting the name of the data item and the specific numerical value of the data item to the database for storage.
Preferably, the obtaining of the corresponding calculation data from the research data comprises:
the items to be filled comprise names and filling areas of the items to be filled;
and matching the name of the item to be filled with the name of the data item, and taking the specific numerical value of the data item corresponding to the name of the successfully matched data item as calculation data.
Preferably, the filling the calculation result into the item to be filled includes: and filling the calculation result into the filling area.
Preferably, the performing character recognition on the scanned image to obtain the name of the data item recorded on the paper document and the specific numerical value of the data item includes:
carrying out graying processing on the scanned image to obtain a grayscale image;
carrying out noise reduction processing on the gray level image to obtain a noise reduction image;
carrying out segmentation processing on the noise reduction image to obtain a foreground image only containing a character part;
and performing character recognition on the foreground image by adopting an OCR character recognition technology, so as to obtain the name of the data item recorded on the paper file and the specific numerical value of the data item.
Compared with the prior art, the invention has the advantages that:
Compared with manual writing of research reports, the research report of the application is generated faster and with correspondingly higher accuracy: manual entry and calculation are prone to input errors, while obtaining the data directly from the database avoids such errors to a great extent. Moreover, manual calculation is costly and inefficient, and cannot produce the required research report quickly.
Drawings
The invention is further illustrated by means of the attached drawings, but the embodiments in the drawings do not constitute any limitation to the invention, and for a person skilled in the art, other drawings can be obtained on the basis of the following drawings without inventive effort.
Fig. 1 is a diagram illustrating an exemplary embodiment of a method for automatically generating a research report based on deep learning according to the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
Referring to fig. 1, the present invention provides an automatic generation method of a research report based on deep learning, which includes:
s1, acquiring a template of a research report to be generated and corresponding research data;
s2, acquiring the items to be filled in the template;
s3, acquiring a corresponding deep learning neural network model and acquiring corresponding calculation data from the research data based on the project to be filled;
s4, inputting the calculation data into the deep learning neural network model to obtain a calculation result;
s5, filling the calculation result into the item to be filled;
and S6, repeating the steps S2-S5 until all the items to be filled are filled, thereby obtaining a research report.
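The loop of steps S1 to S6 can be sketched as follows. The interfaces used here (a template mapping item names to fill areas, and a model-lookup callable named `get_model_for`) are illustrative assumptions; the patent does not prescribe any particular API.

```python
def generate_report(template, research_data, get_model_for):
    """Fill every item of a report template (steps S2-S6).

    `template` maps each item name to its fill area; `research_data`
    maps data-item names to values; `get_model_for` returns the
    deep-learning model (any callable) associated with an item name.
    """
    report = {}
    for name, fill_area in template.items():  # S2: next item to fill
        model = get_model_for(name)           # S3: model for this item
        calc_data = research_data[name]       # S3: matching calculation data
        result = model(calc_data)             # S4: run the model
        report[fill_area] = result            # S5: fill the area
    return report                             # S6: all items filled
```

A usage example would pair a one-item template with a simple averaging "model"; any trained network exposing a `__call__` could be substituted.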
The template of the research report is a template generally used in the industry, i.e. its structure is basically fixed; only the numerical values to be filled into the different items of the template differ.
Research data are collected by researchers, and the researchers can carry out preliminary screening on the research data and remove error data which obviously exceed a normal range so as to enhance the accuracy of generation of subsequent research reports.
Preferably, the research data include the name of a data item and the specific numerical value of the data item, and the research data are entered into the database by scanned input, which specifically includes:
scanning a paper file recording the research data to obtain a scanned image;
performing character recognition on the scanned image to obtain the name of the data item recorded on the paper file and the specific numerical value of the data item;
and transmitting the name of the data item and the specific numerical value of the data item to the database for storage.
Besides scanned input, a manual input mode can also be used; this mode is directed at paper files whose handwriting is too sloppy to be recognized by the character recognition technology. Of course, if the research data are electronic data, researchers can enter them into the database directly, which is much faster.
Preferably, the obtaining of the corresponding calculation data from the research data comprises:
the items to be filled comprise names and filling areas of the items to be filled;
and matching the name of the item to be filled with the name of the data item, and taking the specific numerical value of the data item corresponding to the name of the successfully matched data item as calculation data.
In addition, an automatic matching module can be provided, whose input is the name of the item to be filled and whose output is the name of one or more data items. Researchers can adjust this module according to actual needs, improving the adaptability of automatic research report generation.
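The name-matching step, with the adjustable matching module stood in by a simple alias table, can be sketched as follows; the `aliases` parameter is a hypothetical name introduced for illustration, not part of the patent.

```python
def match_calc_data(item_name, research_data, aliases=None):
    """Return the data-item values matched to an item to be filled.

    An exact name match is tried first; otherwise the alias table
    (item name -> list of data-item names), standing in for the
    adjustable automatic matching module, is consulted.
    """
    if item_name in research_data:
        return [research_data[item_name]]
    candidate_names = (aliases or {}).get(item_name, [])
    return [research_data[n] for n in candidate_names if n in research_data]
```

Returning a list allows the matching module to map one item to several data items, as the text above permits.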
Preferably, the filling the calculation result into the item to be filled includes: and filling the calculation result into the filling area.
Preferably, the performing character recognition on the scanned image to obtain the name of the data item recorded on the paper document and the specific numerical value of the data item includes:
carrying out graying processing on the scanned image to obtain a grayscale image;
carrying out noise reduction processing on the gray level image to obtain a noise reduction image;
carrying out segmentation processing on the noise reduction image to obtain a foreground image only containing a character part;
and performing character recognition on the foreground image by adopting an OCR character recognition technology, so as to obtain the name of the data item recorded on the paper file and the specific numerical value of the data item.
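The four-step recognition pipeline above can be sketched as follows. Simple stand-ins (luminosity grayscale conversion, mean-filter denoising, fixed-threshold segmentation) replace the adaptive variants detailed in the sections that follow; the resulting foreground mask would then be handed to an OCR engine such as Tesseract, which is not invoked here.

```python
import numpy as np

def preprocess_for_ocr(rgb, noise_kernel=3, fg_threshold=128):
    """Grayscale -> denoise -> segment, mirroring the steps above."""
    # graying: weighted luminosity conversion of the H x W x 3 image
    gray = rgb @ np.array([0.299, 0.587, 0.114])
    # noise reduction: k x k mean filter, edge-padded
    k = noise_kernel // 2
    padded = np.pad(gray, k, mode="edge")
    denoised = np.zeros_like(gray)
    h, w = gray.shape
    for i in range(h):
        for j in range(w):
            denoised[i, j] = padded[i:i + noise_kernel, j:j + noise_kernel].mean()
    # segmentation: dark pixels are kept as the character foreground
    foreground = denoised < fg_threshold
    return denoised, foreground
```

The threshold and kernel defaults are assumed example values; the patent's own adaptive graying and noise reduction are described next.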
Preferably, the graying the scanned image to obtain a grayscale image includes:
acquiring a brightness component L of the scanned image in a Lab color model,
processing the luminance component as follows:
(the formula is given only as an image in the source and is not reproduced here)
wherein (x, y) represents a coordinate; L(x, y) represents the luminance component value of the pixel at (x, y); aL(x, y) represents the processed luminance component value of that pixel, and aL the processed luminance component; a1 and a2 represent preset weight parameters with a1 + a2 = 1; aveL(x, y) represents the average luminance component of the pixels in the k × k neighborhood of the pixel at (x, y), and maL(x, y) the minimum luminance component in that neighborhood; neiL represents the average luminance component of all pixels of the scanned image in the Lab color model; and delta represents a control coefficient that keeps L(x, y) within a reasonable value range,
converting the aL back to an RGB color model, thereby obtaining an adjusted scanned image;
and carrying out graying processing on the adjusted scanning image to obtain a grayscale image.
Uneven brightness easily arises during scanning. According to this embodiment of the application, the brightness of each pixel is adjusted adaptively: the currently processed pixel is brightness-corrected accurately according to the specific conditions of the surrounding pixels, dim parts are enhanced, more detail information is retained for the subsequent grayscale image, and the accuracy of character recognition is improved.
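Since the patent's exact luminance formula is reproduced only as an image, the sketch below merely illustrates its stated ingredients: a weighted blend (a1 + a2 = 1) of a pixel's own luminance with its k × k neighborhood mean aveL, scaled against the global mean neiL with a control coefficient delta so that locally dim regions are lifted. The combination rule and all numeric defaults are assumptions, not the patented formula.

```python
import numpy as np

def adjust_luminance(L, a1=0.7, k=3, delta=1.0):
    """Illustrative adaptive luminance adjustment (not the patent's formula)."""
    a2 = 1.0 - a1
    pad = k // 2
    padded = np.pad(np.asarray(L, dtype=float), pad, mode="edge")
    h, w = L.shape
    out = np.empty((h, w))
    neiL = float(np.mean(L))  # global mean luminance
    for i in range(h):
        for j in range(w):
            aveL = padded[i:i + k, j:j + k].mean()  # local mean luminance
            blended = a1 * padded[i + pad, j + pad] + a2 * aveL
            # scale against the global mean: locally dark regions get a
            # multiplier > 1, so dim parts are brightened
            out[i, j] = blended * neiL / (aveL + delta)
    return np.clip(out, 0.0, 255.0)
```

On a uniform image the multiplier is close to 1; a dark pixel inside a bright page is boosted, matching the behavior the paragraph above describes.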
Preferably, the performing noise reduction processing on the grayscale image to obtain a noise-reduced image includes:
carrying out noise point detection on pixel points in the gray level image to determine noise points;
recording the noise point as c, calculating a standard deviation fc (c) of gradients of pixel points in a neighborhood nei (c) with the size of e × e of the noise point c, and if fc (c) is smaller than a preset noise reduction threshold, performing noise reduction processing on the noise point c by adopting the following mode:
ano(c) = (1 / numofnei) · Σ f(g), summed over the pixels g in nei(c)
wherein ano(c) represents the processed pixel value of the noise point c, f(g) represents the pixel value of a pixel g in nei(c), and numofnei represents the total number of pixels in nei(c);
if fc (c) is greater than or equal to a preset noise reduction threshold value, performing noise reduction processing on the noise point c by using the following formula:
(the formula is given only as an image in the source and is not reproduced here)
wherein no(c) represents the pixel value of the noise point c and ano(c) its processed pixel value; nei(c) represents the set of pixels in the e × e neighborhood of c, d represents an element of nei(c), and f(d) the pixel value of d; gs(d) = mod ∗ d, where mod represents the template used for Gaussian filtering of the grayscale image and ∗ the convolution sign; aveno(c) represents the average pixel value of all elements in nei(c); td(c) represents the standard deviation of the gradient amplitudes of all pixels in nei(c), fc(c) the standard deviation of the pixels in nei(c), and di the spatial distance between d and c.
By calculating the standard deviation of the gradients, the degree of variation among the pixels in nei(c) can be judged preliminarily. If the variation is small, the pixel value of c is replaced by the average pixel value in nei(c), achieving the noise reduction. When the variation among the pixels in nei(c) is larger, the surroundings of the noise point are more complex; the relations between the noise point and the surrounding pixels in gradient amplitude, spatial distance and other respects are therefore fully considered, so that different pixels in nei(c) receive different weights and accurate noise reduction is realized. Using a Gaussian filter template as the weight also gives a better noise reduction effect on Gaussian noise points.
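The two-branch correction can be sketched as follows. The smooth-neighborhood branch (mean replacement) follows the text directly; because the weighted formula for the complex branch is given only as an image, a plain Gaussian spatial weighting stands in for it here, and the gradient stand-in and threshold default are likewise assumptions.

```python
import numpy as np

def denoise_point(img, c, e=3, fc_threshold=10.0):
    """Correct one detected noise point c = (row, col), assumed interior."""
    y, x = c
    pad = e // 2
    nei = img[y - pad:y + pad + 1, x - pad:x + pad + 1].astype(float)
    # gradient stand-in: deviations from the neighborhood mean
    grads = np.abs(nei - nei.mean())
    if grads.std() < fc_threshold:
        # smooth neighborhood: replace c with the plain neighborhood mean
        return nei.mean()
    # complex neighborhood: Gaussian spatial weighting, nearer pixels
    # count more; the noise point itself is excluded from the average
    yy, xx = np.mgrid[y - pad:y + pad + 1, x - pad:x + pad + 1]
    w = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / 2.0)
    w[pad, pad] = 0.0
    return float((w * nei).sum() / w.sum())
```

The branch choice reproduces the logic above: low gradient spread means a flat area where the mean suffices, high spread triggers the distance-weighted replacement.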
Preferably, the noise point detection is performed on the pixel points in the gray image, and determining the noise point includes:
marking the pixel point currently being processed as i, and, for pixel point i, marking the set of pixel points in its r × r neighborhood as Ui;
calculating, for each pixel point in Ui, the absolute value of the difference between its gradient amplitude and that of pixel point i:
abv(i, j) = |td(i) − td(j)|, where j represents one pixel point in Ui; abv(i, j) represents the absolute value of the difference in gradient magnitude between i and j; td(i) and td(j) respectively represent the gradient magnitudes of pixel points i and j;
and selecting the first ma largest absolute values for summation to obtain a judgment parameter:
ad(i) = Σ abv(i, b), summed over the pixel points b in Uma
wherein ad(i) represents the judgment parameter of i; Uma represents the set of pixel points corresponding to the first ma largest absolute values; b represents an element of Uma; and td(b) represents the gradient amplitude of pixel point b;
comparing ad(i) with a preset noise judgment threshold: if ad(i) is greater than the noise judgment threshold, pixel point i is taken as a noise point; otherwise it is not taken as a noise point.
In conventional noise point detection, the average gray value in the neighborhood is calculated and compared with the currently processed pixel point to decide whether that pixel point is a noise point. With this processing mode, when several noise points exist in the neighborhood, the average value is skewed and noise points are easily missed; detection based on the gradient amplitude, as above, avoids this problem well.
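The gradient-amplitude detection described above can be sketched as follows. Central differences stand in for the unspecified gradient operator, and the `ma` and threshold defaults are assumed example values; border pixels are simply skipped.

```python
import numpy as np

def detect_noise(img, r=3, ma=4, threshold=100.0):
    """Flag noise points via sums of gradient-magnitude differences."""
    f = img.astype(float)
    gy, gx = np.gradient(f)
    td = np.hypot(gx, gy)  # gradient magnitude td(i) per pixel
    pad = r // 2
    h, w = img.shape
    noise = np.zeros((h, w), dtype=bool)
    for i in range(pad, h - pad):
        for j in range(pad, w - pad):
            nei = td[i - pad:i + pad + 1, j - pad:j + pad + 1].ravel()
            abv = np.abs(nei - td[i, j])      # |td(i) - td(j)| over U_i
            ad = np.sort(abv)[-ma:].sum()     # sum of the ma largest values
            noise[i, j] = ad > threshold      # judgment parameter vs threshold
    return noise
```

Unlike a neighborhood-mean test, a single bright spike leaves its own gradient near zero but inflates its neighbors' magnitudes, so the difference sum still flags it.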
While embodiments of the invention have been shown and described, it will be understood by those skilled in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (5)

1. A research report automatic generation method based on deep learning is characterized by comprising the following steps:
s1, acquiring a template of a research report to be generated and corresponding research data;
s2, acquiring the items to be filled in the template;
s3, acquiring a corresponding deep learning neural network model and acquiring corresponding calculation data from the research data based on the project to be filled;
s4, inputting the calculation data into the deep learning neural network model to obtain a calculation result;
s5, filling the calculation result into the item to be filled;
and S6, repeating the steps S2-S5 until all the items to be filled are filled, thereby obtaining a research report.
2. The method according to claim 1, wherein the research data include the name of a data item and the specific numerical value of the data item, and the research data are entered into the database by scanned input, which specifically includes:
scanning a paper file recording the research data to obtain a scanned image;
performing character recognition on the scanned image to obtain the name of the data item recorded on the paper file and the specific numerical value of the data item;
and transmitting the name of the data item and the specific numerical value of the data item to the database for storage.
3. The method for automatically generating a research report based on deep learning according to claim 2, wherein the obtaining of corresponding calculation data from the research data comprises:
the items to be filled comprise names and filling areas of the items to be filled;
and matching the name of the item to be filled with the name of the data item, and taking the specific numerical value of the data item corresponding to the name of the successfully matched data item as calculation data.
4. The method as claimed in claim 3, wherein the filling of the calculation result into the item to be filled comprises: and filling the calculation result into the filling area.
5. The method of claim 3, wherein the performing text recognition on the scanned image to obtain the name of the data item recorded on the paper document and the specific value of the data item comprises:
carrying out graying processing on the scanned image to obtain a grayscale image;
carrying out noise reduction processing on the gray level image to obtain a noise reduction image;
carrying out segmentation processing on the noise reduction image to obtain a foreground image only containing a character part;
and performing character recognition on the foreground image by adopting an OCR character recognition technology, so as to obtain the name of the data item recorded on the paper file and the specific numerical value of the data item.
CN202011441359.4A 2020-12-10 2020-12-10 Research report automatic generation method based on deep learning Active CN112434508B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011441359.4A CN112434508B (en) 2020-12-10 2020-12-10 Research report automatic generation method based on deep learning


Publications (2)

Publication Number Publication Date
CN112434508A true CN112434508A (en) 2021-03-02
CN112434508B CN112434508B (en) 2022-02-01

Family

ID=74691542

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011441359.4A Active CN112434508B (en) 2020-12-10 2020-12-10 Research report automatic generation method based on deep learning

Country Status (1)

Country Link
CN (1) CN112434508B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108121966A (en) * 2017-12-21 2018-06-05 欧浦智网股份有限公司 A kind of list method for automatically inputting, electronic equipment and storage medium based on OCR technique
CN109446345A (en) * 2018-09-26 2019-03-08 深圳中广核工程设计有限公司 Nuclear power file verification processing method and system
CN109800397A (en) * 2017-11-16 2019-05-24 北大方正集团有限公司 Data analysis report automatic generation method, device, computer equipment and medium
US20190189263A1 (en) * 2017-12-15 2019-06-20 International Business Machines Corporation Automated report generation based on cognitive classification of medical images
CN110503403A (en) * 2019-08-27 2019-11-26 陕西蓝图司法鉴定中心 Analysis and identification intelligent automation system and method for degree of injury
CN111986182A (en) * 2020-08-25 2020-11-24 卫宁健康科技集团股份有限公司 Auxiliary diagnosis method, system, electronic device and storage medium


Also Published As

Publication number Publication date
CN112434508B (en) 2022-02-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant