CN115565193A - Questionnaire information input method and device, electronic equipment and storage medium - Google Patents

Questionnaire information input method and device, electronic equipment and storage medium

Info

Publication number
CN115565193A
Authority
CN
China
Prior art keywords
questionnaire
topic
picture
pictures
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211279363.4A
Other languages
Chinese (zh)
Inventor
廖瑞勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Agricultural Bank of China
Original Assignee
Agricultural Bank of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agricultural Bank of China
Priority to CN202211279363.4A
Publication of CN115565193A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/414 Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0203 Market surveys; Market polls
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/1444 Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/19007 Matching; Proximity measures
    • G06V30/19013 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching

Abstract

The embodiment of the application discloses a questionnaire information input method, a questionnaire information input device, electronic equipment and a storage medium. The method comprises the following steps: scanning a questionnaire to be input to obtain a questionnaire picture of the questionnaire to be input, and preprocessing the questionnaire picture according to a preprocessing algorithm; identifying questionnaire titles in the preprocessed questionnaire pictures, and acquiring questionnaire characteristic identification templates corresponding to the questionnaire titles; based on a questionnaire feature recognition template, performing question dimension segmentation on the preprocessed questionnaire picture to obtain question pictures of all questions in the questionnaire picture and question element pictures in all the question pictures; for any topic element picture, recognizing characters in the topic element picture by utilizing a pre-trained OCR character recognition model; writing back the recognized characters to obtain the input questionnaire information; and storing the questionnaire information into a database to complete the input of the questionnaire to be input. The questionnaire input method and device can effectively improve questionnaire input efficiency and accuracy.

Description

Questionnaire information input method and device, electronic equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of information entry, in particular to a questionnaire information entry method and device, an electronic device and a storage medium.
Background
Currently, a survey, usually in the form of a questionnaire, is required before some significant decisions are made. The medium carrying the questionnaire differs according to the survey objects: for young people, the questionnaire is usually issued as a network link; for the elderly who are not familiar with smart phones and the network, the questionnaire is usually issued on paper.
After the paper questionnaires are collected, in order to facilitate the subsequent statistics and analysis of the answers, the answering content of each questionnaire is usually entered into a database and stored as data. At present, the answering content is generally entered manually.
However, for large-scale paper questionnaires, manual entry is inefficient, and its efficiency can only be improved by increasing the labor cost. The accuracy of manual entry is also low, which may affect the statistics and analysis and distort the survey results.
Disclosure of Invention
The embodiment of the application provides a questionnaire information input method and device, electronic equipment and a storage medium, so as to improve the efficiency and accuracy of questionnaire input.
In a first aspect, an embodiment of the present application provides a questionnaire information entry method, where the method includes:
scanning a questionnaire to be input to obtain a questionnaire picture of the questionnaire to be input, and preprocessing the questionnaire picture according to a preprocessing algorithm;
identifying a questionnaire title in the preprocessed questionnaire picture, and acquiring a questionnaire feature identification template corresponding to the questionnaire title;
based on the questionnaire feature identification template, performing topic dimension segmentation on the preprocessed questionnaire picture to obtain topic pictures of all topics in the questionnaire picture and topic element pictures in all the topic pictures;
for any one topic element picture, recognizing characters in the topic element picture by utilizing a pre-trained OCR character recognition model;
writing back the identified characters according to the position relationship among characters in the topic element pictures, the position relationship among the topic element pictures in the topic picture and the position relationship among the topic pictures in the questionnaire picture to obtain the input questionnaire information;
and storing the questionnaire information into a database to complete the input of the questionnaire to be input.
In a second aspect, an embodiment of the present application further provides a questionnaire information entry device, where the questionnaire information entry device includes:
the scanning processing module is used for scanning the questionnaire to be input to obtain a questionnaire picture of the questionnaire to be input, and preprocessing the questionnaire picture according to a preprocessing algorithm;
the template acquisition module is used for identifying questionnaire titles in the preprocessed questionnaire pictures and acquiring questionnaire feature identification templates corresponding to the questionnaire titles;
the topic segmentation module is used for performing topic dimension segmentation on the preprocessed questionnaire picture based on the questionnaire feature identification template to obtain a topic picture of each topic in the questionnaire picture and a topic element picture in each topic picture;
the character recognition module is used for recognizing characters in the topic element pictures by utilizing a pre-trained OCR character recognition model for any topic element picture;
the character write-back module is used for writing back the identified characters according to the position relationship among the characters in the topic element pictures, the position relationship among the topic element pictures in the topic pictures and the position relationship among the topic pictures in the questionnaire pictures to obtain the input questionnaire information;
and the storage module is used for storing the questionnaire information into a database so as to complete the input of the questionnaire to be input.
In a third aspect, an embodiment of the present application further provides an electronic device, where the electronic device includes:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the questionnaire information entry method provided in any embodiment of the application.
In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the questionnaire information entry method provided in any embodiment of the present application.
According to the technical scheme of the embodiment of the application, a questionnaire to be input is scanned to obtain a questionnaire picture of the questionnaire to be input, and the questionnaire picture is preprocessed according to a preprocessing algorithm; a questionnaire title in the preprocessed questionnaire picture is identified, and a questionnaire feature identification template corresponding to the questionnaire title is acquired; based on the questionnaire feature identification template, topic dimension segmentation is performed on the preprocessed questionnaire picture to obtain topic pictures of all topics in the questionnaire picture and topic element pictures in all the topic pictures; for any one of the topic element pictures, characters in the topic element picture are recognized by utilizing a pre-trained OCR character recognition model; the recognized characters are written back according to the position relationship among characters in the topic element pictures, the position relationship among the topic element pictures in the topic pictures and the position relationship among the topic pictures in the questionnaire picture to obtain the input questionnaire information; and the questionnaire information is stored into a database to complete the input of the questionnaire to be input. In this way, the questionnaire is cut into topics by using the template, the characters in each topic are recognized by using the OCR character recognition model, and the recognized characters are finally written back to obtain the questionnaire information, which improves the efficiency and accuracy of questionnaire input.
Drawings
Fig. 1 is a schematic flowchart of a questionnaire information entry method provided in an embodiment of the present application;
Fig. 2 is a schematic diagram of a topic feature template of a single-choice topic provided in an embodiment of the present application;
Fig. 3 is a schematic diagram of a topic feature template of a text topic provided in an embodiment of the present application;
Fig. 4 is a schematic diagram of a topic feature template of a matrix topic provided in an embodiment of the present application;
Fig. 5 is a schematic diagram of a questionnaire feature recognition template provided in an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a questionnaire information entry device provided in the second embodiment of the present application;
Fig. 7 is a schematic structural diagram of an electronic device provided in the third embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and embodiments. It is to be understood that the specific embodiments described herein merely illustrate the application and do not limit it. It should be further noted that, for convenience of description, the drawings show only the structures related to the present application rather than all structures.
Embodiment one
Fig. 1 is a schematic flowchart of a questionnaire information entry method provided in an embodiment of the present application, and this embodiment is applicable to a questionnaire information entry scenario. The method can be executed by a questionnaire information entry device, which can be implemented in hardware and/or software, and can be generally integrated in an electronic device such as a computer with data operation capability, and specifically comprises the following steps:
Step 101, scanning a questionnaire to be input to obtain a questionnaire picture of the questionnaire to be input, and preprocessing the questionnaire picture according to a preprocessing algorithm.
It should be noted that the scanning in this step may be performed in any of several ways: using a device with a scanner function to obtain the questionnaire picture; using an intelligent terminal device with a camera to photograph the questionnaire; or installing scanning software on such a terminal device, photographing the questionnaire to be entered, and then processing the photograph with the scanning software to obtain a questionnaire picture with an approximate scanning effect.
In addition, when the preprocessing algorithm is applied, boundary detection can first be carried out on the questionnaire picture to identify the questionnaire boundary, and the questionnaire picture is cut based on that boundary; global noise reduction is then carried out on the cut questionnaire picture, followed by binarization.
The questionnaire picture obtained in this way contains only the contents of the questionnaire and no invalid content outside the questionnaire boundary; it is low in noise, black and white as a whole, and high in contrast, which is convenient for subsequent identification.
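The preprocessing order described above (crop to the questionnaire boundary, global noise reduction, binarization) can be sketched in plain Python. This is only an illustrative sketch: the source does not name specific algorithms, so the bright-pixel bounding box, the 3x3 median filter, the fixed threshold of 128 and all function names here are assumptions.

```python
from statistics import median


def crop_to_paper(img, paper_min=128):
    """Crop to the bounding box of bright (paper) pixels.

    img is a list of rows of gray levels, 0 = black and 255 = white.
    Assumes the paper is brighter than the background around it.
    """
    rows = [y for y, row in enumerate(img) if any(p >= paper_min for p in row)]
    cols = [x for x in range(len(img[0])) if any(row[x] >= paper_min for row in img)]
    if not rows or not cols:
        return img  # no paper found, leave the picture unchanged
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


def median_denoise(img):
    """Replace every pixel by the median of its 3x3 neighbourhood."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(int(median(window)))
        out.append(row)
    return out


def binarize(img, threshold=128):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[0 if p < threshold else 255 for p in row] for row in img]


def preprocess(img):
    """Crop, then denoise, then binarize, in the order given in the text."""
    return binarize(median_denoise(crop_to_paper(img)))
```

With this sketch, an isolated dark speck on the paper is removed by the median filter before thresholding, so it does not survive as a false inking point.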
Step 102, identifying the questionnaire title in the preprocessed questionnaire picture, and acquiring the questionnaire feature identification template corresponding to the questionnaire title.
In this step, the questionnaire title refers to the characters in the head-up (header) area of the questionnaire; a title is usually placed in the head-up area so that different questionnaires can be distinguished. It should be noted that the position of the head-up area is usually fixed, and is generally the area between the top end of the questionnaire and the position at a preset vertical distance below the top end.
Therefore, in this step, the head-up area picture of the questionnaire picture can be obtained according to the size information of the preprocessed questionnaire picture. Specifically, the size information of the questionnaire picture is measured, and may include a questionnaire length and a questionnaire width. Then, the prestored standard size information of the questionnaire is obtained, which may likewise include a standard questionnaire length and a standard questionnaire width. The ratio of the standard length to the measured length and the ratio of the standard width to the measured width are calculated, and the average of the two ratios is taken as the scaling ratio of the questionnaire picture.
After the scaling ratio is obtained, the size information of the questionnaire picture is multiplied by the scaling ratio to obtain the size of the scaled questionnaire picture. The preset vertical distance from the top end of the questionnaire is then obtained, and the scaled questionnaire picture is cut according to this vertical distance to obtain the head-up area picture between the top end of the questionnaire and the position at the preset vertical distance from the top end.
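The ratio calculation and the head-up area crop can be illustrated with a short arithmetic sketch. The function names, the (left, top, right, bottom) box convention and the rounding to whole pixels are assumptions made for illustration; only the averaging of the two ratios and the crop at a preset vertical distance come from the text.

```python
def scaling_ratio(width, height, std_width, std_height):
    """Average of the width ratio and the height ratio (standard / measured)."""
    return (std_width / width + std_height / height) / 2


def head_up_box(width, height, std_width, std_height, header_height):
    """Return the (left, top, right, bottom) box of the head-up area
    in the scaled picture. header_height is the preset vertical
    distance from the top end, expressed in standard-size pixels."""
    ratio = scaling_ratio(width, height, std_width, std_height)
    scaled_w = round(width * ratio)
    scaled_h = round(height * ratio)
    # The head-up area spans the full width, from the top end down to
    # the preset distance (never beyond the picture itself).
    return (0, 0, scaled_w, min(header_height, scaled_h))
```

For example, a 1000 x 1400 picture with a 2000 x 2800 standard size gives a ratio of 2.0, so a preset distance of 300 yields the box (0, 0, 2000, 300).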
After the head-up area picture is obtained, the characters in the head-up area picture can be recognized by utilizing a pre-trained OCR character recognition model to obtain the questionnaire title in the questionnaire picture. Specifically, the characters in the head-up area picture may be cut apart according to the gaps between them. For example, the head-up area picture is scanned from left to right; for any two characters, if the distance between the last inking point of the previous character and the first inking point of the next character is greater than or equal to a preset threshold, the two can be regarded as separate, consecutive characters, and the cut is made between the last inking point of the previous character and the first inking point of the next character.
It should be noted that after the questionnaire picture is subjected to binarization processing, the background is usually white, and the text is usually in a color with a gray level different from 0, such as black, dark gray or light gray. The questionnaire picture is composed of pixel points, and each character is likewise composed of pixel points with a gray level different from 0, so each pixel point of a character can be regarded as an inking point.
Since characters are usually written from left to right, when a character is scanned from left to right, one or more inking points are scanned first and one or more inking points are scanned last, after which only white is scanned until the next character appears. The gap between characters can be identified in this way, and the picture is then cut along the gaps to separate the characters.
After the picture of each character is obtained, the pictures are sequentially input into the pre-trained OCR character recognition model in character order for recognition, and the recognized characters are combined into the questionnaire title in the same order. It should be noted that the character order refers to the order in which the characters are encountered during scanning.
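The gap-based cutting can be sketched as a left-to-right column scan. This is a minimal sketch under stated assumptions: the picture is represented as rows of 0/1 values where 1 marks an inking point, the gap is measured as the number of blank columns between two inked columns, and the names `split_characters`, `crop_columns` and `min_gap` are hypothetical.

```python
def split_characters(ink, min_gap=2):
    """Return (start, end) column spans, end exclusive, one per character.

    ink is a list of rows; a truthy value marks an inking point.
    A new character starts when at least min_gap blank columns separate
    the last inked column of one character from the next inked column.
    """
    if not ink or not ink[0]:
        return []
    width = len(ink[0])
    has_ink = [any(row[x] for row in ink) for x in range(width)]
    spans = []
    start = None  # first inked column of the current character
    last = None   # most recent inked column seen so far
    for x, inked in enumerate(has_ink):
        if not inked:
            continue
        if start is None:
            start = x
        elif x - last - 1 >= min_gap:
            spans.append((start, last + 1))  # cut in the gap
            start = x
        last = x
    if start is not None:
        spans.append((start, last + 1))
    return spans


def crop_columns(ink, span):
    """Cut one character picture out of the line picture."""
    start, end = span
    return [row[start:end] for row in ink]
```

Each span, cropped with `crop_columns`, would then be passed to the recognition model in the order returned, which is the scanning order. A gap narrower than `min_gap` is treated as part of one character, which keeps characters with internal blank columns from being split.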
After the questionnaire titles are determined, prestored questionnaire information corresponding to the questionnaire titles can be searched, and the questionnaire information comprises topic content information and topic position information corresponding to each topic. It should be noted that the questionnaire may include multiple topics, and each topic has a fixed position in the questionnaire, that is, topic position information.
Each topic includes a plurality of topic elements; different types of topics include different types and numbers of topic elements. For example, the topic elements of a choice topic are: question stem, options and answering content; the topic elements of a text topic are: question stem and answering content; the topic elements of a matrix topic are: question stem, row attribute, column attribute and table answering content. Note that these topic elements are included in the topic content information.
Therefore, for any topic in the questionnaire information, a topic feature template of the topic is generated based on the topic content information and the topic position information, as shown in Fig. 2, Fig. 3 and Fig. 4, where Fig. 2 is a schematic diagram of a topic feature template of a single-choice topic provided in an embodiment of the present application, Fig. 3 is a schematic diagram of a topic feature template of a text topic provided in an embodiment of the present application, and Fig. 4 is a schematic diagram of a topic feature template of a matrix topic provided in an embodiment of the present application.
Finally, the generated topic feature templates corresponding to each topic are combined to obtain the questionnaire feature identification template corresponding to the questionnaire title, as shown in Fig. 5, which is a schematic diagram of a questionnaire feature identification template provided in an embodiment of the present application.
It should be noted that the types of topics may include, but are not limited to, single-choice topics, multiple-choice topics, compound single-choice topics, compound multiple-choice topics, text topics, matrix topics, etc., and the foregoing are merely specific examples for convenience of description.
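One way to picture the templates is as small data records: a topic feature template holding the topic type, its position and its elements, and a questionnaire feature identification template holding the topic templates in page order. The class names, field names and the box convention below are hypothetical; only the element lists per topic type are taken from the description.

```python
from dataclasses import dataclass

# Topic elements per topic type, as listed in the description.
ELEMENTS_BY_TYPE = {
    "choice": ["stem", "options", "answer"],
    "text": ["stem", "answer"],
    "matrix": ["stem", "row_attributes", "column_attributes", "table_answer"],
}


@dataclass
class TopicTemplate:
    topic_id: str
    topic_type: str
    box: tuple           # (left, top, right, bottom) in the questionnaire
    element_boxes: dict  # element name -> box relative to the topic box


@dataclass
class QuestionnaireTemplate:
    title: str
    topics: list         # TopicTemplate objects in page order


def build_topic_template(topic_id, topic_type, box, element_boxes):
    """Build one topic template, checking that every element expected
    for this topic type has a position."""
    missing = [n for n in ELEMENTS_BY_TYPE[topic_type] if n not in element_boxes]
    if missing:
        raise ValueError(f"{topic_id}: missing elements {missing}")
    return TopicTemplate(topic_id, topic_type, box, dict(element_boxes))


def build_questionnaire_template(title, topic_templates):
    """Combine the topic templates, ordered top to bottom, left to right."""
    ordered = sorted(topic_templates, key=lambda t: (t.box[1], t.box[0]))
    return QuestionnaireTemplate(title, ordered)
```

Other topic types mentioned in the text, such as multiple-choice or compound topics, could be added as further entries in `ELEMENTS_BY_TYPE`.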
Step 103, performing topic dimension segmentation on the preprocessed questionnaire picture based on the questionnaire feature recognition template to obtain topic pictures of all topics in the questionnaire picture and topic element pictures in all topic pictures.
In this step, the boundary of each question in the questionnaire picture can be identified based on the question feature template of each question; then cutting the questionnaire picture based on the identified boundaries of all questions to obtain question pictures of all the questions in the questionnaire picture; and finally, for any topic, cutting topic elements in the topic picture corresponding to the topic based on the topic feature template corresponding to the topic to obtain the topic element picture in the topic picture corresponding to the topic.
Specifically, the questionnaire feature recognition template generated in step 102 includes the topic feature templates of the topics, and certainly includes the position relationships between the topic feature templates and the respective size information, so that the boundaries of the topics in the questionnaire picture can be determined based on the position relationships and the respective size information.
For the cutting of the topic elements, the topic picture can be cut according to the topic elements contained in the corresponding topic feature template.
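The two-level cutting in this step (questionnaire picture into topic pictures, then each topic picture into topic element pictures) might look like the following sketch. For brevity the template is reduced here to a plain dictionary of boxes; that layout, the key names and the `segment` function are assumptions for illustration.

```python
def crop(img, box):
    """Cut the (left, top, right, bottom) box out of a list-of-rows picture."""
    left, top, right, bottom = box
    return [row[left:right] for row in img[top:bottom]]


def segment(img, template):
    """Cut a questionnaire picture into topic and topic element pictures.

    template maps a topic id to {"box": topic box in the questionnaire,
    "elements": {element name: box relative to the topic picture}}.
    """
    result = {}
    for topic_id, spec in template.items():
        topic_img = crop(img, spec["box"])
        result[topic_id] = {
            "picture": topic_img,
            "elements": {name: crop(topic_img, box)
                         for name, box in spec["elements"].items()},
        }
    return result
```

Element boxes are kept relative to the topic picture so that a topic template can be reused wherever the topic is placed on the page.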
Step 104, for any topic element picture, recognizing characters in the topic element picture by utilizing a pre-trained OCR character recognition model.
In this step, any topic element picture is first divided into single-character pictures; the characters in the character pictures are then recognized in turn by utilizing the pre-trained OCR character recognition model, following the position order of the character pictures in the topic element picture, to obtain the character corresponding to each character picture.
It should be noted that, the text cutting and recognition process may refer to the recognition process for the questionnaire titles, and details are not described here.
Step 105, writing back the identified characters according to the position relationship among the characters in the topic element pictures, the position relationship among the topic element pictures in the topic pictures and the position relationship among the topic pictures in the questionnaire pictures to obtain the input questionnaire information.
In this step, the identified characters can be combined into corresponding topic element information according to the position relationship among the characters in the topic element picture; combining the combined topic element information into corresponding topic information according to the position relation among the topic element pictures in the topic pictures; and combining all question information into the input questionnaire information according to the position relation among all question pictures in the questionnaire pictures.
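The three-level write-back can be sketched as sorting by position at each level and joining the results. The input layout (dictionaries with `pos` and `chars` keys), the convention that positions are (x, y) with y as the line index, and the function names are assumptions; the source only specifies that characters, topic elements and topics are combined according to their positional relationships.

```python
def reading_order(item):
    """Sort key: top to bottom first, then left to right."""
    x, y = item["pos"]
    return (y, x)


def write_back(topics):
    """Rebuild questionnaire information from recognized characters.

    topics: list of {"pos": (x, y), "elements": [...]}; each element is
    {"name": str, "pos": (x, y), "chars": [(x, y, character), ...]}.
    Returns one dict per topic, in page order, mapping each element
    name to its recombined text.
    """
    questionnaire = []
    for topic in sorted(topics, key=reading_order):
        info = {}
        for element in sorted(topic["elements"], key=reading_order):
            # Characters -> topic element information.
            chars = sorted(element["chars"], key=lambda c: (c[1], c[0]))
            info[element["name"]] = "".join(ch for _, _, ch in chars)
        # Topic element information -> topic information.
        questionnaire.append(info)
    # Topic information -> questionnaire information.
    return questionnaire
```

Because every level is ordered by position rather than by recognition order, the result does not depend on the order in which the pictures were processed.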
Step 106, storing the questionnaire information into a database to complete the input of the questionnaire to be input.
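The storage step can be illustrated with Python's built-in `sqlite3` module. The source does not specify a database or schema, so the single `questionnaire` table, the JSON-encoded topic column and the function name are assumptions chosen to keep the sketch short.

```python
import json
import sqlite3


def store_questionnaire(conn, title, topics):
    """Insert one entered questionnaire and return its row id.

    topics is the written-back questionnaire information, for example a
    list of dicts mapping topic element names to recognized text.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS questionnaire ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "title TEXT NOT NULL, "
        "topics_json TEXT NOT NULL)"
    )
    cur = conn.execute(
        "INSERT INTO questionnaire (title, topics_json) VALUES (?, ?)",
        (title, json.dumps(topics, ensure_ascii=False)),
    )
    conn.commit()
    return cur.lastrowid


def load_questionnaire(conn, row_id):
    """Read one questionnaire back as (title, topics)."""
    row = conn.execute(
        "SELECT title, topics_json FROM questionnaire WHERE id = ?", (row_id,)
    ).fetchone()
    return row[0], json.loads(row[1])
```

A connection opened with `sqlite3.connect(":memory:")` is enough to try the sketch; a real deployment would more likely use one column or table per topic so that answers can be aggregated directly in SQL.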
In this embodiment, a questionnaire to be input is scanned to obtain a questionnaire picture of the questionnaire to be input, and the questionnaire picture is preprocessed according to a preprocessing algorithm; a questionnaire title in the preprocessed questionnaire picture is identified, and a questionnaire feature identification template corresponding to the questionnaire title is acquired; based on the questionnaire feature identification template, topic dimension segmentation is performed on the preprocessed questionnaire picture to obtain topic pictures of all topics in the questionnaire picture and topic element pictures in all the topic pictures; for any topic element picture, characters in the topic element picture are recognized by utilizing a pre-trained OCR character recognition model; the recognized characters are written back according to the position relationship among the characters in the topic element pictures, the position relationship among the topic element pictures in the topic picture and the position relationship among the topic pictures in the questionnaire picture to obtain the input questionnaire information; and the questionnaire information is stored into a database to complete the input of the questionnaire to be input. In this way, the questionnaire is cut into topics by using the template, the characters in each topic are recognized by using the OCR character recognition model, and the recognized characters are finally written back to obtain the questionnaire information, which improves the efficiency and accuracy of questionnaire input.
Embodiment two
Fig. 6 is a schematic structural diagram of a questionnaire information entry device provided in the second embodiment of the present application. The questionnaire information input device provided by the embodiment of the application can execute the questionnaire information input method provided by any embodiment of the application, and has the corresponding functional modules and beneficial effects of the execution method. The apparatus may be implemented in software and/or hardware, and as shown in fig. 6, the questionnaire information entry apparatus specifically includes: the device comprises a scanning processing module 601, a template obtaining module 602, a topic segmentation module 603, a character recognition module 604, a character write-back module 605 and a storage module 606.
The scanning processing module is used for scanning a questionnaire to be input to obtain a questionnaire picture of the questionnaire to be input, and preprocessing the questionnaire picture according to a preprocessing algorithm;
the template acquisition module is used for identifying the questionnaire title in the preprocessed questionnaire picture and acquiring the questionnaire feature identification template corresponding to the questionnaire title;
the topic segmentation module is used for performing topic dimension segmentation on the preprocessed questionnaire picture based on the questionnaire feature recognition template to obtain topic pictures of all topics in the questionnaire picture and topic element pictures in all the topic pictures;
the character recognition module is used for recognizing characters in the topic element pictures by utilizing a pre-trained OCR character recognition model for any topic element picture;
the character write-back module is used for writing back the identified characters according to the position relationship among the characters in the topic element pictures, the position relationship among the topic element pictures in the topic pictures and the position relationship among the topic pictures in the questionnaire pictures to obtain the input questionnaire information;
and the storage module is used for storing the questionnaire information into the database so as to complete the input of the questionnaire to be input.
In this embodiment, a questionnaire to be input is scanned to obtain a questionnaire picture of the questionnaire to be input, and the questionnaire picture is preprocessed according to a preprocessing algorithm; a questionnaire title in the preprocessed questionnaire picture is identified, and a questionnaire feature identification template corresponding to the questionnaire title is acquired; based on the questionnaire feature identification template, topic dimension segmentation is performed on the preprocessed questionnaire picture to obtain topic pictures of all topics in the questionnaire picture and topic element pictures in all the topic pictures; for any topic element picture, characters in the topic element picture are recognized by utilizing a pre-trained OCR character recognition model; the recognized characters are written back according to the position relationship among the characters in the topic element pictures, the position relationship among the topic element pictures in the topic picture and the position relationship among the topic pictures in the questionnaire picture to obtain the input questionnaire information; and the questionnaire information is stored into a database to complete the input of the questionnaire to be input. In this way, the questionnaire is cut into topics by using the template, the characters in each topic are recognized by using the OCR character recognition model, and the recognized characters are finally written back to obtain the questionnaire information, which improves the efficiency and accuracy of questionnaire input.
Further, the scanning processing module comprises:
the questionnaire boundary detection and cutting unit is used for carrying out boundary detection on the questionnaire picture, identifying the questionnaire boundary of the questionnaire picture and cutting the questionnaire picture based on the questionnaire boundary;
and the noise reduction and binarization processing unit is used for carrying out global noise reduction processing on the cut questionnaire picture and carrying out binarization processing on the questionnaire picture after the global noise reduction processing.
Further, the template acquisition module comprises:
the head-up area picture acquisition unit is used for acquiring a head-up area picture of the questionnaire picture according to the size information of the preprocessed questionnaire picture;
and the questionnaire title recognition unit is used for recognizing the characters in the head-up area picture by utilizing a pre-trained OCR character recognition model to obtain the questionnaire title in the questionnaire picture.
Further, the template obtaining module further comprises:
the questionnaire information searching unit is used for searching prestored questionnaire information corresponding to the questionnaire title, and the questionnaire information comprises question content information and question position information corresponding to each question;
the question feature template generating unit is used for generating, for any question in the questionnaire information, a question feature template of the question based on the question content information and the question position information;
and the questionnaire feature recognition template generating unit is used for combining the generated question feature templates corresponding to all the questions to obtain the questionnaire feature recognition template corresponding to the questionnaire title.
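As a non-limiting illustration, the lookup and template generation performed by the three units above may be sketched as follows. The in-memory `store` dictionary, the `QuestionTemplate` fields, and the representation of a position as a (top, bottom) pair are assumptions made for the example only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QuestionTemplate:
    number: int        # order of the question in the prestored information
    content: str       # question content information
    position: tuple    # question position information, assumed (top, bottom)

def build_questionnaire_template(title, store):
    """Look up prestored information by title and combine per-question templates."""
    info = store.get(title)
    if info is None:
        raise KeyError(f"no prestored questionnaire for title {title!r}")
    templates = [
        QuestionTemplate(i + 1, q["content"], tuple(q["position"]))
        for i, q in enumerate(info)
    ]
    # Order by vertical position so later segmentation can walk down the page.
    return sorted(templates, key=lambda t: t.position[0])
```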
Further, the questionnaire feature recognition template comprises a question feature template of each question;
the question segmentation module comprises:
the question boundary identification unit is used for identifying the boundary of each question in the questionnaire picture based on the question feature template of each question;
the questionnaire picture cutting unit is used for cutting the questionnaire picture based on the identified boundaries of all questions to obtain the question pictures of all the questions in the questionnaire picture;
and the question element cutting unit is used for cutting, for any question, the question elements in the question picture corresponding to the question based on the question feature template corresponding to the question, to obtain the question element pictures in the question picture corresponding to the question.
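As a non-limiting illustration, the two-level cutting described above may be sketched as follows. The sketch assumes that boundary identification has already produced pixel row ranges for each question and that each question feature template lists its element boxes relative to the question picture; the dictionary keys `rows` and `elements` are hypothetical.

```python
def crop(img, top, left, bottom, right):
    """Cut a rectangle out of a picture stored as a list of pixel rows."""
    return [row[left:right] for row in img[top:bottom]]

def segment(img, template):
    """Cut the questionnaire picture into question pictures, then into element pictures."""
    width = len(img[0])
    result = []
    for question in template:
        top, bottom = question["rows"]
        question_pic = crop(img, top, 0, bottom, width)
        # Element boxes are (top, left, bottom, right) relative to the question picture.
        elements = {
            name: crop(question_pic, *box)
            for name, box in question["elements"].items()
        }
        result.append({"question": question_pic, "elements": elements})
    return result
```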
Further, the character recognition module includes:
the character cutting unit is used for dividing, for any question element picture, the question element picture into single-character pictures;
and the character recognition unit is used for sequentially recognizing the characters in the character pictures by utilizing a pre-trained OCR character recognition model according to the position order of each character picture in the question element picture, to obtain the character corresponding to each character picture.
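As a non-limiting illustration, character cutting and in-order recognition may be sketched as follows. The application does not specify how the single-character pictures are obtained; the sketch assumes a vertical projection over a binarized picture (0 for ink, 255 for paper), and the `classify` callable stands in for the pre-trained OCR character recognition model.

```python
def split_characters(img, ink=0):
    """Split a binarized picture into single-character pictures at ink-free columns."""
    width = len(img[0])
    has_ink = [any(row[c] == ink for row in img) for c in range(width)]
    spans, start = [], None
    # A trailing False closes a character that touches the right edge.
    for c, flag in enumerate(has_ink + [False]):
        if flag and start is None:
            start = c
        elif not flag and start is not None:
            spans.append((start, c))
            start = None
    return [[row[a:b] for row in img] for a, b in spans]

def recognize_in_order(img, classify):
    """Recognize each character picture from left to right and join the results."""
    return "".join(classify(pic) for pic in split_characters(img))
```

A plain projection can split characters whose components are separated by a gap, so a production system would likely merge narrow neighbouring spans before recognition.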
Further, the text write-back module comprises:
the question element write-back unit is used for combining the recognized characters into corresponding question element information according to the position relationship among the characters in the question element picture;
the question information write-back unit is used for combining the combined question element information into corresponding question information according to the position relationship among the question element pictures in the question picture;
and the questionnaire information write-back unit is used for combining all the question information into the entered questionnaire information according to the position relationship among all the question pictures in the questionnaire picture.
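As a non-limiting illustration, the three-level write-back may be sketched as follows. The sketch assumes that each recognized character carries its horizontal offset, each question element carries a (top, left) position, and each question carries its top offset; the field names are hypothetical.

```python
def join_characters(chars):
    """chars: list of (x_offset, character); order by position and concatenate."""
    return "".join(ch for _, ch in sorted(chars, key=lambda item: item[0]))

def assemble_question(elements):
    """Combine element information in reading order (top to bottom, then left to right)."""
    ordered = sorted(elements, key=lambda e: e["pos"])
    return {e["name"]: join_characters(e["chars"]) for e in ordered}

def assemble_questionnaire(title, questions):
    """Combine all question information in page order into the questionnaire information."""
    ordered = sorted(questions, key=lambda q: q["top"])
    return {
        "title": title,
        "questions": [assemble_question(q["elements"]) for q in ordered],
    }
```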
Example three
Fig. 7 is a schematic structural diagram of an electronic device according to a third embodiment of the present application. As shown in Fig. 7, the electronic device includes a processor 710, a memory 720, an input device 730, and an output device 740. The number of processors 710 in the electronic device may be one or more, and one processor 710 is taken as an example in Fig. 7. The processor 710, the memory 720, the input device 730, and the output device 740 in the electronic device may be connected by a bus or by other means, and connection by a bus is taken as an example in Fig. 7.
The memory 720, as a computer-readable storage medium, may be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the questionnaire information entry method in the embodiments of the present application (for example, the scanning processing module, the template acquisition module, and the storage module in the questionnaire information entry device). The processor 710 runs the software programs, instructions and modules stored in the memory 720, so as to execute the various functional applications and data processing of the electronic device, that is, to implement the above-mentioned questionnaire information entry method:
scanning a questionnaire to be input to obtain a questionnaire picture of the questionnaire to be input, and preprocessing the questionnaire picture according to a preprocessing algorithm;
identifying a questionnaire title in the preprocessed questionnaire picture, and acquiring a questionnaire feature recognition template corresponding to the questionnaire title;
based on the questionnaire feature recognition template, performing question dimension segmentation on the preprocessed questionnaire picture to obtain question pictures of all questions in the questionnaire picture and question element pictures in all the question pictures;
for any question element picture, recognizing characters in the question element picture by utilizing a pre-trained OCR character recognition model;
writing back the recognized characters according to the position relationship among the characters in the question element pictures, the position relationship among the question element pictures in the question pictures and the position relationship among the question pictures in the questionnaire picture to obtain the entered questionnaire information;
and storing the questionnaire information into a database to complete the input of the questionnaire to be input.
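As a non-limiting illustration, the final storing step may be sketched as follows. The application does not name a database or a schema; the sketch assumes SQLite, a single hypothetical `questionnaire` table, and questionnaire information serialized as JSON.

```python
import json
import sqlite3

def store_questionnaire(conn, info):
    """Insert one entered questionnaire and return its row id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS questionnaire ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "title TEXT NOT NULL, "
        "payload TEXT NOT NULL)"
    )
    cur = conn.execute(
        "INSERT INTO questionnaire (title, payload) VALUES (?, ?)",
        (info["title"], json.dumps(info, ensure_ascii=False)),
    )
    conn.commit()
    return cur.lastrowid

def load_questionnaire(conn, row_id):
    """Read a stored questionnaire back as a dictionary, or None if absent."""
    row = conn.execute(
        "SELECT payload FROM questionnaire WHERE id = ?", (row_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Passing values as query parameters rather than formatting them into the SQL text keeps recognized characters, which are untrusted input, from altering the statement.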
The memory 720 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the terminal, and the like. Further, the memory 720 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 720 may further include memory located remotely from the processor 710, which may be connected to the electronic device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Example four
A fourth embodiment of the present application further provides a storage medium containing computer-executable instructions, which when executed by a computer processor, are configured to perform a method for questionnaire information entry, the method including:
scanning a questionnaire to be input to obtain a questionnaire picture of the questionnaire to be input, and preprocessing the questionnaire picture according to a preprocessing algorithm;
identifying a questionnaire title in the preprocessed questionnaire picture, and acquiring a questionnaire feature recognition template corresponding to the questionnaire title;
based on the questionnaire feature recognition template, performing question dimension segmentation on the preprocessed questionnaire picture to obtain question pictures of all questions in the questionnaire picture and question element pictures in all the question pictures;
for any question element picture, recognizing characters in the question element picture by utilizing a pre-trained OCR character recognition model;
writing back the recognized characters according to the position relationship among the characters in the question element pictures, the position relationship among the question element pictures in the question pictures and the position relationship among the question pictures in the questionnaire picture to obtain the entered questionnaire information;
and storing the questionnaire information into a database to complete the input of the questionnaire to be input.
Of course, the storage medium provided in the embodiments of the present application contains computer-executable instructions, and the computer-executable instructions are not limited to the operations of the method described above, and may also execute related operations in the questionnaire information entry method provided in any embodiment of the present application.
From the above description of the embodiments, it is obvious for those skilled in the art that the present application can be implemented by software and necessary general hardware, and certainly can be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods of the embodiments of the present application.
It should be noted that, in the above embodiment of the questionnaire information entry device, the units and modules included are divided merely according to functional logic, and the division is not limited to the above as long as the corresponding functions can be implemented; in addition, the specific names of the functional units are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the present application.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present application and the technical principles applied. Those skilled in the art will appreciate that the present application is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements and substitutions can be made by those skilled in the art without departing from the scope of the application. Therefore, although the present application has been described in some detail with reference to the above embodiments, the present application is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present application; the scope of the present application is determined by the scope of the appended claims.

Claims (10)

1. A questionnaire information entry method, characterized in that the method comprises:
scanning a questionnaire to be input to obtain a questionnaire picture of the questionnaire to be input, and preprocessing the questionnaire picture according to a preprocessing algorithm;
identifying a questionnaire title in the preprocessed questionnaire picture, and acquiring a questionnaire feature recognition template corresponding to the questionnaire title;
based on the questionnaire feature recognition template, performing topic dimension segmentation on the preprocessed questionnaire pictures to obtain topic pictures of all topics in the questionnaire pictures and topic element pictures in all the topic pictures;
for any one topic element picture, recognizing characters in the topic element picture by utilizing a pre-trained OCR character recognition model;
writing back the identified characters according to the position relationship among characters in the topic element pictures, the position relationship among the topic element pictures in the topic pictures and the position relationship among the topic pictures in the questionnaire pictures to obtain the input questionnaire information;
and storing the questionnaire information into a database to complete the input of the questionnaire to be input.
2. The method according to claim 1, wherein the pre-processing the questionnaire picture according to a pre-processing algorithm comprises:
carrying out boundary detection on the questionnaire picture, identifying the questionnaire boundary of the questionnaire picture, and cutting the questionnaire picture based on the questionnaire boundary;
and carrying out global noise reduction processing on the cut questionnaire picture, and carrying out binarization processing on the questionnaire picture after the global noise reduction processing.
3. The method of claim 1, wherein identifying the questionnaire title in the pre-processed questionnaire picture comprises:
acquiring a head-up area picture of the questionnaire picture according to the size information of the preprocessed questionnaire picture;
recognizing characters in the head-up area picture by using a pre-trained OCR character recognition model to obtain a questionnaire title in the questionnaire picture.
4. The method according to claim 1, wherein the acquiring of the questionnaire feature recognition template corresponding to the questionnaire title comprises:
searching prestored questionnaire information corresponding to the questionnaire title, wherein the questionnaire information comprises topic content information and topic position information corresponding to each topic;
for any topic in the questionnaire information, generating a topic feature template of the topic based on the topic content information and the topic position information;
and combining the generated topic feature templates corresponding to each topic to obtain the questionnaire feature recognition template corresponding to the questionnaire title.
5. The method of claim 1, wherein the questionnaire feature recognition template comprises a topic feature template for each topic;
the performing topic dimension segmentation on the preprocessed questionnaire picture based on the questionnaire feature recognition template to obtain topic pictures of all topics in the questionnaire picture and topic element pictures in all the topic pictures comprises:
identifying the boundary of each topic in the questionnaire picture based on the topic feature template of each topic;
cutting the questionnaire picture based on the identified boundaries of all topics to obtain the topic pictures of all the topics in the questionnaire picture;
and for any topic, cutting the topic elements in the topic picture corresponding to the topic based on the topic feature template corresponding to the topic to obtain the topic element picture in the topic picture corresponding to the topic.
6. The method of claim 1, wherein the recognizing, for any one topic element picture, characters in the topic element picture by utilizing a pre-trained OCR character recognition model comprises:
for any one topic element picture, dividing the topic element picture into single-character pictures;
and sequentially recognizing the characters in the character pictures by utilizing the pre-trained OCR character recognition model according to the position order of each character picture in the topic element picture, to obtain the character corresponding to each character picture.
7. The method according to claim 1, wherein the writing back the identified characters according to the position relationship among characters in the topic element pictures, the position relationship among the topic element pictures in the topic pictures and the position relationship among the topic pictures in the questionnaire pictures to obtain the input questionnaire information comprises:
combining the identified characters into corresponding topic element information according to the position relationship among the characters in the topic element picture;
combining the combined topic element information into corresponding topic information according to the position relationship among the topic element pictures in the topic picture;
and combining all the topic information into the input questionnaire information according to the position relationship among all the topic pictures in the questionnaire picture.
8. A questionnaire information entry device, characterized in that the device comprises:
the scanning processing module is used for scanning the questionnaire to be input to obtain a questionnaire picture of the questionnaire to be input, and preprocessing the questionnaire picture according to a preprocessing algorithm;
the template acquisition module is used for identifying a questionnaire title in the preprocessed questionnaire picture and acquiring a questionnaire feature recognition template corresponding to the questionnaire title;
the topic segmentation module is used for performing topic dimension segmentation on the preprocessed questionnaire picture based on the questionnaire feature recognition template to obtain topic pictures of all topics in the questionnaire picture and topic element pictures in all the topic pictures;
the character recognition module is used for recognizing, for any topic element picture, characters in the topic element picture by utilizing a pre-trained OCR character recognition model;
the character write-back module is used for writing back the identified characters according to the position relationship among the characters in the topic element pictures, the position relationship among the topic element pictures in the topic pictures and the position relationship among the topic pictures in the questionnaire picture to obtain the input questionnaire information;
and the storage module is used for storing the questionnaire information into a database so as to complete the input of the questionnaire to be input.
9. An electronic device, comprising:
one or more processors;
a storage device to store one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the questionnaire information entry method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which program, when executed by a processor, implements the questionnaire information entry method of any one of claims 1-7.
CN202211279363.4A 2022-10-19 2022-10-19 Questionnaire information input method and device, electronic equipment and storage medium Pending CN115565193A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211279363.4A CN115565193A (en) 2022-10-19 2022-10-19 Questionnaire information input method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211279363.4A CN115565193A (en) 2022-10-19 2022-10-19 Questionnaire information input method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115565193A true CN115565193A (en) 2023-01-03

Family

ID=84767312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211279363.4A Pending CN115565193A (en) 2022-10-19 2022-10-19 Questionnaire information input method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115565193A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115858634A (en) * 2023-02-27 2023-03-28 长沙冉星信息科技有限公司 Questionnaire information processing method


Similar Documents

Publication Publication Date Title
Wilkinson et al. Neural Ctrl-F: segmentation-free query-by-string word spotting in handwritten manuscript collections
CN109472207B (en) Emotion recognition method, device, equipment and storage medium
CN109635805B (en) Image text positioning method and device and image text identification method and device
CN110889402A (en) Business license content identification method and system based on deep learning
CN110175609B (en) Interface element detection method, device and equipment
CN105701488A (en) Identity card identification method
CN109272440B (en) Thumbnail generation method and system combining text and image content
JP2002279433A (en) Method and device for retrieving character in video
CN111753120A (en) Method and device for searching questions, electronic equipment and storage medium
CN111291572A (en) Character typesetting method and device and computer readable storage medium
CN110909123A (en) Data extraction method and device, terminal equipment and storage medium
CN111061887A (en) News character photo extraction method, device, equipment and storage medium
CN112699232A (en) Text label extraction method, device, equipment and storage medium
CN111858977B (en) Bill information acquisition method, device, computer equipment and storage medium
CN111353491A (en) Character direction determining method, device, equipment and storage medium
CN114565927A (en) Table identification method and device, electronic equipment and storage medium
CN115565193A (en) Questionnaire information input method and device, electronic equipment and storage medium
US10963690B2 (en) Method for identifying main picture in web page
CN113205046A (en) Method, system, device and medium for identifying question book
WO2020258669A1 (en) Website identification method and apparatus, and computer device and storage medium
CN114579796B (en) Machine reading understanding method and device
Yuan et al. An opencv-based framework for table information extraction
CN114155547B (en) Chart identification method, device, equipment and storage medium
CN115050025A (en) Knowledge point extraction method and device based on formula recognition
CN115221452A (en) Portal construction method, system, electronic equipment and medium based on visual configuration

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination