CN110427412A - Topic read method, device, topic input device and computer storage medium - Google Patents
- Publication number
- CN110427412A (application CN201910569060.8A)
- Authority
- CN
- China
- Prior art keywords
- topic
- identification
- image
- read method
- content
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/258—Data format conversion from or to a database
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B7/00—Electrically-operated teaching apparatus or devices working with questions and answers
- G09B7/02—Electrically-operated teaching apparatus or devices working with questions and answers of the type wherein the student is expected to construct an answer to the question which is presented or wherein the machine gives an answer to the question presented by a student
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Educational Administration (AREA)
- Educational Technology (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides a topic read method, a topic reading device, a topic input device, and a computer storage medium. The topic read method includes: identifying the structure and content of an image of a topic; generating an item analysis experience library from the empirical model corresponding to the process of identifying the structure and content; executing identification using the item analysis experience library as experience; and generating structured data from the identified topic. According to the technical solution of the present invention, item analysis is carried out using artificial intelligence (machine learning) technology, and the established experience library guides the topic identification process. New experience is accumulated continuously during identification and is used to train the experience library, which can significantly increase the speed of item analysis (identification), reduce the time required for topic identification and entry, and improve the efficiency and accuracy of the topic entry process.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a topic read method, a topic reading device, a topic input device, and a computer-readable storage medium.
Background technique
The state is currently promoting the combination of education and the internet. School examinations and learning activities frequently require online examinations and online training, and online questionnaires and tests are also used elsewhere in society. In the initial stage of these applications, the material often exists only on paper, as examination papers, training handbooks, homework books, questionnaires, and the like, so the topics in these paper materials need to be entered into a computer network system. In the related art, such topics are usually entered into the network system manually, a process that consumes a great deal of manpower and time.
In addition, any discussion of the background art in this specification does not mean that the background art is necessarily prior art known to those skilled in the art, and any discussion of the prior art in this specification does not mean that the prior art is necessarily widely known or forms part of the common general knowledge in the field.
Summary of the invention
The present invention aims to solve at least one of the technical problems existing in the prior art or the related art.
To this end, one object of the present invention is to provide a topic read method.
Another object of the present invention is to provide a topic reading device.
A further object of the present invention is to provide a topic input device.
Yet another object of the present invention is to provide a computer-readable storage medium.
In the technical solution of the first aspect of the present invention, a topic read method is proposed, comprising: identifying the structure and content of an image of a topic; generating an item analysis experience library from the empirical model corresponding to the process of identifying the structure and content; executing identification using the item analysis experience library as experience; and generating structured data from the identified topic.
In this technical solution, item analysis (identification) is carried out using image-text recognition means, and the item analysis experience library is built from the empirical model generated during data analysis. The library is built from the topic identification process and is in turn used to guide that process, so that experience is accumulated and updated and recognition efficiency and accuracy improve step by step. More refined recognition methods can also be learned by machine learning as required. After a period of training, the speed of analysis (identification) can increase significantly, which reduces the time required for topic identification and entry and improves the efficiency and accuracy of the topic entry process. In addition, the present application proposes identifying the structure and the content of a topic together, so that topic carriers with any typesetting can be recognized, which improves topic recognition efficiency and reduces manual effort.
Those skilled in the art will understand that terms used in the present application such as experience, empirical model, training, and learning belong to the field of data analysis or machine learning and are supported by operating methods or algorithms achievable in those fields. The image of a topic in the present application can be obtained from various topic carriers, for example paper examination papers, training handbooks, homework books, and questionnaires. The image-text recognition means proposed in the present application can be implemented with a variety of algorithms, such as a convolutional neural network or another type of deep learning algorithm.
In addition, the topic read method according to the above embodiment of the present invention may also have the following additional technical features:
In any of the above technical solutions, optionally, identifying the structure of the image of the topic specifically includes: using the item analysis experience library as experience, obtaining the image of the topic, and identifying the layout structure of the image according to the page layout information, topic type information, and topic sequence information of the image, wherein the layout structure includes one or more of the following: page layout, topic type, and topic sequence.
In this technical solution, structural analysis is performed on the image corresponding to the topic carrier, that is, the image of the topic, to obtain the page layout, the topic types, and the order in which the topics are arranged. Because topic carriers often follow fixed or common patterns, identifying the image with image-text recognition means to obtain its layout structure helps the subsequent identification of topic content to be more accurate and more efficient, and the layout structure information obtained makes it easier to build structured data. In addition, using the item analysis experience library as experience when identifying the layout structure of the image aids identification and lets experience accumulate and update, so recognition efficiency and accuracy improve step by step.
In any of the above technical solutions, optionally, identifying the content of the image of the topic specifically includes: using the item analysis experience library as experience, identifying the target information of the image according to the layout structure and the figures and/or text in the image, wherein the target information includes one or more of the following: stem content, option content, question content, figures, and blanks.
In this technical solution, the content of the image is analyzed and identified on the basis of the layout structure of the image combined with image-text recognition means, so that target information such as the stem, options, questions, figures, and blanks of a topic is obtained more precisely and more quickly. This specific content is usually the effective information one wants to obtain, and a more complete electronic version of the topic (or topic set) can be built from the target information and the layout structure. In addition, using the item analysis experience library as experience in this process aids identification and lets experience accumulate and update, so recognition efficiency and accuracy improve step by step.
In any of the above technical solutions, optionally, identifying the target information of the image specifically includes: determining the start and stop coordinates of a topic according to the topic sequence, cutting the image of the topic out according to the start and stop coordinates, and reading the target information of the image using image-text recognition means.
In this technical solution, once the topic sequence is known, the start and stop coordinates of each topic can be identified accurately, and the image of a single topic can be cut out of the image according to those coordinates. If a topic is split across the image (for example, where it continues on another page), it is cut into two or more pictures, which are then merged into one picture. Image-text recognition means are then used to analyze the content of the picture and obtain target information such as the stem, options, questions, figures, and blanks of the topic. Locating each topic by the topic sequence and then extracting its specific content is fast and efficient, and each topic obtained is an independent unit, which makes it convenient to build structured data and enter it into a database.
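The cutting and merging step described above can be sketched as follows. This is a minimal illustration only, assuming that a page image is held as a plain list of pixel rows and that the row of each serial number is already known; the function names `spans_from_sequence` and `cut_topic` are hypothetical and do not come from the patent.

```python
def spans_from_sequence(serial_rows, page_height):
    """Derive (start, stop) row spans from the rows where serial numbers were found.

    Assumption: each topic starts at the row of its own serial number and stops
    at the row of the next serial number, or at the bottom of the page.
    """
    starts = sorted(serial_rows)
    stops = starts[1:] + [page_height]
    return list(zip(starts, stops))


def cut_topic(pages, segments):
    """Cut one topic out of its pages and merge the pieces.

    pages is a list of page images, each a list of pixel rows.
    segments is a list of (page_index, start_row, stop_row) tuples, stop exclusive.
    A topic that continues on another page has more than one segment, and the
    cropped pieces are stacked vertically into a single image.
    """
    merged = []
    for page_index, start_row, stop_row in segments:
        merged.extend(pages[page_index][start_row:stop_row])
    return merged
```

The merged image would then be handed to whatever recognition model reads the stem, options, and other target information.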
In any of the above technical solutions, optionally, identifying the page layout of the image specifically includes: using the item analysis experience library as experience, identifying the horizontal blank areas and/or vertical blank areas and/or titles of the image using image-text recognition means, to obtain the sub-regions of the image.
In this technical solution, the image of the topic is first checked to determine whether it meets the conditions of an examination paper, topic, or test sheet, so as to determine the page layout information of the image. Image-text recognition means are used to analyze conditions such as continuous horizontal and vertical blank areas of the page and any titles present, and from these the sub-regions of the image are derived, for example whether the image is divided into a single column, two columns, three columns, or four columns. Information such as the layout and titles of the topic carrier is thereby obtained. In addition, using the item analysis experience library as experience in this process helps improve recognition efficiency and lets experience accumulate and update, so recognition efficiency and accuracy improve step by step.
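One simple way to find columns from continuous vertical blank areas is a projection over pixel columns. The sketch below assumes a binarized page (0 for blank, 1 for ink) stored as a list of equal-length rows; it is a stand-in for the recognition means in the text, which the patent leaves open, and the names `blank_column_runs` and `split_into_columns` are hypothetical.

```python
def blank_column_runs(page, min_width):
    """Find runs of fully blank pixel columns at least min_width wide.

    Returns (start, stop) column indices, stop exclusive.
    """
    width = len(page[0])
    is_blank = [all(row[x] == 0 for row in page) for x in range(width)]
    runs, start = [], None
    # The trailing False is a sentinel that closes a run reaching the right edge.
    for x, blank in enumerate(is_blank + [False]):
        if blank and start is None:
            start = x
        elif not blank and start is not None:
            if x - start >= min_width:
                runs.append((start, x))
            start = None
    return runs


def split_into_columns(page, min_gutter=2):
    """Split the page into sub-regions at interior blank gutters.

    Blank runs touching the left or right edge are treated as margins, not gutters.
    Returns one (start, stop) column span per sub-region.
    """
    width = len(page[0])
    gutters = [(s, e) for s, e in blank_column_runs(page, min_gutter)
               if s > 0 and e < width]
    edges = [0] + [x for gutter in gutters for x in gutter] + [width]
    return [(edges[i], edges[i + 1]) for i in range(0, len(edges), 2)]
```

The same projection applied to rows instead of columns would give the horizontal blank areas.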
In any of the above technical solutions, optionally, identifying the topic type of the image specifically includes: on the basis of the sub-regions, identifying the topic types contained in the image using image-text recognition means, wherein the topic types include subjective topics and objective topics, the subjective topics include at least question-and-answer topics, and the objective topics include one or more of the following: true-or-false, single-choice, multiple-choice, and fill-in-the-blank topics.
In this technical solution, on the basis of the known layout and titles (the sub-regions of the image having been identified), image-text recognition means are used to identify the basic topic types on the topic carrier, for example true-or-false, single-choice, multiple-choice, fill-in-the-blank, and question-and-answer topics. Collecting text content and building structured data according to the topic type information can effectively speed up data queries and reduce the difficulty of topic management.
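As a rough illustration of mapping a recognized section heading to a topic type, a keyword lookup is shown below. The keyword table, the type labels, and the function name `classify_heading` are assumptions made for this sketch; the patent does not prescribe how topic types are matched, and a trained model could take the place of the lookup.

```python
# Hypothetical keyword table; real headings would come from the recognition step.
TYPE_KEYWORDS = {
    "true_false": ("true or false", "true/false"),
    "single_choice": ("single choice",),
    "multiple_choice": ("multiple choice",),
    "fill_in_blank": ("fill in the blank",),
    "question_answer": ("short answer", "essay"),
}

# Grouping into objective and subjective topics follows the text above.
OBJECTIVE = {"true_false", "single_choice", "multiple_choice", "fill_in_blank"}


def classify_heading(heading):
    """Return (topic_type, category) for a heading, or (None, None) if unknown."""
    text = heading.lower()
    for topic_type, keywords in TYPE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            category = "objective" if topic_type in OBJECTIVE else "subjective"
            return topic_type, category
    return None, None
```

Headings that return `(None, None)` are the uncertain cases that would need further analysis before being recorded.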
In any of the above technical solutions, optionally, identifying the topic sequence of the image specifically includes: using the item analysis experience library as experience, identifying the serial numbers of the topics in the image using image-text recognition means.
In this technical solution, performing topic sequence identification with the help of the experience library increases recognition speed. The topics on a topic carrier usually carry serial numbers, which are often consecutive, so the serial numbers can be identified quickly using image-text recognition means. For example, in the identification of major topics and minor topics, a major topic usually comes first and its minor topics follow. This identification process also yields regular patterns in the numbering text, which are often tied to the topic type, and this information is saved as experience.
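Once text lines have been recognized, serial numbers can be picked out with simple patterns. The sketch below assumes two numbering styles that are not taken from the patent: Roman numerals followed by a period for major topics, and Arabic numerals written as `1.`, `1)`, or `(1)` for minor topics. The function name `find_serials` is hypothetical.

```python
import re

# Assumed numbering styles, for illustration only.
MAJOR = re.compile(r"^\s*([IVX]+)\s*\.")
MINOR = re.compile(r"^\s*\(?(\d+)\s*[).]")


def find_serials(lines):
    """Return (row, level, label) for every line that starts with a serial number."""
    found = []
    for row, text in enumerate(lines):
        match = MAJOR.match(text)
        if match:
            found.append((row, "major", match.group(1)))
            continue
        match = MINOR.match(text)
        if match:
            found.append((row, "minor", match.group(1)))
    return found
```

The row of each match can serve as the start coordinate of the corresponding topic, and the patterns that proved reliable are the kind of numbering rule that could be saved as experience.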
In any of the above technical solutions, optionally, the method further includes: determining the hierarchical relationship between topics according to the serial numbers.
In this technical solution, some topic carriers have a more complex structure. For example, an examination paper may contain several major topics, each with several minor topics beneath it. In this case, the result of the topic sequence analysis can be used: because serial numbers are usually consecutive among major topics and among minor topics, the hierarchical relationship between major and minor topics can be worked out.
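The grouping of minor topics under major topics can be sketched as a single pass over the ordered serial numbers. The input format, a list of `(level, label)` pairs, and the function name `build_hierarchy` are assumptions made for this illustration.

```python
def build_hierarchy(serials):
    """Group minor topics under the major topic that precedes them.

    serials is an ordered list of (level, label) pairs, where level is
    "major" or "minor". Minor topics that appear before any major topic
    are attached to a placeholder major topic labelled None.
    """
    tree = []
    for level, label in serials:
        if level == "major":
            tree.append({"label": label, "children": []})
        else:
            if not tree:
                tree.append({"label": None, "children": []})
            tree[-1]["children"].append(label)
    return tree
```

A fuller version might also check that the labels at each level are consecutive and flag gaps as likely recognition errors.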
In any of the above technical solutions, optionally, generating the item analysis experience library from the empirical model corresponding to the process of identifying the structure and content specifically includes: generating an empirical model from the rules of the processes of identifying the structure and identifying the content, and generating from the empirical model one or more of a page layout analysis experience library, a topic type analysis experience library, and a topic sequence analysis experience library.
In this technical solution, the identification of topic carriers often follows fixed or common patterns. Identified data that meets the conditions for use is recorded to form the page layout analysis experience library, the topic type analysis experience library, and the topic sequence analysis (major and minor topic analysis) experience library. These records serve as experience modules that are used preferentially when the next identification task is executed, which increases identification speed. Empirical patterns that have already been stored do not need to be stored again.
For example, during page layout recognition (analysis), cases such as a single column, two columns, three columns, or four columns are analyzed. If a case is verified as meeting the requirements, the number of columns on the page and the width, length, and position of the horizontal and vertical blank spaces are recorded as one type of experience module.
As another example, the basic topic types are single-choice, multiple-choice, true-or-false, fill-in-the-blank, and question-and-answer, and these serve as the basis for matching. In wider use, however, there are many uncertain cases. Once such topic types have been identified and meet the requirements, they are recorded as topic types, and the rules of the process of identifying them are also recorded automatically in the topic type analysis experience library.
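The record-once, reuse-first behaviour of an experience library can be sketched for the page layout case as follows. The fields of `LayoutExperience` and the matching rule in `candidates` are assumptions chosen for illustration; the patent lists the kinds of values recorded but not a storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutExperience:
    """One verified layout pattern (hypothetical fields)."""
    columns: int
    gutter_positions: tuple  # x-coordinates where vertical blank gutters start
    gutter_width: int


class LayoutExperienceLibrary:
    def __init__(self):
        self._entries = []

    def record(self, experience):
        """Store a verified pattern; a pattern already stored is not stored again."""
        if experience in self._entries:
            return False
        self._entries.append(experience)
        return True

    def candidates(self, page_width):
        """Stored patterns whose gutters fit inside the page, to be tried first."""
        return [e for e in self._entries
                if all(x + e.gutter_width <= page_width for x in e.gutter_positions)]
```

A recognition task would try the candidates first and fall back to a full analysis only when none of them fits the new page.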
In any of the above technical solutions, optionally, generating structured data from the identified topic specifically further includes: identifying the structure of the topic and the content of the topic from the image of the topic using image-text recognition means; and generating structured data from the structure and content of the topic, and/or entering the structured data into a database.
In this technical solution, in order to make topics more usable and easier to use, the identified topics are structured so that they can be queried and retrieved at any time. The target information in paper topic carriers can thus be used effectively and brought into a computer network system.
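Structured data and database entry can be illustrated with a small record type and SQLite from the Python standard library. The `Topic` fields, the table layout, and the function names are assumptions for this sketch; the patent does not specify a schema or a database product.

```python
import json
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Structured form of one recognized topic (hypothetical fields)."""
    serial: str
    topic_type: str
    stem: str
    options: list = field(default_factory=list)


def save_topics(conn, topics):
    """Create the table if needed and insert the topics; options are stored as JSON."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS topics ("
        "serial TEXT, topic_type TEXT, stem TEXT, options TEXT)"
    )
    conn.executemany(
        "INSERT INTO topics VALUES (?, ?, ?, ?)",
        [(t.serial, t.topic_type, t.stem, json.dumps(t.options)) for t in topics],
    )
    conn.commit()


def load_by_type(conn, topic_type):
    """Query stored topics by topic type."""
    rows = conn.execute(
        "SELECT serial, topic_type, stem, options FROM topics WHERE topic_type = ?",
        (topic_type,),
    )
    return [Topic(s, t, stem, json.loads(o)) for s, t, stem, o in rows]
```

Storing the topic type as its own column is what makes the type-based queries mentioned earlier straightforward.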
In the technical solution of the second aspect of the present invention, a topic reading device is proposed. The topic reading device includes a processor, and the processor, when executing a computer program, implements the steps of the topic read method disclosed in any of the above technical solutions. The topic reading device therefore has the beneficial technical effects of any of the above topic read methods, which are not repeated here.
In the technical solution of the third aspect of the present invention, a topic input device is proposed, comprising: a control module for receiving the image of a topic and/or entering the identified topic into a database; a memory in which the database is configured, the database being used to store topics; and the topic reading device provided in the above technical solution, which includes a processor that, when executing a computer program, implements the steps of the topic read method disclosed in any of the above technical solutions. The topic input device therefore has the beneficial technical effects of any of the above topic read methods, which are not repeated here.
In addition, in this technical solution, the control module performs data exchange and flow control, for example: receiving picture files from a scanning device or from an external source; passing the picture files to the topic reading device; and receiving and verifying the analysis results of the topic reading device and storing them in the database. The database records the text data of the identified topics, including the stem, options, questions, figures, answers, and so on.
In the technical solution of the fourth aspect of the present invention, a computer-readable storage medium is proposed, on which a computer program is stored. When the computer program is executed, it implements the steps of the topic read method disclosed in any of the above technical solutions.
Additional aspects and advantages of the present invention will become apparent in the following description, or will be learned through practice of the present invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 shows the schematic flow diagram of topic read method according to an embodiment of the invention;
Fig. 2 shows the schematic block diagrams of topic reading device according to an embodiment of the invention;
Fig. 3 shows the schematic block diagram of topic reading device according to another embodiment of the invention;
Fig. 4 shows the schematic block diagram of topic input device according to an embodiment of the invention.
Specific embodiment
In order that the objects, features, and advantages of the present invention may be understood more clearly, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, provided there is no conflict, the embodiments of the present application and the features in the embodiments may be combined with one another.
Many specific details are set out in the following description to facilitate a full understanding of the present invention. However, the present invention may also be implemented in ways other than those described here, so the scope of protection of the present invention is not limited by the specific embodiments described below.
The topic read method, topic reading device, and topic input device according to the present invention are described in detail below with reference to Fig. 1 to Fig. 4.
As shown in Fig. 1, a topic read method according to an embodiment of the present invention comprises: step S102, identifying the structure and content of an image of a topic; step S104, generating an item analysis experience library from the empirical model corresponding to the process of identifying the structure and content; step S106, executing identification using the item analysis experience library as experience; and step S108, generating structured data from the identified topic.
In this technical solution, item analysis (identification) is carried out using image-text recognition means, and the item analysis experience library is built from the empirical model generated during data analysis. The library is built from the topic identification process and is in turn used to guide that process, so that experience is accumulated and updated and recognition efficiency and accuracy improve step by step. More refined recognition methods can also be learned by machine learning as required. After a period of training, the speed of analysis (identification) can increase significantly, which reduces the time required for topic identification and entry and improves the efficiency and accuracy of the topic entry process. In addition, the present application proposes identifying the structure and the content of a topic together, so that topic carriers with any typesetting can be recognized, which improves topic recognition efficiency and reduces manual effort.
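The interplay of steps S102 to S108 can be sketched with a toy example. Everything here is an assumption made for illustration: the "image" is represented by text lines that have already been recognized, the library key is simply the number of lines, and the names `ExperienceLibrary`, `recognize`, and `read_topic` are hypothetical. It shows only the control flow of consulting the library, recording new experience, and emitting structured data, not a real recognition model.

```python
from dataclasses import dataclass, field


@dataclass
class ExperienceLibrary:
    """Hypothetical store of recognition patterns that proved usable."""
    patterns: dict = field(default_factory=dict)

    def lookup(self, key):
        return self.patterns.get(key)

    def record(self, key, pattern):
        # A pattern already stored under this key is not stored again.
        self.patterns.setdefault(key, pattern)


def recognize(image_lines, library):
    """S102 / S106: identify structure and content, consulting the library first."""
    key = len(image_lines)  # toy signature standing in for real layout features
    pattern = library.lookup(key)
    reused = pattern is not None
    if not reused:
        # Stand-in for a full analysis pass: first line is the stem, the rest are options.
        pattern = {"stem_index": 0, "options_from": 1}
    result = {
        "stem": image_lines[pattern["stem_index"]],
        "options": image_lines[pattern["options_from"]:],
    }
    return result, pattern, reused


def read_topic(image_lines, library):
    result, pattern, reused = recognize(image_lines, library)  # S102 / S106
    library.record(len(image_lines), pattern)                  # S104
    return {"reused_experience": reused, **result}             # S108
```

On the first call the library is empty and a full analysis runs; a later topic with the same signature reuses the stored pattern, which is the speed-up the text attributes to accumulated experience.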
Those skilled in the art will understand that terms used in the present application such as experience, empirical model, training, and learning belong to the field of data analysis or machine learning and are supported by operating methods or algorithms achievable in those fields. The image of a topic in the present application can be obtained from various topic carriers, for example paper examination papers, training handbooks, homework books, and questionnaires. The image-text recognition means proposed in the present application can be implemented with a variety of algorithms, such as a convolutional neural network or another type of deep learning algorithm.
In addition, the topic read method according to the above embodiment of the present invention may also have the following additional technical features:
In any of the above technical solutions, optionally, identifying the structure of the image of the topic in step S102 specifically includes: using the item analysis experience library as experience, obtaining the image of the topic, and identifying the layout structure of the image according to the page layout information, topic type information, and topic sequence information of the image, wherein the layout structure includes one or more of the following: page layout, topic type, and topic sequence.
In this technical solution, structural analysis is performed on the image corresponding to the topic carrier, that is, the image of the topic, to obtain the page layout, the topic types, and the order in which the topics are arranged. Because topic carriers often follow fixed or common patterns, identifying the image with image-text recognition means to obtain its layout structure helps the subsequent identification of topic content to be more accurate and more efficient, and the layout structure information obtained makes it easier to build structured data. In addition, using the item analysis experience library as experience when identifying the layout structure of the image aids identification and lets experience accumulate and update, so recognition efficiency and accuracy improve step by step.
In any of the above technical solutions, optionally, identifying the content of the image of the topic in step S102 specifically includes: using the item analysis experience library as experience, identifying the target information of the image according to the layout structure and the figures and/or text in the image, wherein the target information includes one or more of the following: stem content, option content, question content, figures, and blanks.
In this technical solution, the content of the image is analyzed and identified on the basis of the layout structure of the image combined with image-text recognition means, so that target information such as the stem, options, questions, figures, and blanks of a topic is obtained more precisely and more quickly. This specific content is usually the effective information one wants to obtain, and a more complete electronic version of the topic (or topic set) can be built from the target information and the layout structure. In addition, using the item analysis experience library as experience in this process aids identification and lets experience accumulate and update, so recognition efficiency and accuracy improve step by step.
In any of the above technical solutions, optionally, identifying the target information of the image specifically includes: determining the start and stop coordinates of a topic according to the topic sequence, cutting the image of the topic out according to the start and stop coordinates, and reading the target information of the image using image-text recognition means.
In this technical solution, once the topic sequence is known, the start and stop coordinates of each topic can be identified accurately, and the image of a single topic can be cut out of the image according to those coordinates. If a topic is split across the image (for example, where it continues on another page), it is cut into two or more pictures, which are then merged into one picture. Image-text recognition means are then used to analyze the content of the picture and obtain target information such as the stem, options, questions, figures, and blanks of the topic. Locating each topic by the topic sequence and then extracting its specific content is fast and efficient, and each topic obtained is an independent unit, which makes it convenient to build structured data and enter it into a database.
In any of the above-described technical solution, optionally, identifies the space of a whole page of image, specifically include: with item analysis experience
Library empirically, using pictograph means of identification carries out the lateral blank area of image and/or longitudinal blank area and/or title
Identification, to obtain the child partition of image.
In this technical solution, the image of the topic is first checked to determine whether it meets the conditions of a test paper, topic or answer sheet, so that the layout information of the image can be determined. Image-text recognition is used to analyze conditions such as continuous horizontal and vertical blank areas and existing titles, and thereby derive the sub-partitions of the image, for example whether the image is laid out in one, two, three or four columns, so as to obtain information such as the layout and titles of the topic carrier. In addition, because this process uses the item analysis experience library as experience, recognition efficiency improves, and experience can be accumulated and updated so that recognition efficiency and accuracy increase step by step.
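One common way to find columns from continuous vertical blank areas is a projection over pixel columns. The sketch below assumes a binarized image (1 = ink, 0 = blank) and a hypothetical `min_gap` threshold; neither detail comes from the source, which does not specify how the blank areas are detected.

```python
from typing import List, Tuple


def blank_columns(binary: List[List[int]]) -> List[bool]:
    """True for every pixel column that contains no ink (1 = ink, 0 = blank)."""
    width = len(binary[0])
    return [all(row[x] == 0 for row in binary) for x in range(width)]


def split_into_columns(binary: List[List[int]], min_gap: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) x-ranges of text columns separated by blank gaps
    that are at least min_gap pixels wide; end is exclusive."""
    blank = blank_columns(binary)
    width = len(blank)
    regions: List[Tuple[int, int]] = []
    start = None  # x position where the current text column began
    gap = 0       # length of the current run of blank pixel columns
    for x in range(width):
        if blank[x]:
            gap += 1
            if start is not None and gap >= min_gap:
                regions.append((start, x - gap + 1))
                start = None
        else:
            if start is None:
                start = x
            gap = 0
    if start is not None:
        regions.append((start, width - gap))
    return regions
```

The number of returned ranges corresponds to the one-, two-, three- or four-column cases mentioned above; the same idea applied to pixel rows would find horizontal blank areas.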
In any of the above technical solutions, optionally, identifying the topic type of the image specifically includes: on the basis of the sub-partitions, identifying the topic types contained in the image by image-text recognition, wherein the topic types include subjective topics and objective topics, the subjective topics include at least question-and-answer topics, and the objective topics include at least one or more of the following: true/false, single-choice, multiple-choice and fill-in-the-blank topics.
In this technical solution, on the basis of the known layout and titles (the image regions have already been identified), image-text recognition is used to identify the basic topic types on the topic carrier, for example true/false, single-choice, multiple-choice, fill-in-the-blank and question-and-answer topics. Collecting the text content according to the topic type information and building structured data from it effectively speeds up data queries and reduces the difficulty of managing topics.
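The topic type taxonomy described above could be represented as a small enumeration. The identifiers below (`TopicType`, `OBJECTIVE_TYPES`, `is_objective`) are hypothetical names chosen for illustration; only the grouping into subjective and objective types is taken from the text.

```python
from enum import Enum


class TopicType(Enum):
    TRUE_FALSE = "true_false"
    SINGLE_CHOICE = "single_choice"
    MULTIPLE_CHOICE = "multiple_choice"
    FILL_IN_BLANK = "fill_in_blank"
    QUESTION_ANSWER = "question_answer"


# Objective types as listed in the text; question-and-answer is the subjective type.
OBJECTIVE_TYPES = frozenset({
    TopicType.TRUE_FALSE,
    TopicType.SINGLE_CHOICE,
    TopicType.MULTIPLE_CHOICE,
    TopicType.FILL_IN_BLANK,
})


def is_objective(topic_type: TopicType) -> bool:
    """Return True for objective topic types, False for subjective ones."""
    return topic_type in OBJECTIVE_TYPES
```

Storing the type as a fixed value of this kind is one way to make later queries by topic type straightforward.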
In any of the above technical solutions, optionally, identifying the topic sequence of the image specifically includes: using the item analysis experience library as experience, identifying the serial numbers of the topics in the image by image-text recognition.
In this technical solution, combining topic sequence recognition with the experience library increases recognition speed. Topics on a topic carrier usually carry serial numbers, and the serial numbers are often continuous, so they can be recognized quickly by image-text recognition. For example, when major topics and sub-topics are recognized, the major topic usually comes first and its sub-topics follow. This recognition process also yields regular patterns in the text and numbering, often tied to the topic type, and this information is saved as experience.
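Once text lines have been recognized, serial numbers at the start of a line can be picked out with pattern matching. The sketch below is an assumption-laden illustration: the three numbering styles, the accepted delimiters and the support for Chinese numerals up to ten are choices made here, not rules stated in the source.

```python
import re
from typing import Optional, Tuple

_CHINESE_DIGITS = "一二三四五六七八九十"

# Each pattern expects a serial number followed by a delimiter at the start of a line.
_PATTERNS = [
    ("arabic", re.compile(r"^\s*(\d+)\s*[.．、)）]")),
    ("chinese", re.compile(r"^\s*([一二三四五六七八九十])\s*[.．、)）]")),
    ("letter", re.compile(r"^\s*([A-Z])\s*[.．、)）]")),
]


def parse_serial(line: str) -> Optional[Tuple[str, int]]:
    """Return (numbering style, numeric value) for a line that starts with a
    serial number, or None if no known numbering style matches."""
    for style, pattern in _PATTERNS:
        match = pattern.match(line)
        if match is None:
            continue
        token = match.group(1)
        if style == "arabic":
            return style, int(token)
        if style == "chinese":
            return style, _CHINESE_DIGITS.index(token) + 1
        return style, ord(token) - ord("A") + 1
    return None
```

Returning the numbering style alongside the value keeps enough information to tell different numbering levels apart in later steps.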
In any of the above technical solutions, optionally, the method further includes: determining the hierarchical relationship between topics according to the serial numbers.
In this technical solution, some topic carriers have a more complex structure. For example, a test paper may contain several major topics, each with several sub-topics beneath it. In this case the result of the topic sequence analysis can be used: because the serial numbers among major topics and among sub-topics are usually continuous, the hierarchical relationship between major topics and sub-topics can be derived.
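A simple way to derive the hierarchy from serial numbers is sketched below. It rests on a stated assumption that is not in the source: major topics and sub-topics use different numbering styles, and the style of the first serial number encountered is the major-topic style. The function name `build_hierarchy` is hypothetical.

```python
from typing import Dict, List, Tuple

Serial = Tuple[str, int]  # (numbering style, numeric value), in reading order


def build_hierarchy(serials: List[Serial]) -> List[Dict]:
    """Group serial numbers into major topics with their sub-topics.

    Assumption: the first serial number belongs to a major topic, and every
    serial number in a different style is a sub-topic of the latest major topic.
    """
    if not serials:
        return []
    major_style = serials[0][0]
    tree: List[Dict] = []
    for style, number in serials:
        if style == major_style:
            tree.append({"serial": (style, number), "children": []})
        else:
            tree[-1]["children"].append((style, number))
    return tree
```

A paper whose sub-topic numbering restarts under each major topic and one whose numbering runs on continuously are both handled, since only the style change is used to assign levels.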
In any of the above technical solutions, optionally, generating the item analysis experience library in step S104 according to the empirical model corresponding to the process of identifying the structure and content specifically includes: generating an empirical model from the rules and procedures used to identify the structure and the content, and generating, from the empirical model, one or more of a layout analysis experience library, a topic type analysis experience library and a topic sequence analysis experience library.
In this technical solution, the recognition of various topic carriers often follows fixed or common patterns. Data that has been identified, meets the conditions for use and is usable is recorded to form the layout analysis experience library, the topic type analysis experience library, the topic sequence analysis experience library and the major/sub-topic analysis experience library. These are used preferentially as experience modules when the next recognition task is executed, which increases recognition speed. Empirical patterns that are already stored need not be stored again.
For example, during layout recognition (analysis), cases such as one, two, three or four columns are analyzed. If these cases are verified as meeting the requirements, the number of columns and the width, length and position of the horizontal and vertical blank spaces of the layout are recorded as one type of experience module.
As another example, the basic topic types are single-choice, multiple-choice, true/false, fill-in-the-blank and question-and-answer topics, and these serve as the basis for matching. Wider application, however, brings many uncertain cases. Once such topic types have been identified and found to meet the requirements, they are recorded as topic types, and the rules and procedures used to identify them are also recorded automatically in the topic type analysis experience library.
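The behavior of an experience library, recording verified patterns once and offering them first on the next task, can be sketched as below. The class and field names (`ExperienceLibrary`, `LayoutPattern`, `gap_widths`) are hypothetical, and the stored fields are a simplified subset of the layout properties named in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LayoutPattern:
    """A verified layout: column count plus the widths of the gaps between columns."""
    columns: int
    gap_widths: Tuple[int, ...]


class ExperienceLibrary:
    """Stores verified patterns; a pattern that is already stored is not stored again."""

    def __init__(self) -> None:
        self._patterns: List[LayoutPattern] = []

    def record(self, pattern: LayoutPattern) -> bool:
        """Store a verified pattern. Return False if it was already stored."""
        if pattern in self._patterns:
            return False
        self._patterns.append(pattern)
        return True

    def candidates(self) -> List[LayoutPattern]:
        """Patterns to try first when the next recognition task starts."""
        return list(self._patterns)
```

A recognition task would try `candidates()` before falling back to a full analysis, and call `record()` only for results that pass verification.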
In any of the above technical solutions, optionally, generating structured data according to the identified topics in step S108 specifically further includes: identifying the structure of the topic and the content of the topic from the image of the topic by image-text recognition; and generating structured data according to the structure and content of the topic, and/or entering the structured data into a database.
In this technical solution, to improve the utilization of topics and reduce the difficulty of using them, the identified topics are structured so that they can be queried and retrieved at any time. This allows the target information in paper topic carriers to be used efficiently and brought into a computer network system.
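As an illustration of structuring an identified topic and entering it into a database, the sketch below uses Python's standard `sqlite3` module. The `Topic` fields, the `topics` table and its columns are assumptions made for this example; the source does not specify a schema or a database product.

```python
import json
import sqlite3
from dataclasses import dataclass, field
from typing import List


@dataclass
class Topic:
    """Structured form of one identified topic (field names are illustrative)."""
    serial: str
    topic_type: str
    stem: str
    options: List[str] = field(default_factory=list)


def save_topics(conn: sqlite3.Connection, topics: List[Topic]) -> None:
    """Create the table if needed and insert one row per topic."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS topics "
        "(serial TEXT, topic_type TEXT, stem TEXT, options TEXT)"
    )
    conn.executemany(
        "INSERT INTO topics VALUES (?, ?, ?, ?)",
        [
            # Options are stored as a JSON array so one column can hold the list.
            (t.serial, t.topic_type, t.stem, json.dumps(t.options, ensure_ascii=False))
            for t in topics
        ],
    )
    conn.commit()
```

For example, `save_topics(sqlite3.connect(":memory:"), [Topic("1", "single_choice", "2 + 2 = ?", ["3", "4"])])` stores one topic that can later be queried by serial number or topic type.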
As shown in Fig. 2, in a topic reading device 200 according to an embodiment of the invention, program modules corresponding to the topic read method provided by the above technical solutions are established so that topic reading can be performed. Specifically, the topic reading device 200 includes an AI analysis module 202 and an AI learning module 204, where AI stands for Artificial Intelligence. The AI analysis module 202 is responsible for layout analysis, topic type analysis, topic sequence analysis, major/sub-topic analysis and topic content analysis. The AI learning module 204 builds the corresponding layout analysis experience library, topic type analysis experience library, topic sequence analysis experience library and major/sub-topic analysis experience library from the analysis process. When executing its tasks, the AI analysis module 202 can refer to the experience libraries formed by the AI learning module 204.
The methods executed in the AI analysis module 202 include:
Layout analysis: the received picture is checked to determine whether it meets the conditions of a test paper, topic or answer sheet. Image-text recognition is mainly used to analyze conditions such as continuous horizontal and vertical blank areas and existing titles in the picture layout, distinguishing cases such as one, two, three or four columns, so as to obtain the general layout and titles of the test paper.
Topic type analysis: on the basis of the known general layout and titles, image-text recognition is used to identify the basic topic types of the test paper, such as single-choice, multiple-choice, fill-in-the-blank and question-and-answer topics.
Topic sequence analysis: the topics on a test paper generally carry serial numbers, and the serial numbers of major topics and sub-topics are usually continuous, so the topic serial numbers can be identified by image-text recognition.
Major/sub-topic analysis: some test papers contain major topics that in turn have sub-topics beneath them. On the basis of the topic sequence analysis, the hierarchical relationship between major topics and sub-topics can be derived.
Topic content analysis: on the basis of the topic sequence analysis, the start and end coordinates of each topic can be determined accurately, and the picture of a single topic can be cut out according to them. If a topic is split across separate regions, two or more pictures are cut out for it and then merged into one picture. Image-text recognition is then applied to the picture content to obtain specific content such as the stem, options, question, figures and blanks of the topic.
In the AI learning module 204: the recognition of test papers often follows fixed or common patterns. This module records the data that the AI analysis module has identified, that meets the conditions for use and that is usable, forming the layout analysis experience library, the topic type analysis experience library, the topic sequence analysis experience library and the major/sub-topic analysis experience library. The AI analysis module 202 uses these preferentially as experience modules when executing the next recognition task, which increases recognition speed. Empirical patterns that are already stored need not be stored again.
The methods executed in the AI learning module 204 include:
Formation of the layout analysis experience library: the layout analysis module distinguishes cases such as one, two, three or four columns. If these cases are finally verified as meeting the requirements, the AI learning module records the number of columns and the width, length and position of the horizontal and vertical blank spaces of the layout as one type of experience module.
Formation of the topic type analysis experience library: the basic topic types are generally single-choice, multiple-choice, true/false, fill-in-the-blank and question-and-answer topics, and these serve as the basis for matching. Wider application, however, brings many uncertain cases. Once such topic types have been identified and found to meet the requirements, they are also recorded as topic types, and the rules and procedures used to identify them are recorded automatically in the topic type analysis experience library.
Formation of the topic sequence analysis experience library: during sequence recognition, continuous topic sequences that meet the requirements are identified, for example 1, 2, 3, ...; One, Two, Three, ...; A, B, C, .... These topic sequences, together with the rules and procedures for recognizing the serial-number text, are recorded as modules in the topic sequence analysis experience library.
Formation of the major/sub-topic analysis experience library: when major topics and sub-topics are recognized, the major topic usually comes first and its sub-topics follow. This recognition process also yields regular patterns in the text and numbering, often tied to the topic type, and this information is saved as experience.
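The check for continuous topic sequences mentioned for the topic sequence analysis experience library can be sketched as a split into runs of consecutive numbers. This is an illustrative helper with a hypothetical name, `continuous_runs`; the source does not say how continuity is tested.

```python
from typing import List


def continuous_runs(numbers: List[int]) -> List[List[int]]:
    """Split serial numbers, in reading order, into runs where each number
    is exactly one greater than the previous one."""
    runs: List[List[int]] = []
    for n in numbers:
        if runs and n == runs[-1][-1] + 1:
            runs[-1].append(n)
        else:
            runs.append([n])
    return runs
```

A single long run suggests one continuous topic sequence, while a restart at 1 or a jump in the numbers starts a new run that may indicate a new major topic or a recognition gap worth verifying.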
As shown in Fig. 3, a topic reading device 300 according to another embodiment of the invention includes a processor 302. When executing a computer program, the processor 302 implements the steps of the topic read method disclosed in any of the above technical solutions. The topic reading device 300 therefore has the advantageous effects of any of the above topic read methods, which are not repeated here.
As shown in Fig. 3, a topic input device 400 according to an embodiment of the invention includes: a control module 402, configured to receive the image of a topic and/or enter the identified topic into a database; a memory 404 in which the database is configured, the database being used to store topics; and the topic reading device 300 provided by the above technical solution, which includes a processor that, when executing a computer program, implements the steps of the topic read method disclosed in any of the above technical solutions. The topic input device 400 therefore has the advantageous effects of any of the above topic read methods, which are not repeated here.
In addition, in this technical solution, the control module 402 handles data exchange and flow control. For example, it receives picture files from a scanning device or from an external source, passes the picture files to the topic reading device 300, receives and verifies the analysis results of the topic reading device 300, and stores them in the database in the memory 404. The database records the text data of the identified topics, including the stem, options, question, figures and answers of each topic.
An embodiment of the invention further provides a computer-readable storage medium on which a computer program is stored. When executed, the computer program implements the steps of the topic read method described in any of the above technical solutions. The computer-readable storage medium therefore has the advantageous effects of any of the above topic read methods, which are not repeated here.
According to the technical solution of the invention, topics in paper test papers, training handbooks, homework books and questionnaires are digitized and entered quickly into a computer network system by applying image-text recognition through layout analysis, topic type analysis, topic sequence analysis, major/sub-topic analysis and topic content analysis. During the analysis, a layout analysis experience library, a topic type analysis experience library, a topic sequence analysis experience library and a major/sub-topic analysis experience library are formed, which saves time for subsequent analysis tasks and raises recognition efficiency step by step. Topic analysis is carried out using artificial intelligence (machine learning) technology: the established experience libraries guide the topic recognition process, and new experience accumulated during recognition is used to train the experience libraries. This can significantly increase the speed of topic analysis (recognition), reduce the time required for topic recognition and entry, and improve the efficiency and accuracy of the topic entry process.
Those skilled in the art should understand that embodiments of the invention may be provided as a method, a system or a computer program product. The invention may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM and optical memory) containing computer-usable program code.
The invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in them, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
It should be noted that in the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of components or steps not listed in a claim. The word "a" or "an" preceding a component does not exclude the presence of a plurality of such components. The invention can be implemented by means of hardware comprising several distinct components and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second and third does not indicate any order; these words may be interpreted as names.
Although preferred embodiments of the invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the invention.
The above are only preferred embodiments of the invention and are not intended to limit it. For those skilled in the art, the invention may be modified and varied in many ways. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.
Claims (13)
1. A topic read method, characterized by comprising:
identifying the structure and content of a target image;
generating an item analysis experience library according to an empirical model corresponding to the process of identifying the structure and the content;
performing the identification using the item analysis experience library as experience;
generating structured data according to the identified topics.
2. The topic read method according to claim 1, characterized in that identifying the structure of the target image specifically comprises:
using the item analysis experience library as experience, obtaining an image of the topic, and identifying the layout structure of the image according to the layout information, topic type information and topic sequence information of the image, wherein the layout structure includes one or more of the following:
layout, topic type and topic sequence.
3. The topic read method according to claim 2, characterized in that identifying the content of the target image specifically comprises:
using the item analysis experience library as experience, identifying the target information of the image according to the layout structure and the figures and/or text in the image, wherein the target information includes one or more of the following:
stem content, option content, question content, figures and blanks.
4. The topic read method according to claim 3, characterized in that identifying the target information of the image specifically comprises:
determining the start and end coordinates of the topic according to the topic sequence, cutting out the image of the topic according to the start and end coordinates, and reading the target information of the image by image-text recognition.
5. The topic read method according to claim 2, characterized in that identifying the layout of the image specifically comprises:
using the item analysis experience library as experience, identifying the horizontal blank areas and/or vertical blank areas and/or titles of the image by image-text recognition, so as to obtain the sub-partitions of the image.
6. The topic read method according to claim 5, characterized in that identifying the topic type of the image specifically comprises:
on the basis of the sub-partitions, identifying the topic types contained in the image by image-text recognition, wherein the topic types include subjective topics and objective topics, the subjective topics include at least question-and-answer topics, and the objective topics include at least one or more of the following: true/false, single-choice, multiple-choice and fill-in-the-blank topics.
7. The topic read method according to claim 6, characterized in that identifying the topic sequence of the image specifically comprises:
using the item analysis experience library as experience, identifying the serial numbers of the topics in the image by image-text recognition.
8. The topic read method according to claim 7, characterized by further comprising:
determining the hierarchical relationship between the topics according to the serial numbers.
9. The topic read method according to any one of claims 1 to 8, characterized in that generating the item analysis experience library according to the empirical model corresponding to the process of identifying the structure and the content specifically comprises:
generating an empirical model from the rules and procedures used to identify the structure and the content, and generating, from the empirical model, one or more of a layout analysis experience library, a topic type analysis experience library and a topic sequence analysis experience library.
10. The topic read method according to any one of claims 1 to 8, characterized in that generating structured data according to the identified topics specifically further comprises:
identifying the structure of the topic and the content of the topic from the image of the topic by image-text recognition;
generating structured data according to the structure of the topic and the content of the topic, and/or entering the structured data into a database.
11. A topic reading device comprising a processor, characterized in that the processor, when executing a computer program, implements the steps of the topic read method according to any one of claims 1 to 10.
12. A topic input device, characterized by comprising:
a control module, configured to receive the image of a topic and/or enter the identified topic into a database;
a memory in which the database is configured, the database being used to store the topic;
the topic reading device according to claim 11.
13. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed, implements the steps of the topic read method according to any one of claims 1 to 10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910569060.8A CN110427412A (en) | 2019-06-27 | 2019-06-27 | Topic read method, device, topic input device and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110427412A true CN110427412A (en) | 2019-11-08 |
Family
ID=68409792
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910569060.8A Pending CN110427412A (en) | 2019-06-27 | 2019-06-27 | Topic read method, device, topic input device and computer storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110427412A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108932508A (en) * | 2018-08-13 | 2018-12-04 | 杭州大拿科技股份有限公司 | A kind of topic intelligent recognition, the method and system corrected |
CN109634961A (en) * | 2018-12-05 | 2019-04-16 | 杭州大拿科技股份有限公司 | A kind of paper sample generating method, device, electronic equipment and storage medium |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111553365A (en) * | 2020-04-30 | 2020-08-18 | 广东小天才科技有限公司 | Method and device for selecting questions, electronic equipment and storage medium |
CN111553365B (en) * | 2020-04-30 | 2023-11-24 | 广东小天才科技有限公司 | Question selection method and device, electronic equipment and storage medium |
CN112861864A (en) * | 2021-01-28 | 2021-05-28 | 广东国粒教育技术有限公司 | Topic entry method, topic entry device, electronic device and computer-readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20191108 |