CN113762223B

CN113762223B - Question splitting model training method, question splitting method and related device

Info

Publication number: CN113762223B
Application number: CN202111311793.5A
Authority: CN
Inventors: 单海蛟
Original assignee: Beijing Century TAL Education Technology Co Ltd
Current assignee: Beijing Century TAL Education Technology Co Ltd
Priority date: 2021-11-08
Filing date: 2021-11-08
Publication date: 2022-02-15
Anticipated expiration: 2041-11-08
Also published as: CN113762223A

Abstract

The present disclosure provides a question splitting model training method, a question splitting method and a related device. The question splitting model training method comprises: acquiring a training question picture to be split, wherein each text line of the training question picture to be split is labeled with a text line reference category; acquiring, according to the training question picture to be split, text line coordinate features of its text line coordinates, text line content features of its text line contents, and text line picture features corresponding to each text line; and training the question splitting model according to the text line picture features, the text line coordinate features, the text line content features and the text line reference categories to obtain a trained question splitting model. Splitting questions with a model trained by this method can improve the accuracy of question splitting.

Description

Question splitting model training method, question splitting method and related device

Technical Field

The embodiment of the disclosure relates to the field of computers, in particular to a question splitting model training method, a question splitting method and a related device.

Background

With the development of computer technology and deep learning, computer- and network-assisted learning and teaching has become a trend. With the widespread application of artificial intelligence, computer vision, multimodal and other technologies, automatic splitting of the questions in test paper pictures has become possible; related art is described, for example, in patents CN111652141A and CN108304562A.

However, existing question splitting methods have low accuracy. For example:

Splitting based on image recognition results: recognition is first performed on the test paper image, and the recognized text is then split. When the recognition result lacks keywords, errors easily occur.

Splitting based on image structure information: the question structure is detected and the detected image regions are classified. Some questions look very similar on the page but belong to very different question types because their text content differs, so question types are difficult to distinguish by image classification alone. In addition, when a question spans columns or pages, its parts can only be merged by post-processing logic, which is error-prone.

Therefore, how to improve the accuracy of question splitting is a technical problem that urgently needs to be solved.

Disclosure of Invention

The technical problem solved by the embodiments of the present disclosure is to provide a question splitting model training method such that splitting questions with the trained question splitting model improves the accuracy of question splitting.

In order to solve the above problem, an embodiment of the present disclosure provides a question splitting model training method, including:

acquiring a training question picture to be split, wherein each text line of the training question picture to be split is labeled with a text line reference category;

acquiring, according to the training question picture to be split, the text line coordinate features of its text line coordinates, the text line content features of its text line contents, and the text line picture features corresponding to each text line;

and training the question splitting model according to the text line picture features, the text line coordinate features, the text line content features and the text line reference categories to obtain the trained question splitting model.

Compared with the prior art, the technical scheme of the disclosure has the following advantages:

According to the question splitting model training method provided by the embodiments of the present disclosure, when the question splitting model is trained, a training question picture to be split is first acquired, each text line of which is labeled with a text line reference category; then, according to the training question picture to be split, the text line coordinate features of its text line coordinates, the text line content features of its text line contents, and the text line picture features corresponding to each text line are acquired; finally, the question splitting model is trained according to the text line picture features, the text line coordinate features, the text line content features and the text line reference categories to obtain the trained question splitting model.

It can be seen that the method trains the question splitting model according to the text line picture features, text line coordinate features, text line content features and text line reference categories of the training question picture to be split. During training, the category of each text line is predicted by combining the text line picture features, the text line content features and the text line coordinate features. The information contained in these three kinds of features helps ensure the accuracy of the predicted text line categories and text line contents, and thus the accuracy of model training. This reduces the impact on splitting accuracy of inaccurately acquired content in the picture to be split, and improves the accuracy of question splitting.

In an optional scheme, the method acquires a text region reference category for each text region of the training question picture to be split, where each text region includes at least one text line, and then acquires the text line reference category of each text line in a text region according to the text region reference category of that region. In this way, the text line reference categories can be obtained quickly and accurately from the text region reference categories, which reduces the difficulty and workload of obtaining text line reference categories and the cost of training the question splitting model.

Drawings

FIG. 1 is a flow chart of a question splitting model training method provided by an embodiment of the present disclosure;

FIG. 2 is an example diagram of a training question picture to be split, acquired by the question splitting model training method provided by an embodiment of the present disclosure;

FIG. 3 is an example diagram of question text region labeling of the training question picture to be split, acquired by the question splitting model training method provided by an embodiment of the present disclosure;

FIG. 4 is an example diagram of question text line position category labeling of the training question picture to be split, acquired by the question splitting model training method provided by an embodiment of the present disclosure;

FIG. 5 is an example diagram of text region reference category labeling of the training question picture to be split, acquired by the question splitting model training method provided by an embodiment of the present disclosure;

FIG. 6 is an example diagram of text line reference category labeling of the training question picture to be split, acquired by the question splitting model training method provided by an embodiment of the present disclosure;

FIG. 7 is a schematic diagram of the step of training the question splitting model in the question splitting model training method provided by an embodiment of the present disclosure;

FIG. 8 is an example diagram of questions split using the question splitting model training method provided by an embodiment of the present disclosure;

FIG. 9 is a block diagram of a question splitting model training device provided by an embodiment of the present disclosure;

FIG. 10 is an optional hardware architecture of a device provided by an embodiment of the present disclosure.

Detailed Description

In the prior art, the accuracy of question splitting methods is low.

In order to improve the accuracy of question splitting, the embodiments of the present disclosure provide a question splitting model training method, a question splitting method and a related device.

In the question splitting model training method provided by the embodiments of the present disclosure, when the question splitting model is trained, a training question picture to be split is first acquired, each text line of which is labeled with a text line reference category; then, according to the training question picture to be split, the text line coordinate features of its text line coordinates, the text line content features of its text line contents, and the text line picture features corresponding to each text line are acquired; finally, the question splitting model is trained according to the text line picture features, the text line coordinate features, the text line content features and the text line reference categories to obtain the trained question splitting model.

Thus, the method trains the question splitting model according to the text line picture features, text line coordinate features, text line content features and text line reference categories of the training question picture to be split. During training, the category of each text line is predicted by combining the text line picture features, the text line content features and the text line coordinate features. The information contained in these three kinds of features helps ensure the accuracy of the predicted text line categories and text line contents, and thus the accuracy of model training. This reduces the impact on splitting accuracy of inaccurately acquired content in the picture to be split, and improves the accuracy of question splitting.

Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.

It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.

The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description. It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.

It is noted that the modifiers "a", "an", and "the" in this disclosure are intended to be illustrative rather than limiting, and those skilled in the art will understand them to mean "one or more" unless the context clearly dictates otherwise.

The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.

Referring to FIG. 1, FIG. 1 is a flow chart of a question splitting model training method according to an embodiment of the present disclosure.

As shown in the figure, the question splitting model training method provided by the embodiment of the present disclosure includes the following steps:

Step S10: acquire a training question picture to be split, wherein each text line of the training question picture to be split is labeled with a text line reference category.

It is easy to understand that, in order to train the question splitting model, a training question picture to be split is acquired first. The acquired picture may contain questions of any subject, such as Chinese, mathematics or English; the present disclosure places no special limitation on this.

To enable model training, the acquired training question picture to be split needs to be labeled with text line reference categories, which prepares for and facilitates the subsequent splitting of questions.

Specifically, the text line reference categories can be obtained by labeling the reference category of each text line, thereby yielding the labeled training question picture to be split.

In a specific embodiment, in order to facilitate labeling of the text line reference categories, the step of obtaining the text line reference categories may include:

acquiring a text region reference category of each text region of the training question picture to be split, wherein each text region includes at least one text line;

and acquiring the text line reference category of each text line in a text region according to the text region reference category of that text region.

Thus, to obtain the text line reference categories, the text region reference categories are obtained first and the text line reference categories are then derived from them; this step-by-step refinement reduces the workload.

Specifically, the text region reference category of each text region may be obtained by manual labeling or by automatic machine labeling.

The labeling proceeds as follows: first determine the extent of a question in the picture, then draw a bounding box around that extent, and at the same time label its text region reference category.

The text region may be determined from the specific text layout of the particular training question picture to be split. It may be part of a question or a complete question, but may not contain multiple questions, and it includes at least one text line.

By labeling the text regions, the text region reference category of each text region is obtained, so that the text line reference categories can then be derived by a computer or other equipment.

Specifically, the text line reference category of each text line in each text region may be derived from the text region reference category of that text region.

In this way, the text line reference categories of the text lines in each text region of the training question picture to be split can be obtained quickly and accurately from the text region reference categories, which reduces the difficulty and workload of obtaining text line reference categories and the cost of training the question splitting model.

To facilitate understanding of the foregoing solution, a specific case is described with reference to FIGS. 2 to 4, where FIG. 2 is an example diagram of a training question picture to be split, acquired by the question splitting model training method provided by an embodiment of the present disclosure; FIG. 3 is an example diagram of question text region labeling of the training question picture to be split; and FIG. 4 is an example diagram of question text line position category labeling of the training question picture to be split.

FIG. 2 shows an unsplit page of English questions, which includes questions, a footer, and so on. It should be noted that the question splitting model training method provided by this disclosure does not limit the category of the questions; the questions shown in FIG. 2 are merely examples and do not limit the question type.

As shown in the figures, a training question picture to be split, namely the picture shown in FIG. 2, is obtained first. A text region reference category is then obtained for each text region of the picture, that is, each text region in the picture shown in FIG. 2 is labeled, as shown in FIG. 3. Finally, the text line reference category of each text line in each text region is obtained from the text region reference category, yielding a picture labeled with text line reference categories, as shown in FIG. 4.

In a specific implementation, with continued reference to FIGS. 2 to 4, in the question splitting model training method provided by the embodiment of the present disclosure, the text line reference category may include a question text line position category, and the text region reference category may include a question text region position category.

The question text region position category classifies each text region of the question parts of the picture to be split by position, so the text line reference category derived from it is likewise a question text line position category, that is, the text lines of the question parts are classified by position. Questions can then conveniently be split based on the positions of their text lines.

Specifically, the question text region position categories may include: question start, question middle, question end, individual question, and so on. Specifically:

if the current region is a complete, independent question, it is labeled "individual question";

if the current region is the beginning of a question and another part of the question is on the next page or in the next column, it is labeled "question start";

if the current region is the end of a question and another part of the question is on the previous page or in the previous column, it is labeled "question end";

if the current region is the middle part of a question, it is labeled "question middle", where:

"individual question" can be represented by "Q-Single".

"question start" can be represented by "Q-Start".

"question end" can be represented by "Q-End".

"question middle" can be represented by "Q-Mid".

In the above case, as shown in FIG. 3, the questions in the figure are labeled by question text region position category as "question end" ("Q-End"), "individual question" ("Q-Single") and "question start" ("Q-Start").

It can be seen that this set of question text region position categories covers every possible position of a question text region. The regions of the current question are categorized in detail and effectively distinguished from one another, so that the regions of the current question are distinguished from those of other questions, ensuring accurate and complete classification of the text regions.

Therefore, once the question text region position categories of the training question picture to be split have been obtained, the question text line position categories can be derived from them.

In a specific implementation, the question text line position categories of the question splitting model training method according to the embodiment of the present disclosure may include: "question first line", "question middle line", "question end line", and "individual question line".

"question first line" can be represented by "B-Question".

"question middle line" can be represented by "I-Question".

"question end line" can be represented by "E-Question".

"individual question line" can be represented by "S-Question".

This set of question text line position categories covers every possible position of a question text line, so the text lines are classified accurately and comprehensively, and the labeling results of the question text region position categories can be put to good use.

With continued reference to FIGS. 3 and 4, after the questions in FIG. 3 have been labeled by question text region position category as "question end" ("Q-End"), "individual question" ("Q-Single") and "question start" ("Q-Start"), the picture is further processed to obtain the question text line position categories, as shown in FIG. 4:

For the "Q-End" part:

"Q-End" indicates a question end region. Its first 5 lines are all middle lines of the question, so they are labeled "I-Question" (question middle line).

Line 6 is the last line of the "Q-End" part, so it is labeled "E-Question" (question end line).

For the "Q-Single" part:

"Q-Single" indicates an individual question region. The first line of the "Q-Single" part is labeled "B-Question" (question first line).

The last line of the "Q-Single" part is labeled "E-Question" (question end line).

The lines between the first line and the last line of the "Q-Single" part are labeled "I-Question" (question middle line).

For the "Q-Start" part:

"Q-Start" indicates a question start region. The first line of the "Q-Start" part is labeled "B-Question" (question first line).

The second line of the "Q-Start" part is labeled "I-Question" (question middle line).

Of course, if the question text region position category of the current region is "question middle" ("Q-Mid") (not shown in the figure), the question text line position category of every line in the region is "I-Question" (question middle line).

As can be seen from the above description, this labeling method can handle questions that span columns or pages, so that such questions are accurately labeled with question text line position categories.
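
The region-to-line derivation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the disclosed implementation: the function name `question_region_to_line_labels` and the lookup table are hypothetical, and the handling of a one-line "Q-Single" region as "S-Question" is an assumption based on the definition of the "individual question line" category.

```python
from typing import List

# Region label -> (region contains the question's first line,
#                  region contains the question's last line).
REGION_BOUNDS = {
    "Q-Single": (True, True),
    "Q-Start": (True, False),
    "Q-Mid": (False, False),
    "Q-End": (False, True),
}


def question_region_to_line_labels(region_label: str, num_lines: int) -> List[str]:
    """Expand one question text region label into per-line position labels."""
    if num_lines < 1:
        raise ValueError("a text region includes at least one text line")
    has_first, has_last = REGION_BOUNDS[region_label]
    # Assumption: a complete question that fits on one line is an
    # "individual question line".
    if has_first and has_last and num_lines == 1:
        return ["S-Question"]
    # Every line is a middle line unless it opens or closes the question.
    labels = ["I-Question"] * num_lines
    if has_first:
        labels[0] = "B-Question"
    if has_last:
        labels[-1] = "E-Question"
    return labels


if __name__ == "__main__":
    # The three regions of the worked example (line counts for
    # "Q-Single" are illustrative).
    for region, n in [("Q-End", 6), ("Q-Single", 3), ("Q-Start", 2)]:
        print(region, question_region_to_line_labels(region, n))
```

With this rule an annotator only draws and labels regions; the per-line labels follow mechanically, which is the workload reduction the text describes.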

A training question picture to be split may include not only question parts but also title parts. To improve the accuracy and comprehensiveness of training the question splitting model, and thus the accuracy of subsequently splitting questions with it, the title parts may also be labeled, so that the question splitting model is also able to split out title parts.

Setting title text region position categories and title text line position categories allows each text region of the title parts of the picture to be split to be classified by position, so that the trained question splitting model can also split out title parts.

In a specific embodiment, the title text region position categories may include: title start, title middle, title end, individual title.

If the current region is a complete title, it is labeled "individual title"; if it is the beginning of a title and another part of the title is on the next page or in the next column, it is labeled "title start"; if it is the end of a title and another part of the title is on the previous page or in the previous column, it is labeled "title end"; if it is the middle part of a title, it is labeled "title middle", where:

"individual title" may be represented by "T-Single".

"title start" may be represented by "T-Start".

"title end" may be represented by "T-End".

"title middle" may be represented by "T-Mid".

This set of title text region position categories covers every possible position of a title text region, ensuring accurate and comprehensive classification of the title text regions.

Specifically, the title text line position categories may include: "title first line", "title middle line", "title end line", and "individual title line".

"title first line" can be represented by "B-Title".

"title middle line" can be represented by "I-Title".

"title end line" can be represented by "E-Title".

"individual title line" can be represented by "S-Title".

This set of title text line position categories covers every possible position of a title text line, ensuring accurate and comprehensive classification of the title text lines.

Further, in another specific embodiment, since the training question picture to be split may also include a header and a footer, the text line reference categories of the header and footer parts may also be obtained in order to further improve the accuracy of model training. Correspondingly, if text region reference categories are obtained first, the text region reference categories may further include a header and a footer.

Headers and footers can be classified similarly. However, because headers and footers do not span pages, they can simply be labeled "individual header" and "individual footer", i.e.:

"individual header" may be represented by "H-Single".

"individual footer" may be represented by "F-Single".

Once the headers and footers in the training question picture to be split have been obtained, the header text line position categories and the footer text line position categories can be derived from them.

The header text line position categories may include: "header first line", "header middle line", "header end line", and "individual header line".

"header first line" can be represented by "B-Header".

"header middle line" can be represented by "I-Header".

"header end line" can be represented by "E-Header".

"individual header line" can be represented by "S-Header".

The footer text line position categories may include: "footer first line", "footer middle line", "footer end line" and "individual footer line".

"footer first line" can be represented by "B-Footer".

"footer middle line" can be represented by "I-Footer".

"footer end line" can be represented by "E-Footer".

"individual footer line" can be represented by "S-Footer".
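
Taken together, the question, title, header and footer line categories form a closed set of sixteen labels. The following sketch shows one way such a set could be enumerated and mapped to integer class IDs for use as classification targets. The integer encoding, the ordering, and the names `LINE_LABELS`, `LABEL_TO_ID` and `encode` are assumptions for illustration; the disclosure does not specify how categories are encoded.

```python
from itertools import product

KINDS = ("Question", "Title", "Header", "Footer")
POSITIONS = ("B", "I", "E", "S")  # first, middle, end, individual line

# "B-Question", "I-Question", ..., "S-Footer"
LINE_LABELS = [f"{pos}-{kind}" for kind, pos in product(KINDS, POSITIONS)]

# Hypothetical integer encoding of the reference categories.
LABEL_TO_ID = {label: idx for idx, label in enumerate(LINE_LABELS)}
ID_TO_LABEL = {idx: label for label, idx in LABEL_TO_ID.items()}


def encode(labels):
    """Map a sequence of text line labels to integer class IDs."""
    return [LABEL_TO_ID[label] for label in labels]


if __name__ == "__main__":
    print(len(LINE_LABELS))
    print(encode(["B-Title", "E-Title", "S-Footer"]))
```

An unknown or misspelled label raises a `KeyError` in `encode`, which is a cheap way to catch annotation typos before training.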

For the convenience of understanding, the following description is continued with reference to the foregoing case:

Referring to FIG. 5 and FIG. 6, FIG. 5 is an example diagram of text region reference category labeling of the training question picture to be split, acquired by the question splitting model training method provided by an embodiment of the present disclosure; FIG. 6 is an example diagram of text line reference category labeling of the training question picture to be split.

As can be seen from FIG. 5, the training question picture to be split is labeled by question text region position category, title text region position category, and footer.

1. Question text region reference categories of the question text regions

The "Q-End" part represents a question end region, the "Q-Single" part represents an individual question region, and the "Q-Start" part represents a question start region.

2. Title text region reference categories of the title text regions

The "T-Single" parts represent complete titles.

3. Footer text region reference category of the footer text region

The "F-Single" part represents a complete footer.

As shown in fig. 6, each line of the training topic picture to be split in the graph is further labeled according to the reference category of the text region labeled in fig. 5, so as to obtain the reference category of the text line:

1. question text line reference category for question text region:

the "Q-End" section, "Q-End", indicates the title End region, and the first 5 rows are all the middle rows of the "Q-End" section, so labeled: "I-Question" indicates the middle row of the title; line 6, the last line of the "Q-End" section, so labeled: "E-Question" indicates the title end row.

A "Q-Single" part, wherein the "Q-Single" represents an individual title area, and a first line of the "Q-Single" part is marked as "B-Question" and represents a first line of a title; the last line of the Q-Single part is marked as 'E-Question', and the last line represents the end line of the title; the middle part of the first line and the last line of the Q-Single part is marked as I-Question and represents the middle line of the title.

A part Q-Start, wherein the part Q-Start indicates the starting area of the title, and the first line of the part Q-Start is marked as 'B-Question' and indicates the first line of the title; the second row of the "Q-Start" section is labeled "I-Question", indicating the middle row of the title.

2. Title text line reference category of title text area:

in the first title text area, since the current title is a complete title, it is labeled "T-Single", and the first line of the first action "T-Single" is labeled: "B-Title" representing the first line of the Title; the last row of the second behavior "T-Single", so labeled: "E-Title" indicates the Title end line.

In the second title text area, since the current title is a complete title, it is labeled "T-Single", and there is only one line of titles in "T-Single", and therefore it is labeled: "S-Title" indicates an individual Title line.

3. Footer text line reference category for footer area:

since the current footer is a full footer, it is labeled "F-Single", and there is only one row of content in "F-Single", and therefore labeled: "S-Foote" refers to an individual Footer row.

As can be seen from the above description, this labeling method can handle questions that span columns or pages, so that such questions are accurately labeled by means of the text region reference categories.
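For illustration only, the rules for deriving text line reference categories from a text region reference category can be sketched as follows. The function name `line_labels`, the prefix table, and the handling of "Mid" regions (every line labeled as a middle line) are assumptions made for this sketch and are not prescribed by the embodiment.

```python
from typing import List

# Maps the region prefix to the suffix used in line labels (names follow the text).
KIND_BY_PREFIX = {"Q": "Question", "T": "Title", "H": "Header", "F": "Footer"}


def line_labels(region_label: str, num_lines: int) -> List[str]:
    """Expand one region label, e.g. "Q-Start", into one label per text line."""
    if num_lines < 1:
        raise ValueError("a text region contains at least one text line")
    prefix, position = region_label.split("-")
    kind = KIND_BY_PREFIX[prefix]

    # Start with every line as a middle line, then mark the boundary lines.
    labels = [f"I-{kind}"] * num_lines
    if position == "Single":
        if num_lines == 1:
            return [f"S-{kind}"]          # an individual line
        labels[0] = f"B-{kind}"
        labels[-1] = f"E-{kind}"
    elif position == "Start":
        labels[0] = f"B-{kind}"           # the end lies in a later region
    elif position == "End":
        labels[-1] = f"E-{kind}"          # the start lies in an earlier region
    elif position != "Mid":               # "Mid": all lines stay I-xxx (assumption)
        raise ValueError(f"unknown region position: {position}")
    return labels
```

Applied to the regions described for fig. 6, `line_labels("Q-End", 6)` gives five "I-Question" labels followed by "E-Question", and `line_labels("T-Single", 1)` gives a single "S-Title" label.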

Of course, on the basis of the text region reference category labeling described above, the types of text region reference categories may be further refined. For example, in addition to the position categories of the question text regions and the title text regions, labels for specific question types, such as multiple-choice questions and fill-in-the-blank questions, may be added to obtain more detailed text region reference categories. If the structure of a question also needs to be split, still finer categories may be defined, and the question stem and the options may be labeled with a certain relationship between them. The question splitting model training method provided by the embodiment of the present disclosure places no special limitation on the labels of the training to-be-split question picture, and any practicable labeling method can be applied to it.

After the training to-be-split question picture labeled with the text line reference categories is obtained, it is processed further:

continuing to refer to fig. 1, step S11: according to the training to-be-split question picture, obtain the text line coordinate features of the text line coordinates, the text line content features of the text line content, and the text line picture feature corresponding to each text line.

After the training to-be-split question picture is obtained, the text line coordinate feature, the text line content feature, and the text line picture feature of each text line are obtained based on the picture.

It should be noted that the training to-be-split topic picture described herein refers to the picture itself, and does not cover the information of the text line reference category.

It is easy to understand that the text line coordinate feature is a feature indicating the position of a text line. Specifically, it may be a feature representation of the upper-left and lower-right corner coordinates of the frame in which the text line is located, or a feature representation of one of these two corners together with its distances to the other corner in the X direction and the Y direction. The text line content feature is a feature representation of the content of the corresponding text line. The text line picture feature covers all image features of the text line, including feature representations of font type, font size, bold, italics, underlining, text line outline, text line color, and the like.
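The two equivalent ways of describing a text line frame mentioned above can be sketched as a small data structure. The class name `TextLineBox` and its method names are hypothetical and serve only to illustrate the two representations.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TextLineBox:
    """Frame of one text line, stored as upper-left and lower-right corners."""
    x1: float
    y1: float
    x2: float
    y2: float

    def as_corners(self) -> Tuple[float, float, float, float]:
        # Representation 1: both corner coordinates.
        return (self.x1, self.y1, self.x2, self.y2)

    def as_corner_and_extent(self) -> Tuple[float, float, float, float]:
        # Representation 2: one corner plus the X and Y distances to the other.
        return (self.x1, self.y1, self.x2 - self.x1, self.y2 - self.y1)

    @classmethod
    def from_corner_and_extent(cls, x: float, y: float, dx: float, dy: float) -> "TextLineBox":
        # Inverse of as_corner_and_extent.
        return cls(x, y, x + dx, y + dy)
```

Either tuple could then be passed to whatever module turns coordinates into features; the two forms carry the same information.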

Specifically, the training to-be-split question picture may be processed by OCR to obtain the text line coordinates and the text line content of the training to-be-split question. The text line coordinates, the text line content, and the training to-be-split question picture are then each converted into features, yielding the text line coordinate feature and the text line content feature corresponding to each text line, and the picture feature corresponding to the whole training to-be-split question picture. The text line picture feature corresponding to each text line is then obtained from the picture feature of the whole picture.

In a specific embodiment, in order to improve the training effect on the topic splitting model, the obtained text line coordinates may be normalized first, and then the text line coordinate features may be obtained.

For example, let the coordinates of the upper-left corner of a text line be (X1, Y1) and the coordinates of the lower-right corner be (X2, Y2); the coordinates are normalized to within 1000, and the normalization formula is:

where width is the maximum line width coordinate of the picture, and height is the maximum height coordinate of all lines of the picture.

After the text line coordinates are normalized, the model training effect is better.
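One way to normalize coordinates to within 1000 can be sketched as follows. The sketch assumes a plain linear scaling of X by width and Y by height, as is common for layout models; this scaling, the function name `normalize_box`, and the clamping are assumptions and not necessarily the exact formula of the embodiment.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (X1, Y1, X2, Y2)


def normalize_box(box: Box, width: float, height: float,
                  scale: int = 1000) -> Tuple[int, int, int, int]:
    """Scale corner coordinates linearly into the range [0, scale]."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    x1, y1, x2, y2 = box

    def scaled(value: float, size: float) -> int:
        # Assumed formula: scale * value / size, clamped to [0, scale].
        return max(0, min(scale, int(scale * value / size)))

    return (scaled(x1, width), scaled(y1, height),
            scaled(x2, width), scaled(y2, height))
```

Scaling in this way makes the coordinates independent of the resolution of the picture, which is consistent with the statement that normalization benefits training.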

The normalized coordinates are then converted into features to obtain the text line coordinate features.

For the text line content, the text line content features can be obtained by using an embedding module.

Step S12: train the question splitting model according to the text line picture features, the text line coordinate features, the text line content features, and the text line reference categories to obtain the trained question splitting model.

After the text line coordinate features, the text line content features, and the text line picture features are obtained based on the training to-be-split question picture, the question splitting model is trained in combination with the text line reference categories until the training requirement is met, yielding the trained question splitting model.

Specifically, to explain the training process more clearly, please refer to fig. 7, which is a schematic diagram of the step of training the question splitting model in the question splitting model training method according to the embodiment of the present disclosure.

As shown in fig. 7, in a specific embodiment, the step of training the question splitting model according to the text line picture feature, the text line coordinate feature, the text line content feature, and the text line reference category to obtain the trained question splitting model may include:

step S70: obtain the text line prediction category of each text line of the training to-be-split question picture by using the question splitting model according to the text line picture features, the text line coordinate features, and the text line content features.

After the text line picture features, the text line coordinate features, and the text line content features of the training to-be-split question picture are obtained, the text line picture features, the text line coordinate feature and text line content feature of each line of text in the picture, and the text line reference category corresponding to each text line are sent to the question splitting model, which predicts the category of each text line to obtain the text line prediction categories.

In a specific embodiment, the question splitting model may adopt a LayoutLMv2 model.

Of course, in order to improve the accuracy of the obtained text line prediction categories and improve the training efficiency of the model, in a specific embodiment, the step of obtaining the text line prediction categories of each text line of the training to-be-split question picture by using the question splitting model according to the text line picture features, the text line coordinate features, and the text line content features may include:

and acquiring the text line picture characteristics, the text line coordinate characteristics and the fusion characteristics of the text line content characteristics by using the question splitting model, and acquiring the text line prediction category of each text line of the training question picture to be split according to the fusion characteristics.

Specifically, the question splitting model may obtain the fusion features by adding together the text line picture features, the text line coordinate features, and the text line content features.
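Fusion by addition can be illustrated with a minimal sketch. It assumes that the three features of a text line have already been projected to vectors of the same dimension, which the text does not state explicitly; the function name `fuse_features` is hypothetical.

```python
from typing import List, Sequence


def fuse_features(picture: Sequence[float],
                  coordinate: Sequence[float],
                  content: Sequence[float]) -> List[float]:
    """Fuse the three features of one text line by element-wise addition."""
    # Addition is only defined when all three vectors share one dimension.
    if not (len(picture) == len(coordinate) == len(content)):
        raise ValueError("features must have the same dimension to be added")
    return [p + c + t for p, c, t in zip(picture, coordinate, content)]
```

The fused vector keeps the dimension of its inputs, so a classifier that predicts the text line category can consume it directly.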

And after the fusion characteristics are obtained, the text line prediction category corresponding to the text line is obtained based on the fusion characteristics.

Therefore, the text line prediction categories are acquired by acquiring the fusion characteristics, and the accuracy of information used in model training can be improved, so that the accuracy of the acquired text line prediction categories can be improved, and the training efficiency of the model is improved.

Step S71: obtain the related loss according to the text line prediction categories and the text line reference categories.

After the text line prediction category of each text line of the training to-be-split title picture is obtained, in order to realize the training of the title splitting model, the related loss is further obtained according to the text line prediction category and the text line reference category.

Step S72: it is determined whether the correlation loss satisfies a predetermined loss threshold, and if so, the step S74 is performed, and if not, the step S73 is performed.

Step S73: and adjusting the parameters of the topic splitting model.

And if the correlation loss does not meet the preset loss threshold, adjusting parameters of the topic splitting model, then obtaining the correlation loss according to the text line prediction category and the text line reference category until the correlation loss meets the preset loss threshold in the step S72, and executing the step S74.

Step S74: and obtaining the trained question splitting model.

After the steps, the trained question splitting model is obtained.
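The loop formed by steps S70 to S74 can be sketched with a toy model. In this sketch a linear softmax classifier stands in for the question splitting model, cross-entropy stands in for the related loss, and gradient descent stands in for the parameter adjustment; all of these choices, as well as the names `train`, `lr`, and the `max_steps` safety cap, are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(features: List[List[float]], references: List[int], num_classes: int,
          loss_threshold: float = 0.05, lr: float = 0.5,
          max_steps: int = 10000) -> Tuple[List[List[float]], float]:
    """Train until the loss satisfies the threshold (or max_steps is reached)."""
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    loss = float("inf")
    for _ in range(max_steps):
        # S70: predict a category distribution for every text line.
        probs = [softmax([sum(w * x for w, x in zip(row, f)) for row in weights])
                 for f in features]
        # S71: related loss, here the mean cross-entropy against the references.
        loss = -sum(math.log(p[r]) for p, r in zip(probs, references)) / len(features)
        # S72: stop once the loss satisfies the predetermined threshold.
        if loss <= loss_threshold:
            break
        # S73: adjust the parameters with one batch gradient-descent step.
        for f, p, r in zip(features, probs, references):
            for k in range(num_classes):
                grad = p[k] - (1.0 if k == r else 0.0)
                for j in range(dim):
                    weights[k][j] -= lr * grad * f[j] / len(features)
    # S74: return the trained parameters and the last measured loss.
    return weights, loss
```

A real implementation would replace the toy classifier with the actual model and its optimizer; only the control flow (predict, measure loss, compare with the threshold, adjust, repeat) mirrors the steps above.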

In summary, in the question splitting model training step provided by the embodiment of the present disclosure, the question splitting model is first used to obtain the text line prediction category of each text line of the training to-be-split question picture according to the text line picture features, the text line coordinate features, and the text line content features, so that the obtained text line prediction categories are more detailed and accurate. On this basis, the related loss is obtained from the text line prediction categories and the text line reference categories, and a predetermined loss threshold is set for the related loss. While the related loss does not satisfy the predetermined loss threshold, the parameters of the question splitting model are adjusted; once the related loss satisfies the threshold, the trained question splitting model is obtained. The trained question splitting model can therefore meet the quality requirement set by the text line reference categories and can split the questions in a to-be-split question picture with high quality according to the text line reference categories.

Please refer to fig. 8, which is an example diagram of questions split by a model trained with the question splitting model training method provided by the embodiment of the present disclosure. As can be seen from the figure, because the method uses the text line picture features, the text line coordinate features, and the text line content features of the training to-be-split question picture, combining the image with the coordinates and the content, the splitting result is refined directly to a single line. The content of each question can be obtained directly without subsequent processing, and the obtained splitting result is more accurate.

On the basis of the topic splitting model training method provided by the embodiment of the disclosure, the embodiment of the disclosure further provides a topic splitting method, which includes:

obtaining a predicted topic picture to be split;

the question splitting model obtained by training the question splitting model training method provided by the embodiment of the disclosure is utilized to split the question of the predicted to-be-split picture, so as to obtain the text line splitting classification of each text line.

It is easy to understand that the predicted to-be-split question picture may be a question picture of any subject, such as Chinese, mathematics, or English. The question splitting model described here refers to the whole formed by the OCR, the feature representation modules, and the model that obtains the text line prediction categories, as previously described.

After the topic picture to be predicted and split is obtained, the topic splitting model obtained by training by using the topic splitting model training method provided by the embodiment of the disclosure is utilized to perform topic splitting on the topic picture to be predicted and split, so that the text line splitting classification of each text line can be obtained.

In this way, the question splitting method provided by the embodiment of the present disclosure obtains the text line splitting category of each text line of the predicted question picture to be split according to the text line picture feature, the text line coordinate feature, and the text line content feature, and can ensure the accuracy of the obtained text line splitting category by using the information included in the text line picture feature, the text line content feature, and the text line coordinate feature, thereby improving the accuracy of the question splitting.

Of course, to complete the question splitting, after the text line splitting category (that is, the text line reference category output by the question splitting model) is obtained, the text lines of the predicted to-be-split question picture are combined and connected according to the text line splitting categories to obtain each split question.

Specifically, following the labeling scheme of the text line reference categories described above, non-question regions are skipped and the text lines are traversed from top to bottom. The first "B-xxx" line encountered (for example, "B-Question") is combined with the nearest following "E-xxx" line (for example, "E-Question") to obtain a complete question. If the categories on the current page are all "I-xxx" (for example, "I-Question"), the nearest "B-xxx" line is searched for upward and the nearest "E-xxx" line is searched for downward, and these lines are combined to obtain a complete question, which enables connection across pages.
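The connection rule described above can be sketched as a single pass over the labeled text lines. The sketch assumes the lines of all pages are supplied as one sequence in reading order, so that a question continuing on the next page is joined automatically; the function name `merge_questions` and the decision to keep unfinished groups are assumptions.

```python
from typing import Iterable, List, Tuple


def merge_questions(lines: Iterable[Tuple[str, str]],
                    kind: str = "Question") -> List[List[str]]:
    """Group (label, text) pairs, in reading order across pages, into questions."""
    questions: List[List[str]] = []
    current: List[str] = []  # lines of the question being assembled
    for label, text in lines:
        position, _, line_kind = label.partition("-")
        if line_kind != kind:
            continue  # titles, headers and footers are skipped, not treated as breaks
        if position == "S":
            questions.append([text])          # an individual line is a whole question
        elif position == "B":
            if current:                       # keep an unfinished group (assumption)
                questions.append(current)
            current = [text]
        elif position in ("I", "E"):
            current.append(text)
            if position == "E":               # nearest following E-xxx closes the question
                questions.append(current)
                current = []
    if current:                               # trailing group without an end line
        questions.append(current)
    return questions
```

Because footers and titles between pages are skipped rather than treated as boundaries, a question that starts with "B-Question" on one page and ends with "E-Question" on the next is returned as one group.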

Therefore, by using the topic splitting method provided by the embodiment of the disclosure, the split topic rows can be connected, so that a complete split topic is obtained, and the connection process is very simple and convenient and the connection result is more accurate no matter whether the split topic rows are connected across pages or not.

In summary, with the question splitting method provided by the embodiment of the present disclosure, the question splitting model can conveniently split a predicted to-be-split question picture down to individual text lines and obtain an accurate text line splitting category for each text line. The process is simple and easy to carry out, and no subsequent processing is needed. After the text line splitting categories are obtained, the text lines can be connected according to the text line splitting category of each text line to obtain complete split questions, including questions that span pages. Whether or not a question spans pages, the connection process is simple and convenient, and the connection result is more accurate.

In the following, the topic splitting model training device provided by the embodiment of the present disclosure is introduced, and the topic splitting model training device described below may be considered as a functional module architecture that is required to be set by an electronic device (e.g., a PC) to respectively implement the topic splitting model training method provided by the embodiment of the present disclosure. The contents of the topic splitting model training device described below can be referred to in correspondence with the contents of the topic splitting model training method described above, respectively.

Fig. 9 is a block diagram of a topic splitting model training device provided in an embodiment of the present disclosure, where the topic splitting model training device may be applied to a client or a server, and referring to fig. 9, the topic splitting model training device may include:

the training to-be-split topic picture obtaining unit 901 is adapted to obtain a training to-be-split topic picture, where each text line of the training to-be-split topic picture is labeled with a text line reference category.

The feature obtaining unit 902 is adapted to obtain, according to the training picture of the topic to be split, a text line coordinate feature of a text line coordinate of the training topic to be split, a text line content feature of a text line content of the training topic to be split, and a text line picture feature corresponding to each text line.

The question splitting model training unit 903 is adapted to train the question splitting model according to the text line picture features, the text line coordinate features, the text line content features, and the text line reference categories, so as to obtain the trained question splitting model.

It can be seen that, when the question splitting model training device provided by the embodiment of the present disclosure trains a question splitting model, the training to-be-split question picture obtaining unit 901 first obtains a training to-be-split question picture, each text line of which is labeled with a text line reference category; the feature obtaining unit 902 then obtains, according to the training to-be-split question picture, the text line coordinate features of the text line coordinates, the text line content features of the text line content, and the text line picture feature corresponding to each text line; and finally the question splitting model training unit 903 trains the question splitting model according to the text line picture features, the text line coordinate features, the text line content features, and the text line reference categories to obtain the trained question splitting model.

Thus, the question splitting model training device provided by the embodiment of the present disclosure trains the question splitting model according to the text line picture features, the text line coordinate features, the text line content features, and the text line reference categories of the training to-be-split question picture. During model training, the category of each text line is predicted by combining the text line picture features, the text line content features, and the text line coordinate features. The information contained in these features helps ensure the accuracy of the predicted text line categories and text line content, and thus the accuracy of model training. This reduces the effect that inaccurately obtained content of the training to-be-split question picture would otherwise have on the accuracy of question splitting, and improves the accuracy of question splitting.

To train the question splitting model, the training to-be-split question picture obtaining unit 901 obtains a training to-be-split question picture, each text line of which is labeled with a text line reference category.

a text region reference category acquiring unit, adapted to acquire a text region reference category of each text region of the training topic picture to be split, where the text region includes at least one text line;

a text line reference type obtaining unit, adapted to obtain the text line reference type of each text line in the text region according to each text region reference type of each text region.

In a specific implementation manner, with continuing to refer to fig. 2 to 4, in the title splitting model training method provided in the embodiment of the present disclosure, the text line reference category acquired by the text line reference category acquiring unit may include a question text line position category, and the text region reference category acquired by the text region reference category acquiring unit includes a question text region position category.

the "individual topic" can be represented by "Q-Single".

"title Start" can be represented by "Q-Start".

"End of title" can be represented by "Q-End".

"topic middle" can be represented by "Q-Mid".

The "first line of title" can be represented by "B-Question".

The "topic middle line" can be represented by "I-Question".

The "title end line" can be represented by "E-Question".

The "individual title line" can be represented by "S-Question".

for the "Q-End" part:

For the "Q-Single" part:

For the "Q-Start" portion:

The training to-be-split question picture may include not only question parts but also title parts. To improve the accuracy and comprehensiveness of the training of the question splitting model, and thereby the accuracy of subsequent question splitting with the model, the title parts may also be labeled so that the question splitting model is able to split them. Accordingly, in another specific embodiment, the text line reference category obtained by the text line reference category obtaining unit may further include a title text line position category, and the text region reference category obtained by the text region reference category obtaining unit may correspondingly further include a title text region position category.

the "individual title" may be represented by "T-Single".

"title Start" may be represented by "T-Start".

"title End" may be represented by "T-End".

"title middle" may be represented by "T-Mid".

The "Title first line" can be represented by "B-Title".

The "Title middle line" can be represented by "I-Title".

The "Title end line" can be represented by "E-Title".

The "individual Title line" can be represented by "S-Title".

Further, in another specific embodiment, the training to-be-split question picture may also include a header and a footer. To further improve the accuracy of model training, the text line reference categories of the header and footer parts may also be obtained when the text line reference categories are obtained. Correspondingly, if the text line reference categories are obtained by first obtaining the text region reference categories, the text region reference categories obtained by the text region reference category obtaining unit may further include: a header and a footer.

The "individual header" may be represented by "H-Single".

The "individual footer" can be represented by "F-Single".

The "Header first line" can be represented by "B-Header".

The "Header middle line" may be represented by "I-Header".

The "Header end line" can be represented by "E-Header".

The "individual Header line" may be denoted by "S-Header".

The "Footer first line" may be denoted by "B-Footer".

The "Footer middle line" may be denoted by "I-Footer".

The "Footer end line" may be denoted by "E-Footer".

The "individual Footer line" may be denoted by "S-Footer".

1. Question text region reference category of question text region.

2. The title text region reference category of the title text region.

Representing the "T-Single" part of the complete title.

3. A footer text region reference category for the footer text region.

Representing the "F-Single" portion of the full footer.

1. question text line reference category for question text region:

2. Title text line reference category of title text area:

3. Footer text line reference category for footer area:

feature acquisition unit 902: according to the training picture of the subject to be split, obtaining the text line coordinate characteristics of the text line coordinates of the training picture of the subject to be split, the text line content characteristics of the text line contents of the training picture of the subject to be split and the text line picture characteristics corresponding to each text line.

Topic splitting model training unit 903: and training the question splitting model according to the text line picture characteristics, the text line coordinate characteristics, the text line content characteristics and the text line reference categories to obtain the trained question splitting model.

step S70: the text line prediction category obtaining unit may obtain, according to the text line picture feature, the text line coordinate feature, and the text line content feature, a text line prediction category of each text line of the training topic picture to be split by using a topic splitting model.

the fusion feature obtaining unit may obtain the text line picture features, the text line coordinate features, and the fusion features of the text line content features by using the topic splitting model, and obtain the text line prediction categories of each text line of the training topic picture to be split according to the fusion features.

Step S71: and the question splitting model training unit can acquire the related loss according to the text line prediction category and the text line reference category.

Step S72: the topic splitting model training unit can also judge whether the related loss meets a predetermined loss threshold, if so, execute step S74, and if not, execute step S73.

Step S73: the question splitting model training unit can also adjust the parameters of the question splitting model.

Step S74: through the steps, the trained question splitting model can be obtained through the question splitting model training unit.

After the steps, the trained question splitting model is obtained.

On this basis, this disclosed embodiment also provides a topic splitting device, and the topic splitting device includes:

and the predicted topic picture to be split obtaining unit is suitable for obtaining the predicted topic picture to be split.

The topic splitting unit is adapted to split the topic of the predicted topic picture to be split by using the topic splitting model obtained by training the topic splitting model training method according to any one of the foregoing embodiments, so as to obtain the text line splitting category of each text line.

In a specific embodiment, the topic splitting apparatus further includes:

a question connecting unit, adapted to combine and connect the text lines of the predicted to-be-split question picture according to the text line splitting categories to obtain each split question.

To realize question splitting, the predicted to-be-split question picture obtaining unit obtains the predicted to-be-split question picture; the question splitting unit then uses the question splitting model trained by the question splitting model training method provided by the embodiment of the present disclosure to split the predicted to-be-split question picture, so as to obtain the text line splitting category of each text line.

Of course, to complete the question splitting, after the text line splitting category (that is, the text line reference category output by the question splitting model) is obtained, the question connecting unit may combine and connect the text lines of the predicted to-be-split question picture according to the text line splitting categories to obtain each split question.

An exemplary embodiment of the present disclosure also provides an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor. The memory stores a computer program executable by the at least one processor, the computer program, when executed by the at least one processor, is for causing the electronic device to perform a method according to an embodiment of the disclosure.

The disclosed exemplary embodiments also provide a non-transitory computer readable storage medium storing a computer program, wherein the computer program, when executed by a processor of a computer, is adapted to cause the computer to perform a method according to an embodiment of the present disclosure.

When a question splitting model is trained, a training to-be-split question picture is first obtained, each text line of which is labeled with a text line reference category; then, according to the training to-be-split question picture, the text line coordinate features of the text line coordinates, the text line content features of the text line content, and the text line picture feature corresponding to each text line are obtained; and finally, the question splitting model is trained according to the text line picture features, the text line coordinate features, the text line content features, and the text line reference categories to obtain the trained question splitting model. It can be seen that the question splitting model training method provided by the embodiment of the present disclosure trains the question splitting model according to the text line picture features, the text line coordinate features, the text line content features, and the text line reference categories of the training to-be-split question picture. During model training, the category of each text line is predicted by combining the text line picture features, the text line content features, and the text line coordinate features. The information contained in these features helps ensure the accuracy of the predicted text line categories and text line content, and thus the accuracy of model training. This reduces the effect that inaccurately obtained content of the training to-be-split question picture would otherwise have on the accuracy of question splitting, and improves the accuracy of question splitting.

Referring to fig. 10, a block diagram of the structure of an electronic device 800 will now be described. The electronic device 800 may be a server or a client of the present disclosure and is an example of a hardware device that may be applied to aspects of the present disclosure. Electronic device is intended to represent various forms of digital electronic computer devices, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 10, the electronic device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. The RAM 803 can also store various programs and data required for the operation of the electronic device 800. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.

A number of components in the electronic device 800 are connected to the I/O interface 805, including: an input unit 806, an output unit 807, a storage unit 808, and a communication unit 809. The input unit 806 may be any type of device capable of inputting information to the electronic device 800; it may receive input numeric or character information and generate key signal inputs related to user settings and/or function controls of the electronic device. The output unit 807 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, a video/audio output terminal, a vibrator, and/or a printer. The storage unit 808 may include, but is not limited to, a magnetic disk or an optical disk. The communication unit 809 allows the electronic device 800 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers and/or chipsets, such as Bluetooth devices, WiFi devices, WiMax devices, cellular communication devices, and/or the like.

The computing unit 801 may be any of a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 801 executes the respective methods and processes described above. For example, in some embodiments, methods S10-S12 may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto the electronic device 800 via the ROM 802 and/or the communication unit 809. In some embodiments, the computing unit 801 may be configured to perform the methods S10-S12 by any other suitable means (e.g., by way of firmware).

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

As used in this disclosure, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Although the disclosed embodiments are disclosed above, the disclosure is not limited thereto. Various changes and modifications may be effected therein by one skilled in the art without departing from the spirit and scope of the disclosure, and it is intended that the scope of the disclosure be limited only by the claims appended hereto.

Claims

1. A method for training a question splitting model is characterized by comprising the following steps:

the step of obtaining the text line reference category comprises the following steps:

obtaining a text region reference category of each text region of the training topic picture to be split, wherein the text region comprises at least one text line;

acquiring the text line reference category of each text line in the text region according to each text region reference category of each text region;

2. The topic splitting model training method of claim 1, wherein the text line reference category comprises a question text line position category and the text region reference category comprises a question text region position category.

3. The topic splitting model training method of claim 2, wherein the question text line position categories comprise: title first line, title middle line, title end line, and individual title line.

4. The topic splitting model training method of claim 2, wherein the question text region location categories comprise: topic start, topic middle, topic end, individual topic.

5. The topic splitting model training method of claim 2, wherein the text line reference category further comprises a topic text line position category, and the text region reference category further comprises a topic text region position category.

6. The topic splitting model training method of claim 5, wherein the headline text line position category comprises: title first line, title middle line, title end line, and individual title line.

7. The title splitting model training method of claim 5, wherein the title text region position category comprises: title start, title middle, title end, individual title.

8. The topic splitting model training method of claim 2, wherein the text region reference category further comprises: a header and a footer.

9. The title splitting model training method according to any one of claims 1 to 8, wherein the step of training the title splitting model according to the text line picture features, the text line coordinate features, the text line content features and the text line reference categories to obtain the trained title splitting model comprises:

acquiring a text line prediction category of each text line of the training subject picture to be split by utilizing a subject splitting model according to the text line picture characteristic, the text line coordinate characteristic and the text line content characteristic;

and adjusting parameters of the question splitting model according to the loss of the text line prediction category and the text line reference category until the loss meets a loss threshold value, so as to obtain the trained question splitting model.

10. The title splitting model training method according to claim 9, wherein the step of obtaining the text line prediction category of each text line of the title picture to be split by using a title splitting model according to the text line picture features, the text line coordinate features, and the text line content features comprises:

11. A title splitting method is characterized by comprising the following steps:

obtaining a predicted topic picture to be split;

the topic splitting model trained by the topic splitting model training method according to any one of claims 1 to 10 is used to perform topic splitting on the predicted topic picture to be split, so as to obtain a text line splitting classification of each text line.

12. The title splitting method of claim 11, further comprising:

and according to the text line splitting categories, combining and connecting the text lines of the predicted topic picture to be split to obtain each topic after splitting.

13. A topic splitting model training device is characterized by comprising:

the training to-be-split question picture acquiring unit is suitable for acquiring a training to-be-split question picture, and text line reference categories are marked on all text lines of the training to-be-split question picture;

the step of acquiring the text line reference category may include:

a text line reference type obtaining unit, adapted to obtain the text line reference type of each text line in the text region according to each text region reference type of each text region;

the characteristic obtaining unit is suitable for obtaining the text line coordinate characteristics of the text line coordinates of the training to-be-split question, the text line content characteristics of the text line contents of the training to-be-split question and the text line picture characteristics corresponding to each text line according to the training to-be-split question picture;

and the question splitting model training unit is suitable for training the question splitting model according to the text line picture characteristics, the text line coordinate characteristics, the text line content characteristics and the text line reference categories to obtain the trained question splitting model.

14. A title splitting device, comprising:

a predicted to-be-split topic picture obtaining unit, adapted to obtain a predicted topic picture to be split;

the topic splitting unit is adapted to split the topic of the predicted topic picture to be split by using the topic splitting model obtained by training the topic splitting model training method according to any one of claims 1 to 10, so as to obtain the text line splitting classification of each text line.

15. The title splitting device of claim 14, further comprising:

16. An electronic device, comprising:

a processor, and a memory storing a program, wherein the program comprises instructions which, when executed by the processor, cause the processor to perform the topic splitting model training method according to any one of claims 1-10 or the topic splitting method according to any one of claims 11-12.

17. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the topic splitting model training method according to any one of claims 1-10 or the topic splitting method according to any one of claims 11-12.
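As an illustration only, and not as part of the claims, the combining step recited in claim 12 could be sketched as follows. The sketch assumes that the text lines arrive in reading order and that each line carries one of four position categories (first, middle, end, individual); the label strings and the function name `merge_lines` are hypothetical, and handling of unfinished groups is a design choice of this sketch rather than something stated in the source.

```python
from typing import List, Tuple

# Hypothetical label names for the four text line position categories.
FIRST, MIDDLE, END, SINGLE = "first", "middle", "end", "single"

def merge_lines(lines: List[Tuple[str, str]]) -> List[str]:
    """Group (text, category) pairs, given in reading order, into split questions."""
    questions: List[str] = []
    current: List[str] = []
    for text, category in lines:
        if category in (FIRST, SINGLE) and current:
            # A new question starts: flush any unfinished one first.
            questions.append("\n".join(current))
            current = []
        current.append(text)
        if category in (END, SINGLE):
            # The question is complete after an end line or an individual line.
            questions.append("\n".join(current))
            current = []
    if current:
        # Trailing lines without an explicit end line are kept as one question.
        questions.append("\n".join(current))
    return questions

example = [
    ("1. Compute 3 + 4.", SINGLE),
    ("2. A train leaves the station", FIRST),
    ("at 9 am and travels", MIDDLE),
    ("for two hours.", END),
]
split_questions = merge_lines(example)
```

Flushing an unfinished group when a new first line appears keeps a single misclassified line from merging two neighbouring questions into one.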