CN112667771A

CN112667771A - Answer sequence determination method and device

Info

Publication number: CN112667771A
Application number: CN202011529776.4A
Authority: CN
Inventors: 王德勋; 徐国强
Original assignee: OneConnect Financial Technology Co Ltd Shanghai
Current assignee: OneConnect Smart Technology Co Ltd; OneConnect Financial Technology Co Ltd Shanghai
Priority date: 2020-12-22
Filing date: 2020-12-22
Publication date: 2021-04-16
Also published as: WO2022134578A1

Abstract

The invention discloses a method and a device for determining an answer sequence, relates to the technical field of intelligent decision making, and mainly aims to solve the problem of low accuracy caused by an answer text box sequence containing irrelevant characters. The method comprises the following steps: acquiring a text box sequence, and storing the text box sequence in a root node S₀ of a binary tree storage structure; clustering the text box sequence to obtain a first subsequence and a second subsequence, and detecting whether an endpoint subsequence exists in the first subsequence and the second subsequence; if not, backtracking the binary tree storage structure to obtain an answer sequence; if so, saving the endpoint subsequence to the left child node S₁ of the root node S₀, saving the non-endpoint subsequence to the right child node S₂ of the root node S₀, and repeating the clustering and detection on the endpoint subsequence in the left child node S₁ until no endpoint subsequence exists, then backtracking the binary tree storage structure to obtain and output an answer sequence.

Description

Answer sequence determination method and device

Technical Field

The invention relates to the field of intelligent decision making, in particular to a method and a device for determining an answer sequence.

Background

Visual Question Answering (VQA) is a research topic that combines text detection, text recognition, and NLP reading comprehension. The process generally includes: an Optical Character Recognition (OCR) system detects and recognizes all text regions in a scanned document, all text boxes are sorted by position from left to right and from top to bottom, and a model outputs the answer to a question.

Currently, it is common to use a pre-trained reading comprehension model to output a start position and an end position, and to take the text sequence between the two positions as the answer to the question. However, the structure and layout of scanned documents in real scenes are very complicated, so the output answers easily contain irrelevant characters and the accuracy is low.

Disclosure of Invention

In view of this, the present invention provides a method and an apparatus for determining an answer sequence, and mainly aims to solve the problem that the output answers to a question easily contain irrelevant characters and have a low accuracy due to a complex structure and layout of a scanned document in a real scene.

According to an aspect of the present invention, there is provided a method for determining a sequence of answers, including:

acquiring a text box sequence, and storing the text box sequence in a root node S₀ of a binary tree storage structure;

clustering the text box sequence to obtain a first subsequence and a second subsequence;

detecting whether an endpoint subsequence exists in the first subsequence and the second subsequence, wherein the endpoint subsequence is a subsequence simultaneously comprising a first text box and a second text box;

if not, backtracking and merging the binary tree storage structure to obtain an answer sequence;

if so, saving the endpoint subsequence to the left child node S₁ of the root node S₀, saving the non-endpoint subsequence to the right child node S₂ of the root node S₀, and repeatedly executing the clustering and detecting steps on the endpoint subsequence in the left child node S₁ until no endpoint subsequence exists, then backtracking and merging the binary tree storage structure to obtain an answer sequence;

and outputting the answer sequence.

Further, before the clustering and detecting steps are repeatedly performed on the endpoint subsequence in the left child node S₁, the method further comprises:

clustering the non-endpoint subsequence in the right child node S₂ to obtain a third subsequence and a fourth subsequence;

respectively calculating the minimum horizontal distance between the third subsequence and the endpoint subsequence in the left child node S₁, and between the fourth subsequence and the endpoint subsequence in the left child node S₁;

if the minimum horizontal distance is not larger than a preset distance threshold, merging the corresponding third subsequence or fourth subsequence into the left child node S₁.

Further, the performing backtracking merging processing on the binary tree storage structure includes:

finding the corresponding parent node Sᵢ₊₁ according to the left child node S₂ᵢ₊₁ at the bottommost layer of the binary tree storage structure;

calculating the minimum horizontal distance between the parent node Sᵢ₊₁ and the sibling node Sᵢ₊₂ of the parent node Sᵢ₊₁;

judging whether the minimum horizontal distance between the parent node Sᵢ₊₁ and its sibling node Sᵢ₊₂ is not greater than a preset distance threshold;

if not, stopping backtracking and determining the subsequence in the left child node S₂ᵢ₊₁ as the answer sequence;

if yes, continuing to backtrack to the upper nodes of the binary tree storage structure.

Further, the calculating of the minimum horizontal distance between the parent node Sᵢ₊₁ and the sibling node Sᵢ₊₂ of the parent node comprises:

obtaining the minimum and maximum x-coordinates (A1, A2) of the parent node Sᵢ₊₁;

obtaining the minimum and maximum x-coordinates (B1, B2) of the sibling node Sᵢ₊₂ of the parent node;

calculating the minimum horizontal distance between the parent node Sᵢ₊₁ and the sibling node Sᵢ₊₂ according to a preset minimum horizontal distance formula, the minimum horizontal distance formula comprising:

D = max(A2, B2) - min(A1, B1) - (B2 - B1) - (A2 - A1)

wherein D is the minimum horizontal distance between the parent node Sᵢ₊₁ and the sibling node Sᵢ₊₂ of the parent node, A1 is the minimum x-coordinate of the parent node Sᵢ₊₁, A2 is the maximum x-coordinate of the parent node Sᵢ₊₁, B1 is the minimum x-coordinate of the sibling node Sᵢ₊₂, and B2 is the maximum x-coordinate of the sibling node Sᵢ₊₂.
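As an illustration only, the minimum horizontal distance formula can be written as a small Python function. The function name and the sample coordinates are assumptions made for this sketch and do not come from the specification. For two x-ranges that do not overlap, D equals the gap between them; for overlapping ranges, D becomes negative.

```python
def min_horizontal_distance(a1: float, a2: float, b1: float, b2: float) -> float:
    """D = max(A2, B2) - min(A1, B1) - (B2 - B1) - (A2 - A1).

    (a1, a2) and (b1, b2) are the minimum and maximum x-coordinates of two nodes.
    """
    return max(a2, b2) - min(a1, b1) - (b2 - b1) - (a2 - a1)


# Illustrative values (not from the specification).
gap = min_horizontal_distance(0, 10, 25, 40)      # ranges [0, 10] and [25, 40]
overlap = min_horizontal_distance(0, 10, 5, 20)   # ranges [0, 10] and [5, 20]
print(gap)      # 15: the horizontal gap between the two ranges
print(overlap)  # -5: the ranges overlap by 5 units
```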

Further, the clustering the text box sequence to obtain a first subsequence and a second subsequence includes:

and performing k-means clustering processing on the text box sequence to obtain a first subsequence and a second subsequence.

Further, the performing k-means clustering processing on the text box sequence to obtain a first subsequence and a second subsequence includes:

randomly extracting 2 text boxes in the text box sequence as a first centroid and a second centroid;

respectively calculating Euclidean distances between the rest text boxes in the text box sequence and the first centroid and the second centroid;

and dividing the text boxes with the Euclidean distance from the first centroid larger than that from the second centroid into a first subsequence, and dividing the text boxes with the Euclidean distance from the second centroid larger than that from the first centroid into a second subsequence.

Further, before the text box sequence is acquired and stored in the root node S₀ of the binary tree storage structure, the method further comprises:

detecting and identifying the obtained scanned document by using an optical character recognition system to obtain a text box cluster;

arranging the text box clusters according to a preset sequence;

processing the arranged text box cluster by using a pre-trained reading comprehension model to obtain a first text box and a second text box;

and determining a text box cluster between the first text box and the second text box as an output text box sequence.

According to another aspect of the present invention, there is provided an answer sequence determination apparatus including:

an obtaining unit, configured to obtain a text box sequence, and store the text box sequence in a root node S₀ of a binary tree storage structure;

the processing unit is used for clustering the text box sequence to obtain a first subsequence and a second subsequence, and detecting whether an endpoint subsequence exists in the first subsequence and the second subsequence, wherein the endpoint subsequence is a subsequence which simultaneously comprises the first text box and the second text box;

the backtracking unit is used for backtracking and merging the binary tree storage structure if no endpoint subsequence exists, so as to obtain an answer sequence;

a merging unit, configured to, if an endpoint subsequence exists, save the endpoint subsequence to the left child node S₁ of the root node S₀, save the non-endpoint subsequence to the right child node S₂ of the root node S₀, and repeatedly perform the clustering and detecting steps on the endpoint subsequence in the left child node S₁ until no endpoint subsequence exists, then backtrack and merge the binary tree storage structure to obtain an answer sequence;

and the output unit is used for outputting the answer sequence.

Further, the apparatus further comprises: a calculating unit, a judging unit,

the processing unit is specifically further configured to cluster the non-endpoint subsequence in the right child node S₂ to obtain a third subsequence and a fourth subsequence;

the calculating unit is specifically configured to respectively calculate the minimum horizontal distance between the third subsequence and the endpoint subsequence in the left child node S₁, and between the fourth subsequence and the endpoint subsequence in the left child node S₁;

the judging unit is configured to merge the corresponding third subsequence or fourth subsequence into the left child node S₁ if the minimum horizontal distance is not greater than a preset distance threshold.

Further, the backtracking unit includes:

a searching module for finding the corresponding parent node Sᵢ₊₁ according to the left child node S₂ᵢ₊₁ at the bottommost layer of the binary tree storage structure;

a first calculation module for calculating the minimum horizontal distance between the parent node Sᵢ₊₁ and the sibling node Sᵢ₊₂ of the parent node Sᵢ₊₁;

a judging module for judging whether the minimum horizontal distance between the parent node Sᵢ₊₁ and its sibling node Sᵢ₊₂ is not greater than a preset distance threshold;

a determining module, configured to stop backtracking and determine the subsequence in the left child node S₂ᵢ₊₁ as the answer sequence if not, and to continue backtracking to the upper nodes of the binary tree storage structure if yes.

Further, the first calculation module is specifically configured to obtain the minimum and maximum x-coordinates (A1, A2) of the parent node Sᵢ₊₁; obtain the minimum and maximum x-coordinates (B1, B2) of the sibling node Sᵢ₊₂ of the parent node; and calculate the minimum horizontal distance between the parent node Sᵢ₊₁ and the sibling node Sᵢ₊₂ according to a preset minimum horizontal distance formula, the minimum horizontal distance formula comprising:

D = max(A2, B2) - min(A1, B1) - (B2 - B1) - (A2 - A1)

Further, the processing unit is specifically configured to perform k-means clustering on the text box sequence to obtain a first subsequence and a second subsequence.

Further, the processing unit comprises:

the extraction module is used for randomly extracting 2 text boxes in the text box sequence to serve as a first centroid and a second centroid;

the second calculation module is used for calculating Euclidean distances between the rest text boxes in the text box sequence and the first centroid and the second centroid respectively;

and the dividing module is used for dividing the text boxes whose Euclidean distance from the first centroid is larger than that from the second centroid into a first subsequence, and dividing the text boxes whose Euclidean distance from the second centroid is larger than that from the first centroid into a second subsequence.

Further, the apparatus further comprises:

the recognition unit is used for detecting and recognizing the obtained scanning document by using an optical character recognition system to obtain a text box cluster;

the arranging unit is used for arranging the text box clusters according to a preset sequence;

the training unit is used for processing the arranged text box cluster by utilizing a pre-trained reading comprehension model to obtain a first text box and a second text box;

and the determining unit is used for determining the text box cluster between the first text box and the second text box as the output text box sequence.

According to still another aspect of the present invention, there is provided a storage medium having at least one executable instruction stored therein, the executable instruction causing a processor to perform operations corresponding to the determination method of the answer sequence as described above.

According to still another aspect of the present invention, there is provided a computer apparatus including: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;

the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the determination method of the answer sequence.

Compared with the prior art, the method and the device for determining the answer sequence provided by the invention acquire a text box sequence and store the text box sequence in a root node S₀ of a binary tree storage structure; cluster the text box sequence to obtain a first subsequence and a second subsequence; detect whether an endpoint subsequence exists in the first subsequence and the second subsequence, wherein the endpoint subsequence is a subsequence simultaneously comprising a first text box and a second text box; if not, backtrack and merge the binary tree storage structure to obtain an answer sequence; if so, save the endpoint subsequence to the left child node S₁ of the root node S₀, save the non-endpoint subsequence to the right child node S₂ of the root node S₀, and repeatedly execute the clustering and detecting steps on the endpoint subsequence in the left child node S₁ until no endpoint subsequence exists, then backtrack and merge the binary tree storage structure to obtain an answer sequence; and output the answer sequence. Thereby, irrelevant answers in the answer sequence can be automatically deleted, and the accuracy of the output answer sequence is improved.

The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.

Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:

FIG. 1 is a flow chart of a method for determining answer sequences according to an embodiment of the present invention;

FIG. 2 is a block diagram of an answer sequence determining apparatus according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a computer device according to an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

An embodiment of the present invention provides a method for determining an answer sequence, as shown in fig. 1, where the method includes:

101. Acquiring a text box sequence, and storing the text box sequence in a root node S₀ of a binary tree storage structure.

The invention can be applied in the context of visual question answering. Visual Question Answering (VQA) is an emerging field that requires understanding both text and vision: given input image information, a visual question answering system uses a deep learning model to automatically answer questions about the image, such as: What is in the image? What sport is in progress? Who is kicking the ball? How many players are in the image? Who are the participants? Is it raining? For such questions, the system may produce answers such as: 11 players are playing ball; the participants are Buck, Langzo, Boll, Krie, Thompson, Deck, Novogue, Mark, Gaussel, Kaiwen and Leff; and it is raining. The data obtained by this analysis is determined as the acquired text box sequence. For the embodiment of the present invention, after the text box sequence is acquired, it may be saved to the root node S₀ of the binary tree storage structure, thereby obtaining a binary tree storage structure that stores the current text box sequence.

102. And clustering the text box sequence to obtain a first subsequence and a second subsequence, and detecting whether an endpoint subsequence exists in the first subsequence and the second subsequence.

Clustering is a method for automatically dividing unlabeled data into several classes; it is an unsupervised learning method that places data with similar characteristics in the same class. Specifically, the first subsequence and the second subsequence are the two subsequences into which the initial text box sequence is split, and together the text boxes in the two subsequences make up all text boxes in the initial text box sequence.

In addition, the endpoint subsequence is a subsequence including both a first text box and a second text box. The first text box may be the starting text box of the initial text box sequence, and the second text box may be the ending text box of the initial text box sequence. For example, if the reading comprehension model outputs the answer information { I, am, a, boy }, then { I } may be the starting text box and { boy } may be the ending text box. In the process of acquiring the text box sequence, the reading comprehension model may directly output the order information of the starting text box and the ending text box, and whether a subsequence contains both the first text box and the second text box may be determined by traversing all text boxes in the first subsequence and all text boxes in the second subsequence.
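The endpoint check described above might be sketched in Python as follows. Representing each text box by an integer identifier, and the function name `find_endpoint_subsequence`, are assumptions made for this sketch; the specification does not prescribe a data layout.

```python
from typing import List, Optional


def find_endpoint_subsequence(
    first: List[int], second: List[int], start_id: int, end_id: int
) -> Optional[List[int]]:
    """Return the subsequence that contains both the starting and the ending
    text box, or None when the two boxes were split across the subsequences."""
    for candidate in (first, second):
        ids = set(candidate)  # traverse all text boxes of the candidate
        if start_id in ids and end_id in ids:
            return candidate
    return None


# Hypothetical identifiers: box 0 is the starting box, box 3 the ending box.
print(find_endpoint_subsequence([0, 1, 3], [2, 4], 0, 3))  # [0, 1, 3]
print(find_endpoint_subsequence([0, 1], [2, 3], 0, 3))     # None
```

Returning `None` corresponds to the "if not" branch of step 103, and returning a subsequence corresponds to the "if so" branch of step 104.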

103. If not, backtracking and merging the binary tree storage structure to obtain an answer sequence.

For the embodiment of the present invention, backtracking and merging may be performed on the binary tree storage structure by calculating the minimum horizontal distance between nodes, so as to determine whether the current node is the final result node. The specific backtracking process may include: a. taking the node at the bottommost layer as the current node, and detecting whether the minimum horizontal distance between the parent node of the current node and the sibling node of that parent node is not more than a preset distance threshold; b. if yes, taking the parent node of the current node as the current node and executing step a again; when the minimum horizontal distance is larger than the preset distance threshold, determining the text box sequence stored in the current node as the answer sequence.
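A minimal Python sketch of this backtracking loop is given below. The `Node` class, the representation of a text box as an `(x_min, x_max)` pair, and the decision to stop when the parent has no sibling (that is, when the parent is the root) are assumptions made for the sketch; the specification does not state what happens at the root.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box = Tuple[float, float]  # (x_min, x_max) of one text box; assumed layout


@dataclass(eq=False)
class Node:
    boxes: List[Box]
    parent: Optional["Node"] = field(default=None, repr=False)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def sibling(node: Node) -> Optional[Node]:
    if node.parent is None:
        return None
    return node.parent.right if node.parent.left is node else node.parent.left


def node_distance(a: Node, b: Node) -> float:
    a1, a2 = min(x for x, _ in a.boxes), max(x for _, x in a.boxes)
    b1, b2 = min(x for x, _ in b.boxes), max(x for _, x in b.boxes)
    return max(a2, b2) - min(a1, b1) - (b2 - b1) - (a2 - a1)


def backtrack(bottom_left: Node, threshold: float) -> List[Box]:
    current = bottom_left
    while True:
        parent = current.parent
        parent_sibling = sibling(parent) if parent is not None else None
        if parent is None or parent_sibling is None:
            return current.boxes  # assumption: nothing left to compare against
        if node_distance(parent, parent_sibling) > threshold:
            return current.boxes  # step b: distance exceeds the threshold
        current = parent          # step b: move up and repeat step a


# Illustrative tree (made-up coordinates): S0 -> (S1, S2), S1 -> (S3, S4).
s0 = Node([(0, 10), (12, 20), (100, 120)])
s1, s2 = Node([(0, 10), (12, 20)], parent=s0), Node([(100, 120)], parent=s0)
s0.left, s0.right = s1, s2
s3, s4 = Node([(0, 10)], parent=s1), Node([(12, 20)], parent=s1)
s1.left, s1.right = s3, s4

print(backtrack(s3, threshold=30))   # [(0, 10)]: D(S1, S2) = 80 > 30
print(backtrack(s3, threshold=100))  # [(0, 10), (12, 20)]: moved up to S1
```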

104. If so, saving the endpoint subsequence to the left child node S₁ of the root node S₀, saving the non-endpoint subsequence to the right child node S₂ of the root node S₀, and repeatedly executing the clustering and detecting steps on the endpoint subsequence in the left child node S₁ until no endpoint subsequence exists, then backtracking and merging the binary tree storage structure to obtain an answer sequence.

In the embodiment of the invention, by default the left child node of the binary tree storage structure is used as the storage space for the final output answer and the right child node is used as the storage space for the discarded subsequence; in an actual application scenario, this may be set according to habits and business requirements, and the invention does not specifically limit it. Specifically, the left child node of the root node may be denoted as S₁ and the right child node of the root node may be denoted as S₂. When an endpoint subsequence exists in the first subsequence and the second subsequence, the endpoint subsequence is clustered again, and whether either of the two new subsequences obtained by clustering is an endpoint subsequence is detected; if yes, that endpoint subsequence is clustered again, and the process is repeated until no endpoint subsequence exists. When no endpoint subsequence exists, backtracking and merging are performed on the binary tree storage structure to obtain an answer sequence; the specific backtracking process is the same as in step 103 and is not described herein again.
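The repeated split-and-detect loop of steps 102 to 104 could be sketched in Python as follows. The `Node` class, the `(box id, x position)` layout, the guard against an empty split, and the deterministic `split_at_largest_gap` helper (used here in place of k-means so that the example is reproducible) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, float]  # (box id, x position); assumed minimal layout
Split = Callable[[List[Box]], Tuple[List[Box], List[Box]]]


@dataclass(eq=False)
class Node:
    boxes: List[Box]
    parent: Optional["Node"] = field(default=None, repr=False)
    left: Optional["Node"] = None   # endpoint subsequence (kept)
    right: Optional["Node"] = None  # non-endpoint subsequence (discarded)


def contains_both(boxes: List[Box], start_id: int, end_id: int) -> bool:
    ids = {box_id for box_id, _ in boxes}
    return start_id in ids and end_id in ids


def build_tree(boxes: List[Box], start_id: int, end_id: int, split: Split):
    """Return (root, deepest left node) after repeated splitting."""
    root = Node(boxes)
    current = root
    while True:
        first, second = split(current.boxes)
        if not first or not second:
            break  # guard (assumption): a degenerate split makes no progress
        if contains_both(first, start_id, end_id):
            endpoint, other = first, second
        elif contains_both(second, start_id, end_id):
            endpoint, other = second, first
        else:
            break  # no endpoint subsequence: backtracking would start here
        current.left = Node(endpoint, parent=current)
        current.right = Node(other, parent=current)
        current = current.left
    return root, current


def split_at_largest_gap(boxes: List[Box]) -> Tuple[List[Box], List[Box]]:
    """Hypothetical stand-in for the clustering step."""
    ordered = sorted(boxes, key=lambda b: b[1])
    if len(ordered) < 2:
        return ordered, []
    cut = max(range(1, len(ordered)),
              key=lambda i: ordered[i][1] - ordered[i - 1][1])
    return ordered[:cut], ordered[cut:]


# Made-up data: boxes 0..2 lie close together, boxes 3..5 lie far to the right.
boxes = [(0, 0.0), (1, 1.0), (2, 2.0), (3, 10.0), (4, 11.0), (5, 12.0)]
root, deepest = build_tree(boxes, start_id=0, end_id=2, split=split_at_largest_gap)
print([box_id for box_id, _ in deepest.boxes])     # [0, 1, 2]
print([box_id for box_id, _ in root.right.boxes])  # [3, 4, 5]
```

In this run the first split keeps boxes 0 to 2 in S₁ and discards boxes 3 to 5 into S₂; the second split separates box 0 from box 2, so no endpoint subsequence remains and the loop stops.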

105. And outputting the answer sequence.

Specifically, after the answer sequence is obtained, the answer sequence may be output, and in an actual application scenario, the answer sequence may be displayed on a display screen, so that an actual problem can be solved by using the answer sequence.

In the embodiment of the invention, before the text box sequence is acquired and stored in the root node S₀ of the binary tree storage structure, the method further comprises: detecting and identifying the obtained scanned document by using an optical character recognition system to obtain a text box cluster; arranging the text box clusters according to a preset sequence; processing the arranged text box cluster by using a pre-trained reading comprehension model to obtain a first text box and a second text box; and determining the text box cluster between the first text box and the second text box as the output text box sequence.

In the embodiment of the invention, an electronic version of the document can be obtained by scanning, and the scanned document is then detected and identified by the optical character recognition system to obtain the text box cluster; the text box cluster may be the data set of text boxes obtained after detection and identification by the optical character recognition system. For example, each text box in the text box cluster { I, am, a, boy } has a position parameter; the position parameter of { I } may be 1 and the position parameter of { am } may be 2, and the text boxes in the text box cluster may be arranged according to these position parameters. The arranged text box cluster is input into a pre-trained reading comprehension model to obtain a first text box and a second text box, namely the starting text box and the ending text box of the answer sequence, and the text box cluster between the starting text box and the ending text box can be determined as the text box sequence to be output.
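The arrangement and slicing described above might look like the following Python sketch. The dictionary layout, the function names, and the position parameters 3 and 4 for { a } and { boy } are assumptions made for illustration; the reading comprehension model itself is outside the sketch, so the start and end indices are passed in directly.

```python
from typing import Dict, List, Union

TextBox = Dict[str, Union[str, int]]  # assumed layout: {"text": ..., "pos": ...}


def arrange(boxes: List[TextBox]) -> List[TextBox]:
    """Arrange the text box cluster by position parameter."""
    return sorted(boxes, key=lambda box: box["pos"])


def candidate_sequence(arranged: List[TextBox], start: int, end: int) -> List[TextBox]:
    """Keep the text boxes from the starting box to the ending box, inclusive."""
    if start > end:
        raise ValueError("the starting text box must not come after the ending one")
    return arranged[start:end + 1]


# OCR output in arbitrary order; positions 1 and 2 follow the example in the text.
cluster = [
    {"text": "boy", "pos": 4},
    {"text": "I", "pos": 1},
    {"text": "a", "pos": 3},
    {"text": "am", "pos": 2},
]
arranged = arrange(cluster)
sequence = candidate_sequence(arranged, start=0, end=3)
print([box["text"] for box in sequence])  # ['I', 'am', 'a', 'boy']
```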

For further limitation and description, the clustering the text box sequence to obtain a first subsequence and a second subsequence includes: and performing k-means clustering processing on the text box sequence to obtain a first subsequence and a second subsequence.

The specific process may include: randomly extracting 2 text boxes in the text box sequence as a first centroid and a second centroid; respectively calculating Euclidean distances between the rest text boxes in the text box sequence and the first centroid and the second centroid; and dividing the text boxes with the Euclidean distance from the first centroid larger than that from the second centroid into a first subsequence, and dividing the text boxes with the Euclidean distance from the second centroid larger than that from the first centroid into a second subsequence.

The Euclidean distance is the most common distance metric. It measures the absolute distance between two points in a multidimensional space, i.e. the true distance between two points in an m-dimensional space, or the natural length of a vector. The Euclidean distance in two and three dimensions is the actual distance between two points. The specific calculation formula is as follows:

d(x, y) = sqrt((x₁ - y₁)² + (x₂ - y₂)² + ... + (xₘ - yₘ)²)

wherein xᵢ and yᵢ respectively represent the i-th coordinates of the two points; for text boxes in a plane, these are the horizontal and vertical coordinates.

According to the embodiment of the invention, the text box sequence is clustered into two subsequences by carrying out k-means clustering processing on the text box sequence, so that the useful text box sequence and the waste text box sequence are respectively stored by utilizing the two subsequences subsequently, thereby deleting irrelevant answers and improving the accuracy of the output answer sequence.
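A single assignment pass of the two-centroid split described above could be sketched in Python as follows. Representing each text box by an `(x, y)` center point, the function name, and the seeded random generator are assumptions made for this sketch. The sketch assigns each box to the nearer centroid; this yields the same two groups as the wording above, with only the labels "first" and "second" exchanged.

```python
import math
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # assumed: (x, y) center of a text box


def two_way_split(
    points: List[Point], rng: Optional[random.Random] = None
) -> Tuple[List[int], List[int]]:
    """Split box indices into two groups around two randomly chosen boxes."""
    if len(points) < 2:
        raise ValueError("at least two text boxes are needed")
    rng = rng or random.Random()
    c1, c2 = rng.sample(range(len(points)), 2)  # two distinct boxes as centroids
    first, second = [], []
    for index, point in enumerate(points):
        d1 = math.dist(point, points[c1])  # Euclidean distance to centroid 1
        d2 = math.dist(point, points[c2])  # Euclidean distance to centroid 2
        (first if d1 <= d2 else second).append(index)
    return first, second


# Made-up coordinates: three boxes near the origin, two boxes far away.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (100.0, 100.0), (101.0, 100.0)]
first, second = two_way_split(points, random.Random(0))
print(first, second)
```

Because the centroids are drawn at random, the resulting groups depend on the draw; a full k-means implementation would additionally recompute the centroids and iterate until the assignment stabilizes.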

For the embodiment of the present invention, the performing backtracking merging processing on the binary tree storage structure includes: finding the corresponding parent node Sᵢ₊₁ according to the left child node S₂ᵢ₊₁ at the bottommost layer of the binary tree storage structure; calculating the minimum horizontal distance between the parent node Sᵢ₊₁ and the sibling node Sᵢ₊₂ of the parent node Sᵢ₊₁; judging whether the minimum horizontal distance between the parent node Sᵢ₊₁ and its sibling node Sᵢ₊₂ is not greater than a preset distance threshold; if not, stopping backtracking and determining the subsequence in the left child node S₂ᵢ₊₁ as the answer sequence; if yes, continuing to backtrack to the upper nodes of the binary tree storage structure.

In the embodiment of the invention, calculating the minimum horizontal distance between the parent node Sᵢ₊₁ and the sibling node Sᵢ₊₂ of the parent node Sᵢ₊₁ may specifically include: obtaining the minimum and maximum x-coordinates (A1, A2) of the parent node Sᵢ₊₁; obtaining the minimum and maximum x-coordinates (B1, B2) of the sibling node Sᵢ₊₂ of the parent node; and calculating the minimum horizontal distance between the parent node Sᵢ₊₁ and the sibling node Sᵢ₊₂ according to a preset minimum horizontal distance formula, the minimum horizontal distance formula comprising:

D = max(A2, B2) - min(A1, B1) - (B2 - B1) - (A2 - A1)

For the embodiment of the present invention, before the clustering and detecting steps are repeatedly performed on the endpoint subsequence in the left child node S₁, the method further comprises: clustering the non-endpoint subsequence in the right child node S₂ to obtain a third subsequence and a fourth subsequence; respectively calculating the minimum horizontal distance between the third subsequence and the endpoint subsequence in the left child node S₁, and between the fourth subsequence and the endpoint subsequence in the left child node S₁; and if the minimum horizontal distance is not larger than a preset distance threshold, merging the corresponding third subsequence or fourth subsequence into the left child node S₁.

The third subsequence and the fourth subsequence may be the two subsequences obtained by clustering the non-endpoint subsequence in the right child node S₂. In this embodiment, the minimum horizontal distance between the third subsequence or the fourth subsequence and the endpoint subsequence in the left child node S₁ is calculated with the same minimum horizontal distance formula as described above, which is not described herein again. The preset distance threshold may be a preset distance parameter τ, which in an actual application scenario may generally be set to 30 or 40. If the minimum horizontal distance is not greater than the preset distance threshold, the corresponding third subsequence or fourth subsequence is merged into the left child node S₁, which reduces the mistaken deletion of text boxes that belong to the answer and improves the accuracy of the final answer sequence.
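The merge decision for the third and fourth subsequences might be sketched in Python as follows. The `(x_min, x_max)` box layout, the function names, and the sample coordinates are assumptions made for illustration; τ = 30 is one of the values mentioned above.

```python
from typing import List, Tuple

Box = Tuple[float, float]  # (x_min, x_max) of one text box; assumed layout


def x_range(boxes: List[Box]) -> Tuple[float, float]:
    return min(lo for lo, _ in boxes), max(hi for _, hi in boxes)


def min_horizontal_distance(a: List[Box], b: List[Box]) -> float:
    a1, a2 = x_range(a)
    b1, b2 = x_range(b)
    return max(a2, b2) - min(a1, b1) - (b2 - b1) - (a2 - a1)


def merge_close_subsequences(
    endpoint: List[Box], third: List[Box], fourth: List[Box], tau: float = 30.0
) -> List[Box]:
    """Return the content of S1 after merging in every candidate subsequence
    whose minimum horizontal distance to the endpoint subsequence is <= tau."""
    merged = list(endpoint)
    for candidate in (third, fourth):
        if candidate and min_horizontal_distance(candidate, endpoint) <= tau:
            merged.extend(candidate)
    return merged


# Made-up coordinates: `third` lies 5 units right of S1, `fourth` 180 units away.
endpoint = [(0.0, 10.0), (12.0, 20.0)]
third = [(25.0, 30.0)]
fourth = [(200.0, 220.0)]
print(merge_close_subsequences(endpoint, third, fourth, tau=30.0))
# [(0.0, 10.0), (12.0, 20.0), (25.0, 30.0)]
```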

The invention provides a method for determining an answer sequence, which can acquire a text box sequence and store the text box sequence in a root node S₀ of a binary tree storage structure; cluster the text box sequence to obtain a first subsequence and a second subsequence; detect whether an endpoint subsequence exists in the first subsequence and the second subsequence, wherein the endpoint subsequence is a subsequence simultaneously comprising a first text box and a second text box; if not, backtrack and merge the binary tree storage structure to obtain an answer sequence; if so, save the endpoint subsequence to the left child node S₁ of the root node S₀, save the non-endpoint subsequence to the right child node S₂ of the root node S₀, and repeatedly execute the clustering and detecting steps on the endpoint subsequence in the left child node S₁ until no endpoint subsequence exists, then backtrack and merge the binary tree storage structure to obtain an answer sequence; and output the answer sequence. Therefore, the technical problem that the structure and layout of scanned documents in real scenes are very complex, so that the output answers easily contain irrelevant characters and the accuracy is low, can be solved, and the accuracy of the answers is improved.

Further, as an implementation of the method shown in fig. 1, an embodiment of the present invention provides an answer sequence determining apparatus, as shown in fig. 2, the apparatus includes:

an obtaining unit 21, configured to obtain a text box sequence, and store the text box sequence in a root node S₀ of a binary tree storage structure;

the processing unit 22 is configured to perform clustering processing on the text box sequence to obtain a first subsequence and a second subsequence, and detect whether an endpoint subsequence exists in the first subsequence and the second subsequence, where the endpoint subsequence is a subsequence that includes both the first text box and the second text box;

a backtracking unit 23, configured to perform backtracking merging processing on the binary tree storage structure if no endpoint subsequence exists, to obtain an answer sequence;

a merging unit 24, configured to, if an endpoint subsequence exists, save the endpoint subsequence to the left child node S₁ of the root node S₀, save the non-endpoint subsequence to the right child node S₂ of the root node S₀, and repeatedly execute the clustering and detecting steps on the endpoint subsequence in the left child node S₁ until no endpoint subsequence exists, then backtrack and merge the binary tree storage structure to obtain an answer sequence;

and an output unit 25, configured to output the answer sequence.

Further, the apparatus further comprises: a calculating unit, a judging unit,

the calculating unit is specifically configured to calculate the minimum horizontal distance between the third subsequence and the endpoint subsequence in the left child node S₁, and the minimum horizontal distance between the fourth subsequence and that endpoint subsequence;

Further, the backtracking unit includes:

D = max(A2, B2) - min(A1, B1) - (B2 - B1) - (A2 - A1)

wherein D is the minimum horizontal distance between the parent node S_i+1 and the sibling node S_i+2 of the parent node, A1 is the minimum x coordinate of the parent node S_i+1, A2 is the maximum x coordinate of the parent node S_i+1, B1 is the minimum x coordinate of the sibling node S_i+2, and B2 is the maximum x coordinate of the sibling node S_i+2.
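As a worked illustration of the formula, the short sketch below evaluates D for two horizontal extents. The function names and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for the example only. D subtracts both interval widths from the total span, so it equals the gap between the two extents when they are disjoint, zero when they touch, and a negative value when they overlap.

```python
from typing import Iterable, Tuple

# Hypothetical box layout for this example: (x_min, y_min, x_max, y_max).
Rect = Tuple[float, float, float, float]


def x_extent(boxes: Iterable[Rect]) -> Tuple[float, float]:
    """Return the minimum and maximum x coordinates covered by a node's boxes."""
    boxes = list(boxes)
    return min(b[0] for b in boxes), max(b[2] for b in boxes)


def min_horizontal_distance(a1: float, a2: float, b1: float, b2: float) -> float:
    """D = max(A2, B2) - min(A1, B1) - (B2 - B1) - (A2 - A1)."""
    return max(a2, b2) - min(a1, b1) - (b2 - b1) - (a2 - a1)


# Parent node spans x in [0, 10]; its sibling spans x in [15, 20].
parent = [(0, 0, 4, 2), (5, 0, 10, 2)]
sibling = [(15, 0, 20, 2)]
a1, a2 = x_extent(parent)
b1, b2 = x_extent(sibling)
gap = min_horizontal_distance(a1, a2, b1, b2)  # 20 - 0 - 5 - 10 = 5
```

For overlapping extents such as [0, 10] and [5, 20] the same expression gives 20 - 0 - 15 - 10 = -5, so the sign of D distinguishes separated nodes from overlapping ones.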

Further, the processing unit comprises:

Further, the apparatus further comprises:

According to an embodiment of the present invention, a storage medium is provided. The storage medium stores at least one executable instruction, and the executable instruction can execute the answer sequence determination method of any of the above method embodiments.

Fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the computer device.

As shown in fig. 3, the computer device may include: a processor 302, a communication interface 304, a memory 306, and a communication bus 308.

Wherein: the processor 302, communication interface 304, and memory 306 communicate with each other via a communication bus 308.

The communication interface 304 is configured to communicate with network elements of other devices, such as clients or other servers.

The processor 302 is configured to execute the program 310, and may specifically execute the relevant steps in the above-described answer sequence determination method embodiment.

In particular, program 310 may include program code comprising computer operating instructions.

The processor 302 may be a central processing unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement an embodiment of the present invention. The computer device includes one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.

The memory 306 is configured to store the program 310. The memory 306 may comprise high-speed RAM and may also include non-volatile memory, such as at least one disk memory.

The program 310 may specifically be configured to cause the processor 302 to perform the following operations:

clustering the text box sequence to obtain a first subsequence and a second subsequence, and detecting whether an endpoint subsequence exists in the first subsequence and the second subsequence, wherein the endpoint subsequence is a subsequence which simultaneously comprises a first text box and a second text box;

and outputting the answer sequence.
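The clustering and detection operation can be sketched with a two-cluster k-means, which the claims name as the clustering method. The sketch below is a minimal pure-Python version under several assumptions: each text box is reduced to a hypothetical `(x, y)` center point, the initial centroids are taken from the first and last boxes in the sequence, and the function names are illustrative rather than taken from the source.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical feature: the (x, y) center of each text box.
Point = Tuple[float, float]


def _dist2(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def _mean(points: Sequence[Point]) -> Point:
    """Centroid of a non-empty list of points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def two_means(
    points: Sequence[Point], max_iter: int = 50
) -> Tuple[List[Point], List[Point]]:
    """Split points into two subsequences with k-means (k = 2)."""
    if len(points) < 2:
        raise ValueError("need at least two text boxes to cluster")
    # Assumed initialization: first and last box of the ordered sequence.
    centroids = [points[0], points[-1]]
    labels: List[int] = []
    for _ in range(max_iter):
        new_labels = [
            0 if _dist2(p, centroids[0]) <= _dist2(p, centroids[1]) else 1
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = _mean(members)
    first = [p for p, lab in zip(points, labels) if lab == 0]
    second = [p for p, lab in zip(points, labels) if lab == 1]
    return first, second


def find_endpoint_subsequence(
    sub_a: Sequence[Point],
    sub_b: Sequence[Point],
    first_box: Point,
    second_box: Point,
) -> Optional[Sequence[Point]]:
    """Return the subsequence holding both boxes, or None if they were split."""
    for cand in (sub_a, sub_b):
        if first_box in cand and second_box in cand:
            return cand
    return None
```

For six boxes centered at x = 0, 1, 2 and x = 10, 11, 12 on one line, `two_means` separates the two groups; `find_endpoint_subsequence` then returns the left group when the first and second text boxes are both in it, and `None` when the two boxes fall into different clusters, which is the condition that ends the repetition.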

Through the above technical solution, a text box sequence is acquired and saved to the root node S₀ of a binary tree storage structure; the text box sequence is clustered to obtain a first subsequence and a second subsequence; and it is detected whether an endpoint subsequence exists among the first subsequence and the second subsequence, the endpoint subsequence being a subsequence that contains both a first text box and a second text box. If no endpoint subsequence exists, backtracking merging is performed on the binary tree storage structure to obtain the answer sequence. If one exists, the endpoint subsequence is saved to the left child node S₁ of the root node S₀, the non-endpoint subsequence is saved to the right child node S₂ of the root node S₀, and the clustering and detecting steps are repeated on the endpoint subsequence in the left child node S₁ until no endpoint subsequence exists, after which backtracking merging is performed on the binary tree storage structure to obtain the answer sequence. The answer sequence is then output. This addresses the technical problem that, because the structure and typesetting of scanned documents in real scenes are very complex, the output answers easily contain irrelevant characters and accuracy is low; the accuracy of the answers is thereby improved.

It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A method for determining a sequence of answers, comprising:

and outputting the answer sequence.

2. The method of claim 1, wherein before the clustering and detecting steps are repeatedly performed on the endpoint subsequence in the left child node S₁, the method further comprises:

3. The method according to claim 1, wherein the performing traceback merge processing on the binary tree storage structure includes:

4. The method of claim 3, wherein calculating the minimum horizontal distance between the parent node S_i+1 and the sibling node S_i+2 of the parent node comprises:

obtaining the minimum and maximum x coordinates (A1, A2) of the parent node S_i+1;

D = max(A2, B2) - min(A1, B1) - (B2 - B1) - (A2 - A1)

5. The method of claim 1, wherein the clustering the text box sequence to obtain a first subsequence and a second subsequence comprises:

6. The method of claim 5, wherein performing k-means clustering on the text box sequence to obtain a first subsequence and a second subsequence comprises:

7. The method of claim 1, wherein before acquiring the text box sequence and saving the text box sequence to the root node S₀ of the binary tree storage structure, the method further comprises:

detecting and recognizing the obtained scanned document by using an optical character recognition system to obtain a text box cluster;

arranging the text box clusters according to a preset sequence;

8. An answer sequence determination apparatus, comprising:

a merging unit, configured to, if an endpoint subsequence exists, save the endpoint subsequence to the left child node S₁ of the root node S₀, save the non-endpoint subsequence to the right child node S₂ of the root node S₀, and repeat the clustering and detecting steps on the endpoint subsequence in the left child node S₁ until no endpoint subsequence exists, then perform backtracking merging on the binary tree storage structure to obtain an answer sequence;

and the output unit is used for outputting the answer sequence.

9. A storage medium having stored therein executable instructions for causing a processor to perform operations corresponding to the determination method of answer sequence according to any one of claims 1-7.

10. A computer device, comprising: a processor, a memory;

the memory is used for storing executable instructions which enable the processor to execute the operation corresponding to the answer sequence determination method of any one of claims 1-7.