CN112733513A - Method, system, terminal and storage medium for automatically sorting airline driver change-back rules - Google Patents

Publication number
CN112733513A
Authority
CN
China
Prior art keywords
text
change
cell
information
label
Prior art date
Legal status
Pending
Application number
CN202110037555.3A
Other languages
Chinese (zh)
Inventor
朱小武
吴芹
陈志刚
冯嵛
黄雪萍
Current Assignee
Tongcheng Network Technology Co Ltd
Original Assignee
Tongcheng Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Tongcheng Network Technology Co Ltd
Priority to CN202110037555.3A
Publication of CN112733513A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/177 Editing, e.g. inserting or deleting of tables; using ruled lines
    • G06F40/18 Editing, e.g. inserting or deleting of tables; using ruled lines of spreadsheets
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Abstract

The application relates to a method, a system, a terminal and a storage medium for automatically sorting airline refund-and-change rules, and belongs to the field of information technology. The method comprises: establishing a text coordinate system; acquiring text coordinate information of each character in a text to be processed; converting the text to be processed into an image file and obtaining an image processing result; acquiring table coordinate information in the image file; establishing text frames, wherein each text frame comprises a plurality of cells and corresponding sub-information segments; assigning a refund-and-change label to each cell of each text frame according to a preset label-assignment model; reading a historical refund-and-change rule table; filling the sub-information segments into the historical refund-and-change rule table according to the refund-and-change labels to obtain a current refund-and-change rule table; and feeding the current refund-and-change rule table back to an administrator terminal. The method and the device improve the accuracy of entering airline refund-and-change rules.

Description

Method, system, terminal and storage medium for automatically sorting airline driver change-back rules
Technical Field
The present application relates to the field of information technology, and in particular, to a method, a system, a terminal, and a storage medium for automatically sorting airline refund-and-change rules.
Background
With the rapid development of the internet, user demand for booking airline tickets online keeps increasing. Users can view and manage their orders and itinerary information online from a mobile phone. When travel plans change, a user can check the airline's refund-and-change rules online, learn the applicable handling fees in real time, and then refund or change the ticket.
At present, each online travel agency (OTA) platform collects the refund-and-change rules of airlines, and most of these rules are entered manually based on red-header documents issued by the airlines. After receiving an airline's document, OTA staff first read the complete document, then manually enter the refund-and-change information into an Excel spreadsheet according to the relevant tables in the document, supplement and edit the related extended information, and finally enter or import it into the refund-and-change rule information system. Users who open the refund-and-change rule page on a mobile phone can then view the current rules in real time.
The related art described above has the following drawback: when staff enter refund-and-change rule information manually, carelessness easily leads to entry errors, which in turn cause errors in passengers' refund and change procedures and fee calculations and can result in considerable losses for the company.
Disclosure of Invention
In order to improve the accuracy of entering airline refund-and-change rules, the application provides a method for automatically sorting airline refund-and-change rules.
In a first aspect, the application provides a method for automatically sorting airline refund-and-change rules, which adopts the following technical scheme:
a method for automatically sorting airline refund-and-change rules comprises the following steps:
establishing a text coordinate system according to the text to be processed;
acquiring text coordinate information of each character in the text to be processed according to the established text coordinate system;
converting the text to be processed into an image file and obtaining an image processing result;
acquiring table coordinate information in the image file according to the image processing result, wherein the table coordinate information comprises contour coordinate information of each table and corresponding cell coordinate information;
establishing text frames according to the table coordinate information and the text coordinate information, wherein each text frame comprises a plurality of cells and corresponding sub-information segments;
assigning a refund-and-change label to each cell of each text frame according to a preset label-assignment model, wherein the refund-and-change label comprises a main label and a sub-label;
reading a historical refund-and-change rule table, wherein the historical refund-and-change rule table comprises a plurality of main labels and sub-labels;
filling the sub-information segments into the historical refund-and-change rule table according to the refund-and-change labels to obtain a current refund-and-change rule table;
and feeding the current refund-and-change rule table back to an administrator terminal.
By adopting the technical scheme, the tables in the text to be processed issued by the airline, and the sub-information segment corresponding to each cell of those tables, are identified automatically; refund-and-change labels are assigned to the sub-information segments in the cells; and the current refund-and-change rule table is generated automatically from those labels. Manual entry is not needed, and the accuracy of refund-and-change rule entry is improved.
Optionally, each sub-label corresponds to a secondary label, a secondary label may include a plurality of sub-labels of the same type but different conditions, and assigning a refund-and-change label to the cells in each text frame according to a preset label-assignment model specifically includes:
acquiring title information of each text frame, and assigning a main label to each text frame according to the title information;
performing condition title screening on the sub-information segments in the text frame according to a preset analysis model, and acquiring a condition title screening result;
obtaining refund-and-change condition cells according to the condition title screening result;
assigning a sub-label to each refund-and-change condition cell according to a preset assignment principle;
acquiring refund-and-change information cells according to the refund-and-change condition cells;
acquiring vertical cells according to the refund-and-change condition cell and a preset vertical acquisition principle;
assigning the sub-label of the current condition cell to the vertical cells;
acquiring horizontal cells according to a preset horizontal acquisition principle;
and assigning the sub-label of the current condition cell to the horizontal cells.
By adopting this method, the cells are divided into refund-and-change condition cells and refund-and-change information cells, and at least one refund-and-change label is assigned to each cell according to the division result, which improves the accuracy of refund-and-change rule entry.
Optionally, the table coordinate information further includes page number information, and after the table coordinate information in the image file is acquired according to the image processing result, the method further includes:
judging, according to the page number information, whether there are two tables located on two consecutive pages;
if so, judging, according to the contour coordinate information, whether the minimum vertical coordinate difference between the two tables on the consecutive pages equals a preset merge value;
if so, merging the two tables on the consecutive pages into a new table;
and acquiring the table coordinate information of the new table according to the contour coordinate information.
By adopting the technical scheme, the refund-and-change condition cells are screened out first and assigned sub-labels; the refund-and-change information cells corresponding to each condition cell are then obtained, and sub-labels are assigned to them according to the vertical and horizontal acquisition principles. This improves the accuracy of label assignment and therefore the accuracy of refund-and-change rule entry.
Optionally, after obtaining the table coordinate information in the image file according to the image processing result, the method further includes:
judging, according to the page number information, whether there are two tables located on two consecutive pages;
if so, obtaining the minimum vertical coordinate difference between the two tables on the consecutive pages according to the contour coordinate information;
judging whether the minimum vertical coordinate difference equals a preset merge value;
if so, merging the two tables on the consecutive pages into a new table;
and acquiring the table coordinate information of the new table according to the contour coordinate information.
By adopting the technical scheme, the tables are calibrated, which prevents the system from treating one table as two because of the page layout and reduces the occurrence of incomplete data acquisition.
Optionally, after assigning a refund-and-change label to the cells in each text frame according to a preset label-assignment model, the method further includes:
acquiring all the refund-and-change information cells;
judging whether the main labels and sub-labels carried by two or more refund-and-change information cells are completely identical;
if so, marking the cells carrying completely identical main labels and sub-labels as abnormal cells;
acquiring the text frame corresponding to each abnormal cell, and marking the acquired text frame as an abnormal text frame;
judging whether the abnormal text frame comprises two or more abnormal cells;
if so, clearing the main labels and sub-labels of all the cells in the abnormal text frame.
By adopting the technical scheme, the main labels and sub-labels carried by the refund-and-change information cells are verified, which improves the accuracy of the current refund-and-change rule table.
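The verification steps above can be sketched as follows. This is only an illustrative sketch: the dictionary layout of a cell (`frame`, `is_info`, `main`, `subs`) and the function name are assumptions made for the example and do not come from the application.

```python
from collections import defaultdict

def clear_abnormal_frames(cells):
    """Clear labels in text frames that hold two or more abnormal cells.

    cells: list of dicts with hypothetical keys
      'frame'   - id of the text frame the cell belongs to
      'is_info' - True for an information cell, False for a condition cell
      'main'    - main label or None
      'subs'    - set of sub-labels
    Returns the set of frame ids whose labels were cleared.
    """
    # Group information cells by their complete (main label, sub-labels) combination.
    groups = defaultdict(list)
    for cell in cells:
        if cell["is_info"] and cell["main"] is not None:
            groups[(cell["main"], frozenset(cell["subs"]))].append(cell)

    # Cells sharing an identical combination with another cell are abnormal.
    abnormal = [c for group in groups.values() if len(group) >= 2 for c in group]

    # Count abnormal cells per text frame.
    per_frame = defaultdict(int)
    for cell in abnormal:
        per_frame[cell["frame"]] += 1
    bad_frames = {frame for frame, count in per_frame.items() if count >= 2}

    # Clear the labels of every cell inside an abnormal frame.
    for cell in cells:
        if cell["frame"] in bad_frames:
            cell["main"] = None
            cell["subs"] = set()
    return bad_frames
```

Grouping by the full label combination means that two cells sharing only the main label, which is expected inside one text frame, are not flagged.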
Optionally, after clearing the main labels and sub-labels of all the cells in the abnormal text frame, the method further includes:
performing text word segmentation on the text to be processed, and obtaining a word segmentation result, wherein the word segmentation result comprises a plurality of feature samples;
reading the main label and sub-label corresponding to a refund-and-change information cell, and marking the obtained main label and sub-label as a sample to be processed;
acquiring, according to the sample to be processed and the word segmentation result, the feature samples whose similarity to the sample to be processed exceeds a preset threshold, and marking the acquired samples as comparison samples;
obtaining the distribution density of the comparison samples;
determining a feature block according to the distribution density;
acquiring the feature block coordinate information of the feature block;
obtaining the cell coordinate information of the current refund-and-change information cell;
judging whether the refund-and-change information cell is an abnormal cell according to the feature block coordinate information and the cell coordinate information of the current refund-and-change information cell;
if the cell is judged to be an abnormal cell, clearing the main label and sub-label of the abnormal cell.
By adopting the technical scheme, the refund-and-change information cells undergo a second check. The main label and sub-label carried by an information cell are taken as the sample to be processed, the text to be processed is segmented into a plurality of feature samples, and the comparison samples whose similarity to the sample to be processed exceeds a preset threshold are obtained. Whether the main label and sub-label of the information cell are accurate is then judged from the distribution density of the comparison samples and the distance between the feature block and the information cell, which further improves the accuracy of refund-and-change rule entry.
Optionally, based on a database including a plurality of template sentence libraries, performing text word segmentation on the text to be processed and obtaining a word segmentation result specifically includes:
dividing the text to be processed into a plurality of sentence segments according to a preset division rule;
performing text word segmentation on each sentence segment with a first word segmentation model, and obtaining a first sentence segment processing result, wherein the first sentence segment processing result comprises the phrases into which the sentence segment is divided and the part of speech corresponding to each phrase;
judging whether the first sentence segment processing result is reasonable according to the parts of speech corresponding to the phrases of the sentence segment;
if so, marking the first sentence segment processing result as the template sentence segment processing result;
if not, performing text word segmentation on the current sentence segment with a second word segmentation model, and obtaining a second sentence segment processing result;
acquiring the template sentence segment processing result according to the first sentence segment processing result and the second sentence segment processing result;
integrating the template sentence segment processing results of all sentence segments and outputting the word segmentation result;
and storing the sentence segments and the corresponding template sentence segment processing results into the database.
By adopting the technical scheme, the first sentence segment processing result is checked for reasonableness. If it may be unreasonable, the second word segmentation model segments the sentence segment again, and a reasonable template sentence segment processing result is obtained by comparison, which provides accurate feature samples for verifying the main labels and sub-labels of the refund-and-change information cells. The template sentence segment processing results are also stored in the database as learning templates, so the accuracy of text word segmentation improves continuously.
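The two-model fallback described above might look like the following sketch. The segmentation models and the reasonableness check are passed in as plain callables because the application does not name concrete models. The comparison rule used here (prefer the second result when it passes the check, otherwise keep the first) is an assumption, and all function and parameter names are hypothetical.

```python
def segment_with_fallback(segment, first_model, second_model, is_reasonable, store):
    """Return the template result for one sentence segment and record it in store.

    first_model / second_model: callables mapping a sentence segment to a list of phrases.
    is_reasonable: callable judging a segmentation result (for example by part of speech).
    store: dict standing in for the database of template results.
    """
    first = first_model(segment)
    if is_reasonable(first):
        template = first
    else:
        # Assumed comparison rule: take the second result if it is reasonable,
        # otherwise fall back to the first result.
        second = second_model(segment)
        template = second if is_reasonable(second) else first
    store[segment] = template  # kept as a learning template
    return template


def segment_text(segments, first_model, second_model, is_reasonable, store):
    """Integrate the template results of all sentence segments into one phrase list."""
    phrases = []
    for segment in segments:
        phrases.extend(
            segment_with_fallback(segment, first_model, second_model, is_reasonable, store)
        )
    return phrases
```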
Optionally, filling the sub-information segments into the historical refund-and-change rule table according to the refund-and-change labels to generate the current refund-and-change rule table specifically includes:
filling the refund-and-change information cells into the historical refund-and-change rule table according to the assigned refund-and-change labels to generate a preliminary refund-and-change rule table;
regularizing the contents of the sub-information segments in the preliminary refund-and-change rule table, and acquiring a regularization result;
and generating the current refund-and-change rule table according to the regularization result.
By adopting the technical scheme, the sub-information segments in the preliminary refund-and-change rule table are regularized, which yields more concise refund-and-change information that is easier for users to consult.
In a second aspect, the present application provides a system for automatically sorting airline refund-and-change rules, which adopts the following technical scheme:
a system for automatically sorting airline refund-and-change rules, comprising:
the establishing module, used for establishing a text coordinate system according to the text to be processed;
the text module, used for acquiring the text coordinate information of each character in the text to be processed according to the established text coordinate system;
the image module, used for converting the text to be processed into an image file and obtaining an image processing result;
the table module, used for acquiring table coordinate information in the image file according to the image processing result, wherein the table coordinate information comprises contour coordinate information of each table, corresponding cell coordinate information and page number information;
the generating module, used for establishing text frames according to the table coordinate information and the text coordinate information, wherein each text frame comprises a plurality of cells and corresponding sub-information segments;
the distribution module, used for assigning a refund-and-change label to each cell of each text frame according to a preset label-assignment model, wherein the refund-and-change label comprises a main label and a sub-label;
the reading module, used for reading a historical refund-and-change rule table, wherein the historical refund-and-change rule table comprises a plurality of main labels and sub-labels;
the filling module, used for filling the sub-information segments into the historical refund-and-change rule table according to the refund-and-change labels to obtain a current refund-and-change rule table;
and the feedback module, used for feeding the current refund-and-change rule table back to the administrator terminal.
By adopting the technical scheme, the text to be processed is scanned automatically, the tables in it are screened out, text frames are established, refund-and-change labels are assigned to the cells in the text frames, and the current refund-and-change rule table is generated according to those labels. Manual entry is not needed, and the accuracy of refund-and-change rule entry is improved.
In a third aspect, the present application provides an intelligent terminal, which adopts the following technical solution:
an intelligent terminal comprising a memory and a processor, said memory having stored thereon a computer program that can be loaded by the processor to execute the method according to the first aspect.
By adopting the technical scheme, an administrator can obtain the latest refund-and-change rule information from the fed-back current refund-and-change rule table without manually sorting and entering the information, which improves working efficiency and avoids entry errors caused by manual input.
In a fourth aspect, the present application provides a computer-readable storage medium, which adopts the following technical solution:
a computer-readable storage medium storing a computer program that can be loaded by a processor to execute the method according to the first aspect.
By adopting the technical scheme, after the computer-readable storage medium is loaded into any computer, the computer can execute the method for automatically sorting airline refund-and-change rules provided by the application.
In summary, the present application includes at least one of the following beneficial technical effects:
1. the current refund-and-change rule table is generated automatically from the text to be processed, so manual entry is not needed and the accuracy of refund-and-change rule entry is improved;
2. the sub-information segments in the refund-and-change information cells are regularized and the text is cleaned, so the generated current refund-and-change rule table is more concise and clear, and users can consult the refund-and-change rules conveniently.
Drawings
Fig. 1 is a schematic flowchart of a method for automatically sorting airline refund-and-change rules according to an embodiment of the present application.
Fig. 2 is a schematic flowchart of assigning a refund-and-change label to the cells in each text frame according to an embodiment of the present application.
Fig. 3 is an exemplary diagram of assigning a refund-and-change label to the cells in each text frame according to an embodiment of the present application.
Fig. 4 is a schematic flowchart of calibrating a refund-and-change information cell according to an embodiment of the present application.
Fig. 5 is an exemplary diagram of generating a preliminary refund-and-change rule table according to an embodiment of the present application.
Fig. 6 is a schematic flowchart of text word segmentation performed on a text to be processed according to an embodiment of the present application.
Fig. 7 is a block diagram of the structure of a system for automatically sorting airline refund-and-change rules according to an embodiment of the present application.
Description of reference numerals: 1. establishing module; 2. text module; 3. image module; 4. table module; 5. generating module; 6. distribution module; 7. reading module; 8. filling module; 9. feedback module.
Detailed Description
The present application is described in further detail below with reference to figures 1-7.
The embodiment of the application discloses a method for automatically sorting airline refund-and-change rules. Referring to fig. 1, the method includes:
S100: establishing a text coordinate system according to the text to be processed.
The text to be processed is specifically a red-header document issued by an airline. The document comprises text information and several tables, and the airline's refund-and-change rules can be obtained by sorting through it.
Specifically, a text coordinate system can be established according to the text to be processed, with the last line of the text as the abscissa axis and the first column of the text as the ordinate axis. The coordinate system uses the width of one byte as the abscissa unit and the height of one byte as the ordinate unit, so each character in the text to be processed corresponds to one coordinate value. For example, the fourth character of the fifth line from the bottom has the coordinates [4,5].
S200: acquiring the text coordinate information of each character in the text to be processed according to the established text coordinate system.
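As an illustration of S100 and S200, the sketch below assigns a coordinate to every character of a text given as a list of lines. It assumes, based on the [4,5] example, that columns are counted from 1 starting at the first column and that lines are counted from 1 starting at the last line. The function name is hypothetical.

```python
def text_coordinates(lines):
    """Return (character, (x, y)) pairs for every character in the text.

    x is the 1-based column counted from the first column;
    y is the 1-based line number counted upward from the last line.
    """
    total = len(lines)
    coords = []
    for row, line in enumerate(lines):
        y = total - row  # the last line gets y = 1
        for x, ch in enumerate(line, start=1):
            coords.append((ch, (x, y)))
    return coords


# The fourth character of the fifth line from the bottom gets the coordinates (4, 5).
sample = ["abcdef", "line4", "line3", "line2", "line1"]
fourth_char = [c for c in text_coordinates(sample) if c[1] == (4, 5)]
```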
S300: converting the text to be processed into an image file and obtaining an image processing result.
S300 specifically includes:
S301: converting the file to be processed into an image file.
In this embodiment, the text to be processed is initially in PDF format, and the file to be processed is converted into the PDF picture format by conventional conversion means to obtain the image file.
S302: binarizing the image file.
In this embodiment, because many algorithms require binary input, the image file is binarized to obtain a binarization result, which facilitates later processing of the image file. Specifically, the binarization processing includes dilating and eroding the image file and extracting contour lines.
S303: acquiring table coordinate information according to the binarization result.
Specifically, the vertical lines, the horizontal lines, and the intersections between them in the image file may be determined from the contour lines acquired in S302; the tables in the image file may be determined from these vertical lines, horizontal lines, and intersections; and the table coordinate information may be determined based on the text coordinate system established in S100.
In this embodiment, each table is made up of a plurality of cells, and the cells can be determined from the intersections, so the acquired table coordinate information specifically includes the contour coordinate information, cell coordinate information and page number information corresponding to each table. Specifically, the contour coordinate information includes the coordinate values of two diagonal corners of the table, and the cell coordinate information includes the coordinate values of two diagonal corners of each cell; the position and range of the table can be determined from the contour coordinate information and the cell coordinate information. For example, if the contour coordinate information is {[2,50] ∪ [25,42]}, the range of the current table relative to the text coordinate system can be determined from it.
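To illustrate how cells can follow from the intersections, the sketch below derives cell corner pairs from the x positions of the vertical lines and the y positions of the horizontal lines. It assumes a regular grid without merged cells, which the source does not guarantee, and it stores each cell as (top-left, bottom-right) with y growing upward, matching the corner order in the contour example. The function names are hypothetical.

```python
def cells_from_grid(xs, ys):
    """Build cells from vertical-line x positions and horizontal-line y positions.

    Each cell is ((left, top), (right, bottom)); y grows upward, so top > bottom.
    Assumes a regular grid with no merged cells.
    """
    xs = sorted(set(xs))
    ys = sorted(set(ys), reverse=True)  # top row first
    cells = []
    for top, bottom in zip(ys, ys[1:]):
        for left, right in zip(xs, xs[1:]):
            cells.append(((left, top), (right, bottom)))
    return cells


def contour_from_cells(cells):
    """Return the table contour as its two diagonal corners."""
    left = min(c[0][0] for c in cells)
    top = max(c[0][1] for c in cells)
    right = max(c[1][0] for c in cells)
    bottom = min(c[1][1] for c in cells)
    return ((left, top), (right, bottom))
```

With vertical lines at x = 2, 10, 25 and horizontal lines at y = 50, 46, 42, this gives four cells whose contour is ((2, 50), (25, 42)).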
Further, because the file to be processed is in the PDF picture format, a table may be split into two tables by the page layout, which would affect later data processing. Therefore, after the table coordinate information is generated, it may be determined from the page number information of each table whether two tables have the same abscissa range and are located on two consecutive pages. If so, the minimum ordinate difference between the two tables is obtained, and it is determined whether this difference equals a preset merge value. If it does, the two tables are merged and new table coordinate information is generated. The merge value is specifically the sum of the bottom margin of the first page and the top margin of the second page.
For example, suppose the contour coordinates of table A are {[2,50] ∪ [25,42]} on page P32, and the contour coordinates of table B are {[2,38] ∪ [25,30]} on page P33. The abscissa range of both tables is 23 units, the ordinate difference between the two tables is 4 units, and the preset merge value is 4, so table A and table B are merged into a new table whose contour coordinates are {[2,50] ∪ [25,30]}.
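The merge check can be sketched as below and replays the table A and table B example. The `Table` class and `try_merge` function are hypothetical names, and keeping the first page number for the merged table is an assumption, since the source does not say which page the new table is assigned to.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Table:
    page: int
    top_left: Tuple[int, int]      # (x, y), y grows upward
    bottom_right: Tuple[int, int]

def try_merge(a: Table, b: Table, merge_value: int) -> Optional[Table]:
    """Merge table b into table a if b continues a on the next page, else return None."""
    if b.page != a.page + 1:
        return None
    # Both tables must cover the same abscissa range.
    if (a.top_left[0], a.bottom_right[0]) != (b.top_left[0], b.bottom_right[0]):
        return None
    # Minimum ordinate difference: bottom edge of a minus top edge of b.
    gap = a.bottom_right[1] - b.top_left[1]
    if gap != merge_value:
        return None
    # Assumption: the merged table keeps the page number of the first table.
    return Table(a.page, a.top_left, b.bottom_right)

table_a = Table(32, (2, 50), (25, 42))
table_b = Table(33, (2, 38), (25, 30))
merged = try_merge(table_a, table_b, merge_value=4)
```

Here `merged` spans from (2, 50) to (25, 30), as in the example above.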
S400: establishing text frames according to the table coordinate information and the text coordinate information.
Specifically, the text information inside the range enclosed by each table can be screened out according to the contour coordinate information and the text coordinate information. The text information of each table is then divided into a plurality of sub-information segments according to the cells, with each sub-information segment corresponding to one cell, and a text frame is thereby established.
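A minimal sketch of this step follows. It assumes that characters arrive in reading order as (character, (x, y)) pairs, that cells are ((left, top), (right, bottom)) corner pairs with y growing upward, and that a character lying on a shared border goes to the first matching cell. The function name and the index-based frame layout are hypothetical.

```python
def build_text_frame(cells, chars):
    """Group characters into sub-information segments, one per cell.

    cells: list of ((left, top), (right, bottom)) corner pairs.
    chars: list of (character, (x, y)) in reading order.
    Returns a dict mapping cell index to the text found inside that cell.
    """
    frame = {index: "" for index in range(len(cells))}
    for ch, (x, y) in chars:
        for index, ((left, top), (right, bottom)) in enumerate(cells):
            if left <= x <= right and bottom <= y <= top:
                frame[index] += ch
                break  # characters outside every cell are simply skipped
    return frame
```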
S500: assigning a refund-and-change label to the cells within each text frame.
The refund-and-change label specifically comprises a main label and a sub-label.
With reference to fig. 2, S500 specifically includes:
S501: acquiring the title information of each text frame.
Specifically, according to the contour coordinate information corresponding to the text frame, the vertical coordinate range of the table contour is expanded by one unit in both the positive and negative directions of the Y axis of the text coordinate system; text enclosed in "< >" marks within the expanded range is obtained, and the text inside the marks is taken as the title information.
For example, referring to fig. 3, the contour coordinates corresponding to table A are {[2,51] ∪ [16,41]}; text enclosed in "< >" marks within the range {[2,52] ∪ [16,40]} is obtained and marked as the title information.
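The title lookup in S501 might be sketched as follows. The sketch assumes the text is available as a mapping from the y coordinate of a line to its content and that the title marks are the "< >" pair shown in the example; the marks are parameters in case the source document uses a different bracket style. The function name and data layout are hypothetical.

```python
import re

def find_title(contour, lines_by_y, open_mark="<", close_mark=">"):
    """Search the contour, expanded by one unit up and down, for text inside title marks.

    contour: ((left, top), (right, bottom)) with y growing upward.
    lines_by_y: dict mapping a line's y coordinate to its text (columns are 1-based).
    Returns the first marked text found from top to bottom, or None.
    """
    (left, top), (right, bottom) = contour
    pattern = re.compile(re.escape(open_mark) + r"(.+?)" + re.escape(close_mark))
    # Rows top + 1 down to bottom - 1, inclusive.
    for y in range(top + 1, bottom - 2, -1):
        visible = lines_by_y.get(y, "")[left - 1:right]  # keep only the table's columns
        match = pattern.search(visible)
        if match:
            return match.group(1)
    return None
```

For the contour ((2, 51), (16, 41)), the rows searched run from y = 52 down to y = 40, as in the example.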
S502: assigning a main label to the text frame according to the title information.
In this embodiment, the database contains a plurality of refund-and-change labels, each comprising a main label and a sub-label, and the main labels correspond one-to-one to airlines. For example, the main label corresponding to China United Airlines is the China United Airlines main label, and the main label corresponding to China Eastern Airlines is the China Eastern Airlines main label. Keywords are read from the title information, each text frame is assigned one main label according to the keywords read, and every cell in the text frame is assigned that same main label. With reference to fig. 3, the title information is "China United Airlines Domestic Fare Applicable Conditions Table"; the keyword "China United Airlines" is extracted from it, and the "China United Airlines main label" is assigned to the current text frame.
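A keyword lookup of this kind can be sketched as below. The keyword table stands in for the database, and its entries ("AirlineA", "AirlineB") are placeholders invented for the example rather than values from the application.

```python
# Hypothetical keyword-to-main-label table standing in for the database.
MAIN_LABELS = {
    "AirlineA": "AirlineA main label",
    "AirlineB": "AirlineB main label",
}

def assign_main_label(title, cell_ids, keyword_to_label=MAIN_LABELS):
    """Give every cell of a text frame the main label matched by the frame's title.

    Returns a dict mapping cell id to main label; empty if no keyword matches.
    """
    for keyword, label in keyword_to_label.items():
        if keyword in title:
            return {cell_id: label for cell_id in cell_ids}
    return {}
```

Every cell in the frame receives the same main label, so the lookup is done once per text frame rather than once per cell.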
S503: Number each cell in the text frame according to a preset rule.
Specifically, the cells in the text frame are numbered from left to right and from top to bottom, taking the text coordinate axes as the reference. For example, the numbering result for table A is shown in fig. 3.
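The numbering order of S503 can be sketched as follows. The sketch assumes each cell is represented by the coordinates of its upper-left corner and that the Y axis points upward, so that "top to bottom" means decreasing y; `number_cells` is a hypothetical helper name.

```python
def number_cells(cells):
    """Number cells from left to right, then top to bottom.

    cells: list of (x_left, y_top) tuples; larger y means higher on the page (assumption).
    """
    ordered = sorted(cells, key=lambda cell: (-cell[1], cell[0]))
    return {index + 1: cell for index, cell in enumerate(ordered)}

# Four cells of a hypothetical 2 x 2 table, given in arbitrary order.
numbered = number_cells([(9, 51), (2, 46), (2, 51), (9, 46)])
```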
S504: Screen the sub-information segments in the text frame for conditional titles according to a preset analysis model.
Specifically, conditional-title keywords are preset in the database. Whether each sub-information segment is a conditional title can be judged from its text information, so that conditional-title screening is performed and a conditional-title screening result is obtained.
S505: Determine the change-back condition cells according to the conditional-title screening result.
A cell whose sub-information segment is judged to be a conditional title is marked as a change-back condition cell. Specifically, the keywords in each sub-information segment are read to judge whether conditional-title information is present; if it is, the corresponding cell is marked as a change-back condition cell. For example, cell (3), cell (6), cell (7), cell (8), cell (9), cell (12), cell (13) and cell (16) may be obtained as change-back condition cells.
S506: Assign a sub-label to each change-back condition cell.
Specifically, a corresponding sub-label is allocated to each change-back condition cell according to the keyword information in that cell.
The database comprises a plurality of secondary labels; each secondary label comprises a plurality of sub-labels, and each sub-label belongs to exactly one secondary label.
In this example, the secondary labels include a cabin-class label, a product-name label, a change label and a time-period label. The cabin-class label comprises a plurality of sub-labels represented by codes, such as the W sub-label and the Y sub-label, and is used to represent the change-back conditions for each cabin class. The product-name label comprises a plurality of sub-labels represented by product names, such as the comfortable flyer sub-label and the popular tourist sub-label, and is used to represent the change-back conditions for each type of product. The change label comprises a plurality of sub-labels representing the kind of change-back, including the voluntary change sub-label and the voluntary refund sub-label. The time-period label comprises a plurality of sub-labels represented by time periods, namely the (> 168), (168-72), (72-4) and (< 4) sub-labels, which represent the change-back conditions that apply when the user submits the request more than 168 hours, 168 to 72 hours, 72 to 4 hours, and less than 4 hours before departure, respectively.
With reference to fig. 3, if the keyword "W" is identified in cell (8), the "W sub-label" is assigned to cell (8), indicating that the cell corresponds to the change-back conditions for cabin class W. By analogy, the "P sub-label" is assigned to cell (12), the "Y sub-label" to cell (16), the "comfortable flyer sub-label" to cell (9), the "popular tourist sub-label" to cell (13), the "voluntary change sub-label" to cell (3), the "(> 168) sub-label" to cell (6), and the "(168-72) sub-label" to cell (7).
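The keyword lookup behind S504 to S506 can be sketched as a small table that maps each secondary label to its sub-labels. The dictionary contents mirror the examples in the text; the matching rule (exact match for single-letter cabin codes, substring match otherwise) and the name `assign_sub_label` are assumptions made only for illustration.

```python
# Secondary label -> sub-labels, following the examples given in the description.
SUB_LABELS = {
    "cabin class": ["W", "P", "Y"],
    "product name": ["comfortable flyer", "popular tourist"],
    "change": ["voluntary change", "voluntary refund"],
    "time period": ["> 168", "168-72", "72-4", "< 4"],
}

def assign_sub_label(cell_text):
    """Return (secondary label, sub-label) for a condition cell, or None for other cells."""
    text = cell_text.strip()
    for secondary, subs in SUB_LABELS.items():
        for sub in subs:
            # Single-letter codes must match the whole cell to avoid false hits.
            if text == sub or (len(sub) > 1 and sub in text):
                return secondary, sub
    return None
```

A cell for which the lookup returns `None` carries no conditional-title keyword and would be treated as a change-back information cell in the following steps.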
S507: Acquire the change-back information cells.
Specifically, a change-back information cell is a cell in the current text frame that carries no sub-label. Referring to fig. 3, the change-back information cells in table A are cell (10), cell (11), cell (14), cell (15), cell (17) and cell (18).
S508: Acquire the longitudinal cells corresponding to each change-back condition cell according to a longitudinal acquisition principle.
Specifically, within the current text frame, the longitudinal cells of a change-back condition cell are the change-back information cells that lie within the abscissa range of the condition cell and whose ordinate is smaller than that of the condition cell.
With reference to fig. 3, the longitudinal cells corresponding to change-back condition cell (3) are cell (10), cell (11), cell (14), cell (15), cell (17) and cell (18); change-back condition cell (8) has no corresponding longitudinal cells.
S509: Assign the sub-label of the current change-back condition cell to its longitudinal cells.
S510: Acquire the horizontal cells according to a horizontal acquisition principle.
Specifically, within the current text frame, the horizontal cells of a change-back condition cell are the change-back information cells that lie within the ordinate range of the condition cell and whose abscissa is larger than that of the condition cell.
Referring to fig. 3, the horizontal cells corresponding to change-back condition cell (12) are cell (14) and cell (15); change-back condition cell (6) has no corresponding horizontal cells.
S511: Assign the sub-label of the current change-back condition cell to its horizontal cells.
For example, referring to fig. 3, in table A, taking cell (3) as the current change-back condition cell, its longitudinal cells are obtained, namely cell (10), cell (11), cell (14), cell (15), cell (17) and cell (18), all of which take the same secondary label as cell (3); the sub-label of cell (3), the "voluntary change sub-label", is therefore assigned to these longitudinal cells. Cell (3) has no horizontal cells. Taking cell (8) as the current change-back condition cell, it is judged that cell (8) has no longitudinal cells; its horizontal cells, namely cell (10) and cell (11), are obtained and given the sub-label of cell (8), the "W sub-label". After this calculation, cell (10) finally carries four sub-labels, the "W sub-label", "voluntary change sub-label", "(> 168) sub-label" and "comfortable flyer sub-label", together with the "China United Airlines main label".
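The longitudinal and horizontal propagation of S508 to S511 can be sketched as follows. The sketch assumes every cell is an axis-aligned rectangle with the Y axis pointing upward; the `Cell` class, the `propagate` function and the small table used here are hypothetical and do not reproduce fig. 3.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

@dataclass
class Cell:
    name: str
    x0: float  # left edge
    x1: float  # right edge
    y0: float  # bottom edge
    y1: float  # top edge
    sub_label: Optional[str] = None  # set only on change-back condition cells
    inherited: Set[str] = field(default_factory=set)

def overlaps(a0, a1, b0, b1):
    """True if the open intervals (a0, a1) and (b0, b1) intersect."""
    return a0 < b1 and b0 < a1

def propagate(cells):
    """Give each information cell the sub-labels of the condition cells above it or to its left."""
    conditions = [c for c in cells if c.sub_label is not None]
    infos = [c for c in cells if c.sub_label is None]
    for cond in conditions:
        for info in infos:
            # Longitudinal: same abscissa range, information cell lies below.
            below = overlaps(info.x0, info.x1, cond.x0, cond.x1) and info.y1 <= cond.y0
            # Horizontal: same ordinate range, information cell lies to the right.
            right = overlaps(info.y0, info.y1, cond.y0, cond.y1) and info.x0 >= cond.x1
            if below or right:
                info.inherited.add(cond.sub_label)
    return {c.name: c.inherited for c in infos}

table = [
    Cell("change", 4, 12, 8, 10, "voluntary change"),  # spans two columns
    Cell("t1", 4, 8, 6, 8, "> 168"),
    Cell("t2", 8, 12, 6, 8, "168-72"),
    Cell("w", 0, 4, 4, 6, "W"),
    Cell("a", 4, 8, 4, 6),
    Cell("b", 8, 12, 4, 6),
]
labels = propagate(table)
```

In this small table, information cell "a" ends up with the sub-labels "voluntary change", "> 168" and "W", which is the same kind of label combination that the text describes for cell (10).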
S600: Calibrate the change-back information cells according to the change-back labels.
With reference to fig. 4, S600 specifically includes:
S601: Judge whether two or more change-back information cells carry completely identical change-back labels.
If not, jump to S606;
if yes, jump to S602.
If the main label and sub-labels carried by two or more change-back information cells are completely identical, an error occurred when the change-back labels were allocated, and further correction is needed.
S602: Mark the change-back information cells carrying identical change-back labels as abnormal cells.
S603: Acquire the abnormal text frame according to the abnormal cells.
Specifically, the text frame containing an abnormal cell can be obtained from the coordinate information of that cell, and the text frame obtained is marked as an abnormal text frame.
S604: Judge whether the abnormal text frame contains two or more abnormal cells.
If yes, jump to S605;
if not, jump to S606.
S605: Clear the change-back labels of all cells in the abnormal text frame.
If a text frame contains a plurality of abnormal cells, the sub-information segments in that text frame are unreliable; clearing the cell data in the abnormal text frame avoids errors when the change-back information is entered.
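The duplicate check of S601 to S605 can be sketched by grouping cells on the combination of main label and sub-labels. The dictionary layout of a cell (`id`, `frame`, `main`, `subs`) and the name `find_abnormal` are assumptions for illustration only.

```python
from collections import defaultdict

def find_abnormal(cells):
    """Mark cells sharing identical labels as abnormal and clear frames with two or more of them.

    cells: list of dicts with keys 'id', 'frame', 'main' and 'subs' (a set of sub-labels).
    Returns (ids of abnormal cells, set of cleared frames); cleared cells are modified in place.
    """
    groups = defaultdict(list)
    for cell in cells:
        groups[(cell["main"], frozenset(cell["subs"]))].append(cell)
    abnormal = [cell for group in groups.values() if len(group) >= 2 for cell in group]

    abnormal_per_frame = defaultdict(int)
    for cell in abnormal:
        abnormal_per_frame[cell["frame"]] += 1
    frames_to_clear = {frame for frame, count in abnormal_per_frame.items() if count >= 2}

    for cell in cells:
        if cell["frame"] in frames_to_clear:
            cell["main"], cell["subs"] = None, set()
    return [cell["id"] for cell in abnormal], frames_to_clear

cells = [
    {"id": 1, "frame": "A", "main": "CUA", "subs": {"W", "> 168"}},
    {"id": 2, "frame": "A", "main": "CUA", "subs": {"W", "> 168"}},
    {"id": 3, "frame": "B", "main": "CUA", "subs": {"Y", "> 168"}},
]
abnormal_ids, cleared_frames = find_abnormal(cells)
```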
S606: Perform text word segmentation on the text to be processed and obtain a word segmentation result.
Specifically, the text to be processed is processed using Chinese word segmentation, which divides a sentence into a plurality of phrases according to part of speech. For example, "I am currently making a table" may be split into "I" (pronoun), "currently" (adverb), "make" (verb) and "table" (noun).
The word segmentation result comprises a plurality of feature samples, namely each phrase obtained by the processing together with its part of speech.
S607: Read the main label and sub-labels of the change-back information cell, and mark them as the sample to be processed.
S608: Obtain the comparison samples.
Specifically, the comparison samples are obtained from the sample to be processed and the word segmentation result: each feature sample whose similarity to the sample to be processed exceeds a preset threshold is marked as a comparison sample. For example, if the sample to be processed is "China United Airlines + voluntary change + W + comfortable flyer", every feature sample carrying any of these words is marked as a comparison sample.
S609: Obtain the distribution density of the comparison samples.
The distribution density is the density of comparison samples per unit area.
S610: Determine the feature block according to the distribution density.
The feature block is the unit-area block in which the comparison samples are most densely distributed.
S611: Acquire the feature block coordinate information of the feature block.
S612: Acquire the cell coordinate information of the current change-back information cell.
S613: Judge whether the change-back information cell is an abnormal cell according to the feature block coordinate information and the cell coordinate information of the current change-back information cell.
If yes, jump to S614;
if not, jump to S700.
Specifically, the distance between the feature block and the cell in the ordinate direction is determined from the feature block coordinate information and the cell coordinate information. This longitudinal distance indicates how closely the change-back information cell is tied to the related text information. If it exceeds a preset threshold, the tie is weak and the sub-information segment in the change-back information cell is unreliable, so the current change-back information cell is marked as an abnormal cell.
S614: Clear the change-back label of the abnormal cell.
After S614 is finished, jump to S700.
In this example, once the main label and sub-labels of an abnormal cell have been cleared, the sub-information segment corresponding to that cell is not entered when the current change-back rule table is produced later.
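The density-based check of S609 to S613 can be sketched in one dimension. The sketch simplifies the "unit area" to horizontal bands of fixed height along the Y axis, and the band height and distance threshold are arbitrary illustrative values; the application does not specify them, and `feature_block_y` and `is_abnormal` are hypothetical names.

```python
from collections import Counter

def feature_block_y(sample_ys, block=10.0):
    """Return the centre ordinate of the band that holds the most comparison samples."""
    counts = Counter(int(y // block) for y in sample_ys)
    band, _ = max(counts.items(), key=lambda item: item[1])
    return (band + 0.5) * block

def is_abnormal(cell_y, sample_ys, threshold=30.0, block=10.0):
    """A cell is abnormal if it lies further than `threshold` from the feature block along Y."""
    return abs(cell_y - feature_block_y(sample_ys, block)) > threshold

# Ordinates of comparison samples: four close together, one stray occurrence.
sample_ys = [41, 43, 47, 48, 120]
centre = feature_block_y(sample_ys)
```

A cell located near the dense band is kept, while a cell far away along the Y axis is flagged so that its labels can be cleared in S614.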
S700: Read the historical change-back rule table.
The historical change-back rule table is the change-back rule table before updating. After an airline issues new change-back rules, the historical change-back rule table must be revised according to the official document (red-header file) issued by the airline so as to obtain the latest change-back rule table. With reference to fig. 5, the historical change-back rule table consists of a plurality of change-back information cells and change-back condition cells; each change-back information cell has one main label and a plurality of sub-labels, and no two change-back information cells carry the same combination of main label and sub-labels.
S800: Fill the sub-information segments into the historical change-back rule table according to the change-back labels to obtain a preliminary change-back rule table.
Specifically, each sub-information segment corresponds to one cell, and each cell is paired with one main label and a plurality of sub-labels. For each change-back information cell in the historical change-back rule table, the cell carrying exactly the same main label and sub-labels is matched, and the sub-information segment of the matched cell is filled into the historical change-back rule table, which yields the preliminary change-back rule table. With reference to fig. 3 and fig. 5, cell (10) in table A carries four sub-labels, the "W sub-label", "voluntary change sub-label", "(> 168) sub-label" and "comfortable flyer sub-label", and the main label "China United Airlines main label". A cell in the historical change-back rule table carries the same labels, namely the "W sub-label", "voluntary change sub-label", "(> 168) sub-label", "comfortable flyer sub-label" and "China United Airlines main label"; the sub-information segment of cell (10) in table A is therefore filled into that cell, updating the historical change-back rule table and yielding the preliminary change-back rule table.
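The label matching of S800 can be sketched by using the combination of main label and sub-labels as a dictionary key. Representing the historical change-back rule table as a dictionary, and the names `label_key` and `fill_history`, are assumptions for illustration.

```python
def label_key(main, subs):
    """Order-independent key built from the main label and the set of sub-labels."""
    return (main, frozenset(subs))

def fill_history(history, cells):
    """Return a copy of the historical table with matching cells overwritten.

    history: dict mapping label_key(...) -> text of the historical cell.
    cells:   list of (main, subs, text) tuples taken from the newly parsed table.
    """
    updated = dict(history)
    for main, subs, text in cells:
        key = label_key(main, subs)
        # Cells whose labels were cleared (main is None) or that match nothing are skipped.
        if main is not None and key in updated:
            updated[key] = text
    return updated

history = {
    label_key("CUA", {"W", "> 168"}): "old rule",
    label_key("CUA", {"Y", "> 168"}): "unchanged rule",
}
new_cells = [
    ("CUA", {"> 168", "W"}, "10% fee"),
    (None, set(), "cleared cell"),
    ("CUA", {"P"}, "no matching slot"),
]
preliminary = fill_history(history, new_cells)
```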
S900: Perform regularization processing on the sub-information segment contents in the preliminary change-back rule table.
The regularization processing removes invalid and special characters from the sub-information segments so as to extract the effective change-back rule data. After regularization, the change-back rule table is simpler and easier for a user to consult.
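The clean-up of S900 can be sketched with regular expressions. Which characters count as invalid or special is not stated in the text, so the character whitelist below is purely an assumption, as is the function name `regularize`.

```python
import re

def regularize(segment):
    """Drop characters outside an assumed whitelist and collapse runs of whitespace."""
    # Keep word characters (including CJK), whitespace and a few punctuation marks.
    cleaned = re.sub(r"[^\w\s%.,:;()<>/+-]", "", segment)
    return re.sub(r"\s+", " ", cleaned).strip()

cleaned_rule = regularize("  Fee:\t10% ★★ of fare\n")
```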
S1000: Generate the current change-back rule table.
Specifically, the current change-back rule table is the preliminary change-back rule table after its sub-information segments have undergone regularization processing.
S1100: Feed the current change-back rule table back to the administrator terminal.
Furthermore, the cells corresponding to abnormal cells can be highlighted in a bright color in the current change-back rule table to prompt the administrator to check them a second time, thereby ensuring the accuracy of the current change-back rule table.
For S606, in this embodiment, the flow of text word segmentation on the text to be processed may be as shown in fig. 6:
S061: Divide the text to be processed into a plurality of sentence segments according to a preset division rule.
Specifically, the full stop "。" may be used as the division criterion, with each passage ending in "。" treated as one sentence segment.
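The division of S061 can be sketched as a simple split on the full stop. The function name `split_segments` is hypothetical, and the sketch drops the delimiter itself and any empty fragments, which is a design choice not stated in the text.

```python
def split_segments(text, delimiter="。"):
    """Split text into sentence segments at each full stop, ignoring empty fragments."""
    return [part.strip() for part in text.split(delimiter) if part.strip()]

segments = split_segments("退票收取10%。改期免费。")
```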
S062: Perform text word segmentation on each sentence segment using a first word segmentation model to obtain a first sentence segment processing result.
S061 to S069 rely on a database that holds a plurality of template sentence segments; sentence segments in the text to be processed can be segmented following the word segmentation pattern of the template sentence segments.
Specifically, the first word segmentation model performs text word segmentation on a sentence segment, splitting it into a plurality of phrases, each carrying a corresponding part-of-speech identifier.
S063: Judge whether the first sentence segment processing result is reasonable according to the part-of-speech composition of the sentence segment.
If yes, jump to S064;
if not, jump to S065.
Specifically, a sentence is composed of phrases with different parts of speech, so whether the first sentence segment processing result is reasonable can be judged from whether the parts of speech of the phrases making up the sentence segment are reasonable. For example, if the processing result shows that a sentence segment consists entirely of phrases whose part of speech is adjective, the first sentence segment processing result is judged unreasonable.
S064: Mark the first sentence segment processing result as the template sentence segment processing result.
After S064 is finished, jump to S069.
S065: Perform text word segmentation on the current sentence segment using a second word segmentation model, and obtain a second sentence segment processing result.
If the result obtained by the first word segmentation model for the current sentence segment is unreasonable, the second word segmentation model is used to segment the current sentence segment. The two word segmentation models use different processing models, so segmenting the same sentence segment can yield two different sentence segment processing results.
S066: Judge whether the first sentence segment processing result is consistent with the second sentence segment processing result.
If yes, jump to S064;
if not, jump to S067.
If the first sentence segment processing result is completely consistent with the second sentence segment processing result, the first sentence segment processing result is reasonable.
S067: Judge whether the number of part-of-speech types contained in the first sentence segment processing result is greater than the number contained in the second sentence segment processing result.
If not, jump to S068;
if yes, jump to S064.
If the first sentence segment processing result contains more part-of-speech types than the second, the first result is the more reasonable of the two and is taken as the template sentence segment processing result. For example, for the sentence segment "physics is really hard to learn", the first word segmentation model yields the following first sentence segment processing result: "physics" (noun), "learn" (verb), "up" (possible verb), "really" (adverb), "hard" (adjective). The second word segmentation model yields a second sentence segment processing result whose phrases carry four different parts of speech. Since the first result carries five different parts of speech and the second carries four, the first sentence segment processing result is taken as the template sentence segment processing result.
S068: Mark the second sentence segment processing result as the template sentence segment processing result.
S069: Store the sentence segments and their corresponding template sentence segment processing results in the database.
Specifically, the most reasonable sentence segment processing result is stored in the database, where the template sentence segment processing results can be used for training and provide templates for subsequent word segmentation, improving word segmentation accuracy and, in turn, the accuracy of the current change-back rule table.
S0610: Integrate all template sentence segment processing results.
S0611: Output the word segmentation result according to all template sentence segment processing results.
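The selection logic of S062 to S068 can be sketched as follows. The two word segmentation models are passed in as plain functions that return lists of (phrase, part of speech) pairs; the reasonableness rule used here (more than one part-of-speech type) is only a stand-in for the unspecified rule of S063, and all names are hypothetical.

```python
def pos_types(result):
    """Set of part-of-speech tags in a segmentation result of (phrase, tag) pairs."""
    return {tag for _, tag in result}

def is_reasonable(result):
    # Stand-in rule (assumption): a result made of a single part of speech is unreasonable.
    return len(pos_types(result)) > 1

def choose_template(sentence, first_model, second_model):
    """Pick the template result for one sentence segment, following S062 to S068."""
    first = first_model(sentence)
    if is_reasonable(first):
        return first                      # S063 yes -> S064
    second = second_model(sentence)       # S065
    if first == second:
        return first                      # S066 yes -> S064
    if len(pos_types(first)) > len(pos_types(second)):
        return first                      # S067 yes -> S064
    return second                         # S067 no -> S068

def segment_text(segments, first_model, second_model):
    """Integrate the template results of all sentence segments (S0610, S0611)."""
    return [pair for s in segments for pair in choose_template(s, first_model, second_model)]
```

Passing the models as arguments keeps the sketch independent of any particular segmentation library; any two callables with the same return shape can be plugged in.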
The implementation principle is as follows: the text coordinate information and table coordinate information of the text to be processed are matched with each other; the change-back condition cells and change-back information cells are obtained according to keywords; a change-back label is allocated to each cell; and the historical change-back rule table is updated according to the change-back labels to obtain the current change-back rule table. The current change-back rule table is thus generated automatically, which effectively improves the accuracy with which airline change-back rules are entered.
Based on the above method, an embodiment of the application also discloses a system for automatically sorting airline change-back rules. Referring to fig. 7, the system includes: a building module 1, a character module 2, an image module 3, a table module 4, a generating module 5, an allocating module 6, a reading module 7, a filling module 8 and a feedback module 9.
The building module 1 is configured to establish a text coordinate system according to the text to be processed.
The character module 2 is configured to acquire the text coordinate information of each character in the text to be processed according to the established text coordinate system.
The image module 3 is configured to acquire table coordinate information in the image file according to the image processing result, wherein the table coordinate information comprises the contour coordinate information of each table, the corresponding cell coordinate information and page number information.
The table module 4 is configured to acquire table coordinate information in the image file according to the image processing result, wherein the table coordinate information comprises the contour coordinate information of each table, the corresponding cell coordinate information and page number information.
The generating module 5 is configured to establish text frames according to the table coordinate information and the text coordinate information, each text frame comprising a plurality of cells and corresponding sub-information segments.
The allocating module 6 is configured to allocate a change-back label to the cells of each text frame according to a preset allocation marking model, the change-back label comprising a main label and a sub-label.
The reading module 7 is configured to read a historical change-back rule table, the historical change-back rule table comprising a plurality of main labels and sub-labels.
The filling module 8 is configured to fill the sub-information segments into the historical change-back rule table according to the change-back labels so as to obtain the current change-back rule table.
The feedback module 9 is configured to feed the current change-back rule table back to the administrator terminal.
An embodiment of the application also discloses an intelligent terminal comprising a memory and a processor, wherein the memory stores a computer program that can be loaded by the processor to execute the above method for automatically sorting airline change-back rules.
An embodiment of the application further discloses a computer-readable storage medium storing a computer program that can be loaded by a processor to execute the above method for automatically sorting airline change-back rules. The computer-readable storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above examples are only used to illustrate the technical solutions of the present application, and do not limit the scope of protection of the application. It is to be understood that the embodiments described are only some of the embodiments of the present application and not all of them. All other embodiments, which can be derived by a person skilled in the art from these embodiments without making any inventive step, are within the scope of the present application.

Claims (10)

1. A method for automatically sorting airline change-back rules, characterized by comprising the following steps:
establishing a text coordinate system according to the text to be processed;
acquiring text coordinate information of each character in the text to be processed according to the established text coordinate system;
converting the text to be processed into an image file and obtaining an image processing result;
acquiring table coordinate information in the image file according to the image processing result, wherein the table coordinate information comprises contour coordinate information and cell coordinate information of each table;
establishing text frames according to the table coordinate information and the text coordinate information, wherein each text frame comprises a plurality of cells and corresponding sub-information segments;
allocating a change-back label to the cells of each text frame according to a preset allocation marking model, wherein the change-back label comprises a main label and a sub-label;
reading a historical change-back rule table, wherein the historical change-back rule table comprises a plurality of main labels and sub-labels;
filling the sub-information segments into the historical change-back rule table according to the change-back labels to obtain a current change-back rule table;
and feeding back the current change-back rule table to an administrator terminal.
2. The method according to claim 1, wherein each sub-label corresponds to a secondary label, each secondary label may include a plurality of sub-labels of the same type but for different conditions, and allocating a change-back label to the cells in each text frame according to the preset allocation marking model specifically includes:
acquiring title information of each text frame, and assigning a main label to each text frame according to the title information;
performing conditional-title screening on the sub-information segments in the text frame according to a preset analysis model, and acquiring a conditional-title screening result;
obtaining change-back condition cells according to the conditional-title screening result;
allocating a sub-label to each change-back condition cell according to a preset allocation principle;
acquiring change-back information cells according to the change-back condition cells;
acquiring longitudinal cells according to the change-back condition cell and a preset longitudinal acquisition principle;
assigning the sub-label of the current change-back condition cell to the longitudinal cells;
acquiring horizontal cells according to a preset horizontal acquisition principle;
and assigning the sub-label of the current change-back condition cell to the horizontal cells.
3. The method according to claim 1, wherein the table coordinate information further includes page number information, and after acquiring the table coordinate information in the image file according to the image processing result, the method further includes:
judging, according to the page number information, whether there are two tables located on two consecutive pages;
if so, judging, according to the contour coordinate information, whether the minimum vertical-coordinate difference between the two tables on the consecutive pages equals a preset merging value;
if so, merging the two tables on the consecutive pages into a new table;
and acquiring the table coordinate information of the new table according to the contour coordinate information.
4. The method according to claim 2, wherein after allocating a change-back label to the cells in each text frame according to the preset allocation marking model, the method further comprises:
acquiring all the change-back information cells;
judging whether the main labels and sub-labels carried by two or more change-back information cells are completely identical;
if so, marking the change-back information cells carrying completely identical main labels and sub-labels as abnormal cells;
acquiring the text frame corresponding to the abnormal cells, and marking the acquired text frame as an abnormal text frame;
judging whether the abnormal text frame comprises two or more abnormal cells;
if so, clearing the main labels and sub-labels corresponding to all the cells in the abnormal text frame.
5. The method according to claim 4, wherein after clearing the main labels and sub-labels corresponding to all the cells in the abnormal text frame, the method further comprises:
performing text word segmentation on the text to be processed, and obtaining a word segmentation result, wherein the word segmentation result comprises a plurality of feature samples;
reading the main label and sub-labels corresponding to the change-back information cell, and marking them as the sample to be processed;
acquiring, according to the sample to be processed and the word segmentation result, the feature samples whose similarity to the sample to be processed exceeds a preset threshold, and marking the acquired samples as comparison samples;
obtaining the distribution density of the comparison samples;
determining a feature block according to the distribution density;
acquiring the feature block coordinate information of the feature block;
obtaining the cell coordinate information of the current change-back information cell;
judging whether the change-back information cell is an abnormal cell according to the feature block coordinate information and the cell coordinate information of the current change-back information cell;
if so, clearing the main label and sub-labels of the abnormal cell.
6. The method according to claim 5, wherein the method is based on a database containing a plurality of template sentence libraries, and the performing the text segmentation processing on the text to be processed and obtaining the segmentation processing result specifically comprises:
dividing a text to be processed into a plurality of sentence segments according to a preset division rule;
performing text word segmentation processing on each sentence segment by adopting a first word segmentation model, and obtaining a first sentence segment processing result, wherein the first sentence segment processing result comprises a plurality of phrases into which the sentence segment is divided and a part of speech corresponding to each phrase;
judging whether the first sentence segment processing result is reasonable according to the part of speech corresponding to each phrase of the sentence segment;
if so, marking the first sentence segment processing result as the template sentence segment processing result;
if not, performing text word segmentation processing on the current sentence segment by adopting a second word segmentation model, and obtaining a second sentence segment processing result;
acquiring the template sentence segment processing result according to the first sentence segment processing result and the second sentence segment processing result;
integrating the template sentence segment processing results of all sentence segments and outputting the word segmentation processing result;
and storing the sentence segments and the corresponding template sentence segment processing results into the database.
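The two-model fallback recited in claim 6 can be sketched as follows. This is a hedged illustration only: the punctuation-based division rule, the use of an "x" part-of-speech tag to mean "unrecognised", the rule of preferring whichever result has fewer unrecognised phrases, and the cache lookup in `database` are assumptions, and `first` and `second` stand for any two segmenters supplied by the caller.

```python
import re
from typing import Callable, Dict, List, Tuple

Token = Tuple[str, str]                      # (phrase, part of speech)
Segmenter = Callable[[str], List[Token]]


def split_segments(text: str) -> List[str]:
    """Assumed division rule: split on sentence-level punctuation."""
    return [s.strip() for s in re.split(r"[;;。!?!?]", text) if s.strip()]


def unknown_count(tokens: List[Token]) -> int:
    """Assumed reasonableness signal: number of phrases tagged 'x' (unrecognised)."""
    return sum(1 for _, pos in tokens if pos == "x")


def segment_text(text: str, first: Segmenter, second: Segmenter,
                 database: Dict[str, List[Token]]) -> List[Token]:
    result: List[Token] = []
    for seg in split_segments(text):
        if seg in database:                  # assumed reuse of stored template results
            tokens = database[seg]
        else:
            tokens = first(seg)
            if unknown_count(tokens) > 0:    # first result judged unreasonable
                alt = second(seg)
                if unknown_count(alt) < unknown_count(tokens):
                    tokens = alt             # keep the better of the two results
            database[seg] = tokens           # store segment and its template result
        result.extend(tokens)
    return result
```

Passing the segmenters in as callables keeps the sketch independent of any particular word segmentation library.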
7. The method according to claim 1, wherein filling the sub information segments into the historical change-back rule table according to the change-back labels to obtain the current change-back rule table specifically comprises:
filling the change-back information cells into the historical change-back rule table according to the assigned change-back labels to generate a preliminary change-back rule table;
regularizing the sub information segment contents in the preliminary change-back rule table, and acquiring a regularization processing result;
and generating the current change-back rule table according to the regularization processing result.
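A minimal sketch of the fill-then-regularize sequence in claim 7 is given below. The dictionary layout keyed by (main label, sub label), the function names, and the specific normalisation rules (full-width percent sign, whitespace collapsing) are assumptions chosen for illustration; the claim does not specify what the regularization consists of.

```python
import re
from typing import Dict, List, Optional, Tuple

Label = Tuple[str, str]                      # (main label, sub label)


def fill_table(history: Dict[Label, str],
               cells: List[Tuple[Optional[Label], str]]) -> Dict[Label, str]:
    """Copy the historical table and overwrite entries whose labels match a cell."""
    preliminary = dict(history)
    for label, text in cells:
        if label in preliminary:             # unlabelled or unknown cells are skipped
            preliminary[label] = text
    return preliminary


def regularize(text: str) -> str:
    """Assumed normalisation: unify percent signs and collapse whitespace."""
    text = text.replace("%", "%")
    text = re.sub(r"\s+", " ", text).strip()
    return re.sub(r"(\d)\s+%", r"\1%", text)


def build_current_table(history: Dict[Label, str],
                        cells: List[Tuple[Optional[Label], str]]) -> Dict[Label, str]:
    preliminary = fill_table(history, cells)
    return {label: regularize(text) for label, text in preliminary.items()}
```

Cells whose labels were cleared as abnormal can be passed with a label of `None`, in which case they leave the table untouched.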
8. A system for automatically sorting airline change-back rules, characterized by comprising:
the building module (1) is used for building a text coordinate system according to the text to be processed;
the character module (2) is used for acquiring the text coordinate information of each character in the text to be processed according to the established text coordinate system;
the image module (3) is used for acquiring form coordinate information in the image file according to the image processing result, wherein the form coordinate information comprises contour coordinate information of each form, corresponding cell coordinate information and page number information;
the table module (4) is used for acquiring table coordinate information in the image file according to the image processing result, wherein the table coordinate information comprises contour coordinate information of each table, corresponding cell coordinate information and page number information;
the generating module (5) is used for establishing text frames according to the table coordinate information and the text coordinate information, and each text frame comprises a plurality of cells and corresponding sub information sections;
the distribution module (6) is used for distributing a change-back label for each cell of each text frame according to a preset distribution mark model, wherein the change-back label comprises a main label and a sub label;
the reading module (7) is used for reading a historical change-back rule table, and the historical change-back rule table comprises a plurality of main labels and sub labels;
the filling module (8) is used for filling the sub information segments into the historical change-back rule table according to the change-back labels to obtain a current change-back rule table;
and the feedback module (9) is used for feeding back the current change-back rule table to the administrator terminal.
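The role of the generating module (5), which combines table coordinate information with text coordinate information to form text frames, can be sketched as follows. The `Char` and `CellBox` classes, the containment test, and the top-to-bottom, left-to-right reading order are assumptions for illustration rather than details taken from the claim.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Char:
    ch: str
    x: float
    y: float
    page: int


@dataclass(frozen=True)
class CellBox:
    cell_id: str
    x0: float
    y0: float
    x1: float
    y1: float
    page: int


def build_text_frame(chars: List[Char], cells: List[CellBox]) -> Dict[str, str]:
    """Assign each character to the first cell on its page that contains it."""
    buckets: Dict[str, List[Char]] = {c.cell_id: [] for c in cells}
    for ch in chars:
        for c in cells:
            if (c.page == ch.page
                    and c.x0 <= ch.x <= c.x1
                    and c.y0 <= ch.y <= c.y1):
                buckets[c.cell_id].append(ch)
                break
    # Assumed reading order: top to bottom, then left to right.
    return {
        cid: "".join(k.ch for k in sorted(v, key=lambda k: (k.y, k.x)))
        for cid, v in buckets.items()
    }
```

Each value in the returned mapping plays the part of the sub information segment of one cell, ready for label assignment by the distribution module (6).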
9. An intelligent terminal, comprising a memory and a processor, the memory having stored thereon a computer program that can be loaded by the processor and that executes the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which can be loaded by a processor and which executes the method of any one of claims 1 to 7.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110037555.3A CN112733513A (en) 2021-01-12 2021-01-12 Method, system, terminal and storage medium for automatically sorting airline driver change-back rules

Publications (1)

Publication Number Publication Date
CN112733513A (en) 2021-04-30

Family

ID=75591443

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110037555.3A Pending CN112733513A (en) 2021-01-12 2021-01-12 Method, system, terminal and storage medium for automatically sorting airline driver change-back rules

Country Status (1)

Country Link
CN (1) CN112733513A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113487338A (en) * 2021-07-26 2021-10-08 携程商旅信息服务(上海)有限公司 Ticket refunding processing method and system, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
JP4343213B2 (en) Document processing apparatus and document processing method
JP3940491B2 (en) Document processing apparatus and document processing method
CN112417873B (en) Automatic cartoon generation method and system based on BBWC model and MCMC
CN114610892A (en) Knowledge point annotation method and device, electronic equipment and computer storage medium
CN115630648A (en) Address element analysis method and system for man-machine conversation and computer readable medium
CN116401376A (en) Knowledge graph construction method and system for manufacturability inspection
CN112269872A (en) Resume analysis method and device, electronic equipment and computer storage medium
CN113468317B (en) Resume screening method, system, equipment and storage medium
CN112733513A (en) Method, system, terminal and storage medium for automatically sorting airline driver change-back rules
CN112988982B (en) Autonomous learning method and system for computer comparison space
JP2004178010A (en) Document processor, its method, and program
JP2006309347A (en) Method, system, and program for extracting keyword from object document
CN115130437B (en) Intelligent document filling method and device and storage medium
CN112528642A (en) Implicit discourse relation automatic identification method and system
KR20110039900A (en) Iamge data recognition and managing method for ancient documents using intelligent recognition library and management tool
CN111783416A (en) Method for constructing document image data set by using prior knowledge
CN110765107A (en) Question type identification method and system based on digital coding
CN115455986A (en) Spanish language place name translation method, device, equipment and medium
CN115116069A (en) Text processing method and device, electronic equipment and storage medium
CN115203415A (en) Resume document information extraction method and related device
CN114238654A (en) Knowledge graph construction method and device and computer readable storage medium
CN113111869A (en) Method and system for extracting text picture and description thereof
CN115995087B (en) Document catalog intelligent generation method and system based on fusion visual information
CN116912867B (en) Teaching material structure extraction method and device combining automatic labeling and recall completion
CN117808923B (en) Image generation method, system, electronic device and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination