CN117235345B - Open format document OFD searching method and device and electronic equipment - Google Patents
- Publication number
- CN117235345B (application CN202311527407.5A)
- Authority
- CN
- China
- Prior art keywords
- coding
- value
- target character
- search
- ofd
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides an open format document OFD searching method, an open format document OFD searching device, electronic equipment and a storage medium, and relates to the technical field of computers, wherein the method comprises the following steps: acquiring a search request for a target character in an OFD document; the search request is used for indicating a first code value of the target character; searching the OFD document based on the first coding value of the target character to obtain a first search result; under the condition that the first search result is a search failure, performing secondary search on the OFD document based on the first coding value and the coding table of the target character to obtain a second search result; the encoding table includes an association relationship between a first encoding value and a second encoding value of each of the plurality of characters. Through the first coding value and the coding table of the target characters, secondary searching of the OFD document can be automatically realized, and searching accuracy and efficiency of the OFD document are improved.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to an open format document OFD searching method, apparatus, and electronic device.
Background
An open format document (Open Fixed-layout Document, OFD) is an electronic document format developed independently in China. It is widely used in China, especially in finance, insurance, government affairs and similar fields.
In existing OFD document search technology, a text search takes the Unicode code values of the input text and compares them with the Unicode code values of all text in the document content; where the values match, the position information of the input text in the document is returned. Because the comparison relies solely on Unicode code values, the search fails to find a character when two different Unicode codes exist for characters with the same glyph.
Disclosure of Invention
The invention provides an open format document OFD searching method, device and electronic equipment, which are used for solving the prior-art problem that a character cannot be found when two different Unicode codes exist for characters with the same glyph.
The invention provides an OFD searching method for open format documents, which comprises the following steps:
acquiring a search request for a target character in an OFD document; the search request is used for indicating a first code value of the target character;
searching the OFD document based on the first coding value of the target character to obtain a first search result;
under the condition that the first search result is a search failure, performing secondary search on the OFD document based on the first coding value and the coding table of the target character to obtain a second search result; the encoding table includes an association relationship between the first encoding value and a second encoding value for each of a plurality of characters.
According to the open format document OFD searching method provided by the present invention, performing the secondary search on the OFD document based on the first coding value and the coding table of the target character to obtain the second search result includes:
determining whether the first encoded value of the target character exists in the encoding table;
replacing the first code value of the target character with the second code value corresponding to it in the code table when the first code value exists in the code table;
and carrying out secondary search on the OFD document based on the second coding value of the target character to obtain the second search result.
According to the open format document OFD searching method provided by the present invention, the second searching for the OFD document based on the second coding value of the target character, to obtain the second searching result, includes:
matching the second coding values of the target characters with third coding values of all characters corresponding to the OFD document respectively;
under the condition that the second code value is successfully matched with the third code value of any character corresponding to the OFD document, determining that the second search result is successful in search;
and under the condition that the second code value is failed to match with the third code value of all the characters corresponding to the OFD document, determining the second search result as search failure.
According to the open format document OFD searching method provided by the present invention, the determining whether the first coding value of the target character exists in the coding table includes:
searching the target character in the coding table;
determining that the first coding value of the target character exists in the coding table under the condition that the target character is found;
in the case that the target character is not found, determining that the first encoded value of the target character does not exist in the encoding table.
According to the open format document OFD searching method provided by the invention, the OFD document is searched based on the first coding value of the target character to obtain a first searching result, and the method comprises the following steps:
matching the first coding values of the target characters with the third coding values of all characters corresponding to the OFD document respectively;
under the condition that the first coding value of the target character is failed to be matched with the third coding values of all characters corresponding to the OFD document, determining that the first search result is failed in search;
and under the condition that the first coding value of the target character is successfully matched with the third coding value of any character corresponding to the OFD document, determining that the first search result is successful in search.
According to the method for searching the open format document OFD provided by the invention, the character set consisting of a plurality of characters included in the coding table is a character set of Kangxi radicals, the first coding value is a Chinese character coding value, and the second coding value is a Kangxi radical coding value.
The invention also provides an OFD searching device for the open format document, which comprises:
the acquisition module is used for acquiring a search request of target characters in the OFD document; the search request is used for indicating a first code value of the target character;
the first search module is used for searching the OFD document based on the first coding value of the target character to obtain a first search result;
the second search module is used for carrying out secondary search on the OFD document based on the first coding value and the coding table of the target character under the condition that the first search result is search failure, so as to obtain a second search result; the encoding table includes an association relationship between the first encoding value and a second encoding value for each of a plurality of characters.
The invention also provides electronic equipment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor implements the open format document OFD searching method described in any one of the above when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements an open layout document OFD search method as described in any one of the above.
According to the open format document OFD searching method, the open format document OFD searching device, the electronic equipment and the storage medium, a searching request for target characters in the OFD document is obtained; the search request is used for indicating a first code value of the target character; searching the OFD document based on the first coding value of the target character to obtain a first search result; under the condition that the first search result is a search failure, performing secondary search on the OFD document based on the first coding value and the coding table of the target character to obtain a second search result; the encoding table includes an association relationship between a first encoding value and a second encoding value of each of the plurality of characters. Because the encoding table comprises the association relation between the first encoding value and the second encoding value of each character in the plurality of characters, the secondary search of the OFD document can be automatically realized through the first encoding value and the encoding table of the target character, and the searching accuracy and efficiency of the OFD document are improved.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of an open layout document OFD searching method provided by the invention;
FIG. 2 is a schematic diagram of code values corresponding to characters in different coding modes according to the present invention;
FIG. 3 is a second flow chart of the open layout document OFD searching method provided by the invention;
FIG. 4 is a schematic structural diagram of an open layout document OFD searching device provided by the invention;
FIG. 5 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
To facilitate a clearer understanding of the various embodiments of the present application, some relevant knowledge is presented.
Unicode is a standard coding system for representing text characters. It defines a unique identifier, called a code point, for each character in the character set. A code point is written in hexadecimal, typically as the prefix "U+" followed by four to six hexadecimal digits; for example, the code point of the Latin capital letter "A" is U+0041 and the code point of the Chinese character "啊" is U+554A.
Unicode is only responsible for assigning unique codes to characters and is not concerned with how a character is displayed. In practice, one visible character may therefore correspond to two different Unicode codes whose display effect is the same. For example, the character "二" corresponds to two different Unicode codes, U+2F06 and U+4E8C, and both are displayed as "二".
User interface (UI) based searches are typically driven by visual appearance. For characters with the same display effect, the Unicode code value in the search request may differ from the Unicode code value stored in the OFD document. In that case a search failure is returned, which is confusing for the user. Even a user who realizes that the failure is caused by inconsistent Unicode code values must do extra work to discover the other Unicode code for the character and then enter it through the input method, which in turn requires support from the input method. In general, an input method supports only one of the Unicode codes.
In actual use, text search of an OFD document usually relies on the Unicode code values of the text, so a character cannot be found when two different Unicode codes exist for characters with the same glyph. For example, if the Unicode code value stored for the character "二" in the OFD document is U+2F06, a search using U+4E8C returns a search failure.
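As a small illustration of the code point notation, the following Python sketch formats a character's code point in the "U+" form. Python is used here only for illustration, and the helper name `code_point_label` is hypothetical and not part of the invention.

```python
def code_point_label(ch: str) -> str:
    """Format the Unicode code point of a single character as U+XXXX."""
    # ord() returns the integer code point; pad to at least four hex digits.
    return f"U+{ord(ch):04X}"


print(code_point_label("A"))       # U+0041
print(code_point_label("\u554A"))  # U+554A
```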
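The failure described above can be reproduced with a few lines of Python. This is only a sketch: the sample string `document_text` is a made-up stand-in for text extracted from an OFD document, and plain string search stands in for the reader's search function.

```python
# Assumption for illustration: the document stores the Kangxi radical form.
document_text = "\u7B2C\u2F06\u7AE0"  # middle character is U+2F06
query = "\u4E8C"                      # U+4E8C, as typically produced by an input method

# The two characters look alike but are different code points,
# so a search based purely on code values finds nothing.
print(query == "\u2F06")          # False
print(document_text.find(query))  # -1
```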
Based on the problems, the invention provides an open format document OFD searching method, secondary searching of the OFD document can be automatically realized through the first coding value and the coding table of the target character, and the searching accuracy and efficiency of the OFD document are improved.
The open layout document OFD search method of the present invention is described below with reference to fig. 1 to 3.
FIG. 1 is the first flow chart of the open format document OFD searching method provided by the invention. As shown in FIG. 1, the method comprises steps 101 to 103, wherein:
Step 101, obtaining a search request for a target character in an OFD document; the search request is used for indicating a first coding value of the target character.
It should be noted that, the method for searching the open format document OFD provided by the present invention is suitable for a scenario of searching the content in the OFD document, and the execution body of the method may be an open format document OFD searching device, for example, an electronic apparatus, or a control module for executing the method for searching the open format document OFD in the open format document OFD searching device.
Specifically, when a user needs to search for content in an OFD document, the user opens the OFD document with an OFD document reader (e.g., Foxit Reader) and triggers a search request for the target character in the reader through an input method. The search request indicates a first code value of the target character, which may be a Unicode code value; for example, if the target character is "二", the first code value is 0x4E8C.
Step 102, searching the OFD document based on the first coding value of the target character to obtain a first search result.
Specifically, according to a first coding value of the target character, searching the OFD document, and obtaining a first search result; the first search result is successful or failed.
Step 103, performing secondary search on the OFD document based on the first coding value and the coding table of the target character to obtain a second search result under the condition that the first search result is search failure; the encoding table includes an association relationship between the first encoding value and a second encoding value for each of a plurality of characters.
It should be noted that, for different coding modes, each character corresponds to different coding values, some characters have a one-to-one association relationship, and some characters have a one-to-many association relationship.
FIG. 2 is a schematic diagram of code values corresponding to characters in different coding modes provided by the invention. As shown in FIG. 2, two characters "mu" have the same display effect. The coding modes shown include GB2312, BIG5, GBK, GB18030, UTF-8, UTF-16BE, UTF-16LE and Unicode. Whether under GB18030, UTF-8, UTF-16BE, UTF-16LE or Unicode, the two "mu" characters have different code values, so from the point of view of program code they are two completely different characters.
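The same effect can be checked in Python for any look-alike pair. Because FIG. 2 is not reproduced here, the sketch below uses the pair U+4E8C and U+2F06 from the earlier example as a stand-in for the two "mu" characters; that substitution is an assumption, and only encodings that can represent both characters are listed.

```python
han = "\u4E8C"      # CJK unified ideograph
radical = "\u2F06"  # Kangxi radical with the same appearance

ENCODINGS = ("gb18030", "utf-8", "utf-16-be", "utf-16-le")

# Encode both characters under each coding mode and keep the raw bytes.
encoded = {name: (han.encode(name), radical.encode(name)) for name in ENCODINGS}

for name, (han_bytes, radical_bytes) in encoded.items():
    # Print the byte sequences side by side; they differ in every encoding.
    print(f"{name:10} {han_bytes.hex().upper():10} {radical_bytes.hex().upper():10}")
```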
Specifically, the encoding table includes an association relationship between a first encoding value and a second encoding value of each of a plurality of characters, that is, one character corresponds to the first encoding value and the second encoding value.
Table 1 is the coding table provided by the present invention. As shown in Table 1, each entry has the form Tab[U1] = U2, where U1 and U2 are Unicode code values; for example, Tab[0x9FA0] = 0x2FD5, i.e. 0x9FA0 and 0x2FD5 correspond to the same character "龠".
TABLE 1 coding table
The character set composed of the plurality of characters included in the coding table is the character set of Kangxi radicals; the first coding value is a Chinese character coding value and the second coding value is a Kangxi radical coding value. Kangxi radicals are the radical system used in the Kangxi Dictionary and comprise 214 radicals in total. Unicode reserves the range [U+2F00, U+2FDF] for them, but because the Kangxi Dictionary records only 214 radicals, the range actually used is [U+2F00, U+2FD5]. Input methods rarely produce Kangxi radical codes directly, so the coding table provides the mapping from the Chinese character codes of common Chinese characters to the Kangxi radical codes.
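The text does not say how the coding table is produced. One plausible way to derive such a table, shown here purely as an assumption, is to read the compatibility decomposition that the Unicode character database records for each Kangxi radical. The function name `build_coding_table` is hypothetical.

```python
import unicodedata

KANGXI_FIRST, KANGXI_LAST = 0x2F00, 0x2FD5  # range actually used by the radicals


def build_coding_table() -> dict[int, int]:
    """Map a Chinese character code value to its Kangxi radical code value."""
    table: dict[int, int] = {}
    for radical in range(KANGXI_FIRST, KANGXI_LAST + 1):
        # For U+2F06 the database returns the string "<compat> 4E8C".
        parts = unicodedata.decomposition(chr(radical)).split()
        if len(parts) == 2 and parts[0] == "<compat>":
            table[int(parts[1], 16)] = radical
    return table


table = build_coding_table()
print(hex(table[0x4E8C]))  # 0x2f06
```

A table built this way depends on the Unicode data shipped with the Python interpreter and may not be identical to Table 1.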
In practice, under the condition that the first search result is a search failure, according to the first coding value and the coding table of the target character, performing secondary search on the OFD document to obtain a second search result; the second search result is successful or failed, and the position information of the target character in the OFD document can be directly searched through secondary search under the condition that the second search result is successful; and under the condition that the second search result is search failure, the fact that target characters needing to be searched do not exist in the OFD document is indicated. The secondary search of the OFD document is realized through the coding table, and the searching accuracy and efficiency of the OFD document are improved.
Optionally, if the first search result is a search success, the position information of the target character in the OFD document is obtained directly from the search based on the first coding value.
According to the open format document OFD searching method provided by the invention, the searching request of target characters in the OFD document is obtained; the search request is used for indicating a first code value of the target character; searching the OFD document based on the first coding value of the target character to obtain a first search result; under the condition that the first search result is a search failure, performing secondary search on the OFD document based on the first coding value and the coding table of the target character to obtain a second search result; the encoding table includes an association relationship between a first encoding value and a second encoding value of each of the plurality of characters. Because the encoding table comprises the association relation between the first encoding value and the second encoding value of each character in the plurality of characters, the secondary search of the OFD document is automatically realized through the first encoding value and the encoding table of the target character, and the search accuracy and the search efficiency of the OFD document are improved.
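Steps 101 to 103 can be sketched as a single function. This is a minimal sketch under stated assumptions, not the patented implementation: the document is represented as an already extracted list of code values, `CODING_TABLE` is a hypothetical two-entry excerpt of the coding table, and the names `find_positions` and `search_ofd_text` are invented for illustration.

```python
# Hypothetical excerpt: Chinese character code value -> Kangxi radical code value.
CODING_TABLE = {0x4E8C: 0x2F06, 0x9FA0: 0x2FD5}


def find_positions(doc_codes: list[int], code: int) -> list[int]:
    """Return every index at which `code` occurs in the document."""
    return [i for i, stored in enumerate(doc_codes) if stored == code]


def search_ofd_text(doc_codes: list[int], first_code: int) -> list[int]:
    """Search by the first code value, then retry once through the coding table."""
    positions = find_positions(doc_codes, first_code)  # first search
    if positions:
        return positions
    second_code = CODING_TABLE.get(first_code)
    if second_code is None:
        return []  # no alternative code value, search failure
    return find_positions(doc_codes, second_code)      # secondary search


doc = [ord(ch) for ch in "\u7B2C\u2F06\u7AE0"]
print(search_ofd_text(doc, 0x4E8C))  # [1]
```

An empty list plays the role of "search failure"; a non-empty list carries the position information.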
Optionally, the specific implementation manner of step 102 includes:
(1) Matching the first coding value of the target character against the third coding value of each character in the OFD document.
It should be noted that, when the OFD document is stored on the hard disk, each character in the OFD document is encoded, that is, each character corresponds to a third encoded value; for example, the third encoded value is a Unicode encoded value.
Specifically, the first code value of the target character input by the user is matched in turn against the third code values of all the characters in the OFD document.
(2) Determining that the first search result is a search failure when the first coding value of the target character fails to match the third coding values of all characters in the OFD document.
Specifically, when the first code value of the target character fails to match the third code values of all the characters in the OFD document, no character in the OFD document matches the target character, so the first search result is determined to be a search failure.
(3) Determining that the first search result is a search success when the first code value of the target character matches the third code value of any character in the OFD document.
Specifically, when the first code value of the target character matches the third code value of any character in the OFD document, a character matching the target character exists in the OFD document, so the first search result is determined to be a search success.
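The matching in (1) to (3) reduces to an "any match" test over the stored code values. The sketch below assumes the third code values have already been read from the document into a list; the function name `first_search` is hypothetical.

```python
def first_search(first_code: int, third_codes: list[int]) -> bool:
    """Return True for search success, False for search failure."""
    # Success as soon as any stored code value equals the requested one;
    # failure only if every comparison fails.
    return any(first_code == third for third in third_codes)


third_codes = [0x7B2C, 0x2F06, 0x7AE0]
print(first_search(0x7B2C, third_codes))  # True
print(first_search(0x4E8C, third_codes))  # False
```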
Optionally, the specific implementation manner of step 103 includes:
(a) Determining whether the first encoded value of the target character is present in the encoding table.
(b) Replacing the first code value of the target character with the second code value corresponding to it in the code table when the first code value exists in the code table.
Specifically, when the first code value of the target character exists in the code table, the first code value is replaced with the corresponding second code value in the code table, that is, the second code value is taken as the code value of the target character.
(c) Carrying out a secondary search on the OFD document based on the second coding value of the target character to obtain the second search result.
Specifically, according to the second coding value of the target character, the OFD document can be searched for a second time, and a second search result is obtained.
Optionally, the determining whether the first encoded value of the target character exists in the encoding table includes:
searching the target character in the coding table; determining that the first coding value of the target character exists in the coding table under the condition that the target character is found; in the case that the target character is not found, determining that the first encoded value of the target character does not exist in the encoding table.
Specifically, searching a target character in the coding table, namely matching the target character with all characters in the coding table, and determining that the target character is searched in the coding table under the condition that the target character is matched with any character in the coding table; in the event that the target character does not match all of the characters in the encoding table, it is determined that the target character is not found in the encoding table. Under the condition that the target character is found, determining that the first coding value of the target character exists in a coding table; in the case that the target character is not found, it is determined that the first code value of the target character does not exist in the code table.
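The table lookup and the replacement of the first code value can be expressed with a dictionary. In this sketch the one-entry `CODING_TABLE` and the function name `second_code_for` are hypothetical; returning `None` corresponds to the case where the first code value does not exist in the coding table.

```python
from typing import Optional

# Hypothetical one-entry excerpt of the coding table.
CODING_TABLE = {0x4E8C: 0x2F06}


def second_code_for(target: str, table: dict[int, int] = CODING_TABLE) -> Optional[int]:
    """Return the second code value for `target`, or None if it is not in the table."""
    # ord() yields the first code value; dict.get() performs the lookup.
    return table.get(ord(target))


print(second_code_for("\u4E8C") == 0x2F06)  # True
print(second_code_for("A"))                 # None
```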
Optionally, the performing a secondary search on the OFD document based on the second code value of the target character to obtain the second search result includes:
matching the second coding values of the target characters with third coding values of all characters corresponding to the OFD document respectively; under the condition that the second code value is successfully matched with the third code value of any character corresponding to the OFD document, determining that the second search result is successful in search; and under the condition that the second code value is failed to match with the third code value of all the characters corresponding to the OFD document, determining the second search result as search failure.
Specifically, the second coding value of the target character is matched in turn against the third coding values of all the characters in the OFD document. When the second coding value matches the third coding value of any character in the OFD document, a character matching the target character exists in the OFD document; the second search result is determined to be a search success, which indicates that the target character input by the user is present in the OFD document. When the second code value fails to match the third code values of all the characters in the OFD document, no character in the OFD document matches the target character; the second search result is determined to be a search failure, which indicates that the target character input by the user is not present in the OFD document.
FIG. 3 is a second flow chart of the open format document OFD searching method provided by the invention, as shown in FIG. 3, the method comprises steps 301-309; wherein,
Step 301, obtaining a search request for a target character in the OFD document, wherein the search request is used for indicating a first coding value of the target character.
Step 302, judging whether the first code value of the target character matches the third code value of any character in the OFD document. If the first code value matches the third code value of any character in the OFD document, proceed to step 303; if the first code value fails to match the third code values of all the characters in the OFD document, proceed to step 304.
Step 303, determining the first search result as successful search.
Step 304, determining the first search result as a search failure.
Step 305, determining whether the first code value of the target character exists in the code table. If the first encoded value is present in the encoding table, then the process proceeds to step 306; in case the first encoded value is not present in the encoding table, the process proceeds to step 309.
Step 306, replacing the first code value of the target character with the second code value corresponding to the first code value of the target character in the code table.
Step 307, judging whether the second code value of the target character matches the third code value of any character in the OFD document. If the second code value matches the third code value of any character in the OFD document, proceed to step 308; if the second code value fails to match the third code values of all the characters in the OFD document, proceed to step 309.
Step 308, determining the second search result as successful.
Step 309, determining the second search result as a search failure.
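To make the branching of FIG. 3 easier to follow, the sketch below returns the sequence of step numbers visited for a given input. It is an illustrative trace only: the function name `run_flow`, the list representation of the document and the dictionary representation of the coding table are assumptions.

```python
def run_flow(first_code: int, third_codes: list[int], table: dict[int, int]) -> list[int]:
    """Return the FIG. 3 step numbers visited for one search request."""
    steps = [301, 302]
    if first_code in third_codes:
        return steps + [303]            # first search succeeds
    steps += [304, 305]                 # first search fails, consult the table
    if first_code not in table:
        return steps + [309]            # no second code value available
    steps += [306, 307]                 # replace the code value, search again
    second_code = table[first_code]
    steps.append(308 if second_code in third_codes else 309)
    return steps


table = {0x4E8C: 0x2F06}
doc = [0x7B2C, 0x2F06]
print(run_flow(0x4E8C, doc, table))  # [301, 302, 304, 305, 306, 307, 308]
```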
The open format document OFD searching device provided by the invention is described below, and the open format document OFD searching device described below and the open format document OFD searching method described above can be referred to correspondingly.
FIG. 4 is a schematic structural diagram of the open format document OFD searching device provided by the present invention. As shown in FIG. 4, the open format document OFD searching device 400 includes an obtaining module 401, a first searching module 402, and a second searching module 403, wherein:
an obtaining module 401, configured to obtain a search request for a target character in an OFD document; the search request is used for indicating a first code value of the target character;
a first search module 402, configured to search the OFD document based on the first code value of the target character, to obtain a first search result;
a second search module 403, configured to perform a second search on the OFD document based on the first coding value and the coding table of the target character, to obtain a second search result if the first search result is a search failure; the encoding table includes an association relationship between the first encoding value and a second encoding value for each of a plurality of characters.
According to the open format document OFD searching device, the searching request for the target characters in the OFD document is obtained; the search request is used for indicating a first code value of the target character; searching the OFD document based on the first coding value of the target character to obtain a first search result; under the condition that the first search result is a search failure, performing secondary search on the OFD document based on the first coding value and the coding table of the target character to obtain a second search result; the encoding table includes an association relationship between a first encoding value and a second encoding value of each of the plurality of characters. Because the encoding table comprises the association relation between the first encoding value and the second encoding value of each character in the plurality of characters, the secondary search of the OFD document is automatically realized through the first encoding value and the encoding table of the target character, and the search accuracy and the search efficiency of the OFD document are improved.
Optionally, the second search module 403 is specifically configured to:
determining whether the first encoded value of the target character exists in the encoding table;
replacing the first coding value of the target character with the second coding value that corresponds to it in the coding table, when the first coding value exists in the coding table;
and carrying out secondary search on the OFD document based on the second coding value of the target character to obtain the second search result.
Optionally, the second search module 403 is further configured to:
matching the second coding values of the target characters with third coding values of all characters corresponding to the OFD document respectively;
under the condition that the second code value is successfully matched with the third code value of any character corresponding to the OFD document, determining that the second search result is successful in search;
and under the condition that the second coding value fails to match the third coding values of all characters corresponding to the OFD document, determining that the second search result is a search failure.
Optionally, the second search module 403 is further configured to:
searching the target character in the coding table;
determining that the first coding value of the target character exists in the coding table under the condition that the target character is found;
in the case that the target character is not found, determining that the first encoded value of the target character does not exist in the encoding table.
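The existence check and the subsequent replacement can be sketched as a lookup keyed by the character itself. This is a minimal sketch under assumptions: the `TableEntry` record, the list-based table, and the function names are hypothetical, and the source does not state how the coding table is stored.

```python
from typing import NamedTuple, Optional, Sequence


class TableEntry(NamedTuple):
    character: str    # the character as listed in the coding table
    first_code: int   # first coding value (Chinese character coding value)
    second_code: int  # second coding value (Kangxi radical coding value)


def find_entry(table: Sequence[TableEntry], target_char: str) -> Optional[TableEntry]:
    # Compare the target character with every character in the table.
    for entry in table:
        if entry.character == target_char:
            return entry
    return None


def second_code_for(table: Sequence[TableEntry], target_char: str) -> Optional[int]:
    entry = find_entry(table, target_char)
    if entry is None:
        # Character not found: its first coding value is not in the table.
        return None
    # Character found: its first coding value exists and can be replaced.
    return entry.second_code


example_table = [TableEntry("\u4e00", 0x4E00, 0x2F00)]
```

Returning `None` here stands for "the first coding value does not exist in the coding table", in which case no replacement is available for a secondary search.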
Optionally, the first search module 402 is specifically configured to:
matching the first coding values of the target characters with the third coding values of all characters corresponding to the OFD document respectively;
under the condition that the first coding value of the target character fails to match the third coding values of all characters corresponding to the OFD document, determining that the first search result is a search failure;
and under the condition that the first coding value of the target character is successfully matched with the third coding value of any character corresponding to the OFD document, determining that the first search result is successful in search.
Optionally, the character set formed by the plurality of characters included in the coding table is the Kangxi radical character set, the first coding value is a Chinese character coding value, and the second coding value is a Kangxi radical coding value.
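One possible way to obtain such a table is sketched below. It assumes that the coding values are Unicode code points and derives the table from the compatibility mappings of the Unicode Kangxi Radicals block (U+2F00 to U+2FD5); the source does not say how its coding table is built, so the function `build_coding_table` and this derivation are assumptions for illustration only.

```python
import unicodedata
from typing import Dict

KANGXI_START, KANGXI_END = 0x2F00, 0x2FD5  # Unicode Kangxi Radicals block


def build_coding_table() -> Dict[int, int]:
    """Map Chinese character code points to Kangxi radical code points."""
    table: Dict[int, int] = {}
    for radical_code in range(KANGXI_START, KANGXI_END + 1):
        radical = chr(radical_code)
        # NFKC applies the compatibility mapping of each radical, which
        # yields the corresponding unified ideograph.
        ideograph = unicodedata.normalize("NFKC", radical)
        if len(ideograph) == 1 and ideograph != radical:
            # Keep the first radical seen for a given ideograph.
            table.setdefault(ord(ideograph), radical_code)
    return table


coding_table = build_coding_table()
```

With this table, the common character U+4E00 is associated with the radical U+2F00, so a document that stores the radical form can still be found from a query typed with the ordinary character.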
Fig. 5 is a schematic diagram of the physical structure of an electronic device provided by the present invention. As shown in Fig. 5, the electronic device may include a processor 510, a communication interface (Communications Interface) 520, a memory 530, and a communication bus 540, wherein the processor 510, the communication interface 520, and the memory 530 communicate with each other through the communication bus 540. The processor 510 may invoke logic instructions in the memory 530 to perform an open format document OFD searching method, the method comprising: acquiring a search request for a target character in an OFD document, the search request indicating a first coding value of the target character; searching the OFD document based on the first coding value of the target character to obtain a first search result; and, if the first search result is a search failure, performing a secondary search on the OFD document based on the first coding value of the target character and a coding table to obtain a second search result; the coding table includes an association relationship between the first coding value and a second coding value for each of a plurality of characters.
Further, the logic instructions in the memory 530 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In another aspect, the present invention also provides a computer program product. The computer program product includes a computer program that can be stored on a non-transitory computer-readable storage medium; when the computer program is executed by a processor, the computer can execute the open format document OFD searching method provided above, the method comprising: acquiring a search request for a target character in an OFD document, the search request indicating a first coding value of the target character; searching the OFD document based on the first coding value of the target character to obtain a first search result; and, if the first search result is a search failure, performing a secondary search on the OFD document based on the first coding value of the target character and a coding table to obtain a second search result; the coding table includes an association relationship between the first coding value and a second coding value for each of a plurality of characters.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (6)
1. The open format document OFD searching method is characterized by comprising the following steps:
acquiring a search request for a target character in an OFD document; the search request is used for indicating a first code value of the target character; the target character has two different coding values;
searching the OFD document based on the first coding value of the target character to obtain a first search result;
under the condition that the first search result is a search failure, performing secondary search on the OFD document based on the first coding value and the coding table of the target character to obtain a second search result; the encoding table comprises an association relationship between the first encoding value and the second encoding value of each character in a plurality of characters; the character set formed by a plurality of characters included in the coding table is a character set of a Kangxi radical, the first coding value is a Chinese character coding value, and the second coding value is a Kangxi radical coding value; the coding table represents the mapping from the Chinese character codes of the common Chinese characters to the Kangxi radical codes;
the performing a secondary search on the OFD document based on the first coding value and the coding table of the target character to obtain a second search result comprises the following steps:
determining whether the first encoded value of the target character exists in the encoding table;
replacing the first coding value of the target character with the second coding value that corresponds to it in the coding table, when the first coding value exists in the coding table;
performing secondary search on the OFD document based on the second coding value of the target character to obtain a second search result;
said determining whether said first encoded value of said target character is present in said encoding table comprises:
searching the target character in the coding table; matching the target character with all characters in the coding table, and determining that the target character is found in the coding table under the condition that the target character is matched with any character in the coding table; under the condition that the target character is not matched with all characters in the coding table, determining that the target character is not found in the coding table;
determining that the first coding value of the target character exists in the coding table under the condition that the target character is found;
in the case that the target character is not found, determining that the first encoded value of the target character does not exist in the encoding table.
2. The open-layout document OFD search method according to claim 1, wherein the performing a secondary search on the OFD document based on the second code value of the target character to obtain the second search result includes:
matching the second coding values of the target characters with third coding values of all characters corresponding to the OFD document respectively;
under the condition that the second code value is successfully matched with the third code value of any character corresponding to the OFD document, determining that the second search result is successful in search;
and under the condition that the second coding value fails to match the third coding values of all characters corresponding to the OFD document, determining that the second search result is a search failure.
3. The method for searching the open layout document OFD according to claim 2, wherein searching the OFD document based on the first code value of the target character to obtain a first search result comprises:
matching the first coding values of the target characters with the third coding values of all characters corresponding to the OFD document respectively;
under the condition that the first coding value of the target character fails to match the third coding values of all characters corresponding to the OFD document, determining that the first search result is a search failure;
and under the condition that the first code value of the target character is successfully matched with the third code value of any character corresponding to the OFD document, determining that the first search result is successful in search.
4. An open-format document OFD search apparatus, comprising:
the acquisition module is used for acquiring a search request of target characters in the OFD document; the search request is used for indicating a first code value of the target character; the target character has two different coding values;
the first search module is used for searching the OFD document based on the first coding value of the target character to obtain a first search result;
the second search module is used for carrying out secondary search on the OFD document based on the first coding value and the coding table of the target character under the condition that the first search result is search failure, so as to obtain a second search result; the encoding table comprises an association relationship between the first encoding value and the second encoding value of each character in a plurality of characters; the character set formed by a plurality of characters included in the coding table is a character set of a Kangxi radical, the first coding value is a Chinese character coding value, and the second coding value is a Kangxi radical coding value; the coding table represents the mapping from the Chinese character codes of the common Chinese characters to the Kangxi radical codes;
the second search module is specifically configured to:
determining whether the first encoded value of the target character exists in the encoding table;
replacing the first coding value of the target character with the second coding value that corresponds to it in the coding table, when the first coding value exists in the coding table;
performing secondary search on the OFD document based on the second coding value of the target character to obtain a second search result;
the second search module is further configured to:
searching the target character in the coding table; matching the target character with all characters in the coding table, and determining that the target character is found in the coding table under the condition that the target character is matched with any character in the coding table; under the condition that the target character is not matched with all characters in the coding table, determining that the target character is not found in the coding table;
determining that the first coding value of the target character exists in the coding table under the condition that the target character is found;
in the case that the target character is not found, determining that the first encoded value of the target character does not exist in the encoding table.
5. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the open-layout document OFD search method according to any one of claims 1 to 3 when executing the program.
6. A non-transitory computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the open-layout document OFD search method according to any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311527407.5A CN117235345B (en) | 2023-11-16 | 2023-11-16 | Open format document OFD searching method and device and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117235345A (en) | 2023-12-15 |
CN117235345B (en) | 2024-03-26 |
Family
ID=89093456
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311527407.5A Active CN117235345B (en) | 2023-11-16 | 2023-11-16 | Open format document OFD searching method and device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117235345B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101630315A (en) * | 2008-07-16 | 2010-01-20 | 清华大学 | Quick retrieval method and system |
CN104679871A (en) * | 2015-03-06 | 2015-06-03 | 北京语言大学 | Chinese text searching method and Chinese text searching device |
CN107577755A (en) * | 2017-08-31 | 2018-01-12 | 江西博瑞彤芸科技有限公司 | A kind of searching method |
CN110489603A (en) * | 2019-07-30 | 2019-11-22 | 东软集团股份有限公司 | A kind of method for information retrieval, device and vehicle device |
CN111581228A (en) * | 2019-02-15 | 2020-08-25 | 北京无限光场科技有限公司 | Search method and device for correcting search condition, storage medium and electronic equipment |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US12118292B2 (en) * | 2021-06-27 | 2024-10-15 | John Zhongqi Wang | Method and device for sorting Chinese characters, searching Chinese characters and constructing dictionary |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||