Disclosure of Invention
The technical problem solved by the present disclosure is to provide a text search method, so as to at least partially solve the technical problem in the prior art that only same-type text search can be performed and previously stored text cannot be matched when a different type of text is used for the search. In addition, a text search apparatus, a text search hardware apparatus, a computer-readable storage medium, and a text search terminal are also provided.
In order to achieve the above object, according to one aspect of the present disclosure, the following technical solutions are provided:
a text search method, comprising:
receiving an input query keyword of a first text type;
performing query logic marking on the query keyword according to the current search scenario, and performing word segmentation on the marked query keyword; and
searching for text of a second text type corresponding to the query keyword according to the query logic and the word segmentation result, and returning a search result.
Further, performing query logic marking on the query keyword according to the current search scenario and performing word segmentation on the marked query keyword comprises:
adding characters corresponding to the query logic to the query keyword according to the current search scenario to form a query string; and
performing word segmentation on the query string.
Further, performing word segmentation on the query string comprises:
inputting the query string into a tokenizer, and obtaining at least one word segment from the tokenizer.
Further, inputting the query string into a tokenizer and obtaining at least one word segment from the tokenizer comprises:
creating and opening a first tokenizer, inputting the query string into the first tokenizer, determining through the first tokenizer that the current operation is the query logic, and returning a first word segment;
performing SQL keyword detection on the remaining query string, releasing the first tokenizer, creating and opening a second tokenizer, inputting the remaining query string into the second tokenizer, determining through the second tokenizer that the current operation is the query logic, and returning a second word segment; and
performing SQL keyword detection on the remaining query string, releasing the second tokenizer, and repeating similar operations until the query string has been fully processed, thereby obtaining a plurality of word segments.
Further, searching for the text of the second text type corresponding to the query keyword according to the query logic and the word segmentation result and returning a search result comprises:
constructing a query condition from the plurality of word segments and the SQL keywords; and
searching for the text of the second text type corresponding to the query keyword according to the query condition, and returning a search result.
Further, the query logic is initial-letter word segmentation logic.
Further, the method further comprises:
performing word segmentation on a keyword of the second text type to obtain the query keyword of the first text type.
Further, performing word segmentation on the keyword of the second text type to obtain the query keyword of the first text type comprises:
segmenting the keyword of the second text type into its characters to obtain the initial letter of each character; and
forming the query keyword of the first text type from the initial letters of the characters.
Further, forming the query keyword of the first text type from the initial letters of the characters comprises:
if a character is a polyphone, establishing a polyphone table, wherein the polyphone table comprises the correspondence between the character and the initial letter of each of its readings; and
looking up the initial letter of the character in the polyphone table, wherein the initial letters of the characters form at least one group of query keywords of the first text type.
In order to achieve the above object, according to still another aspect of the present disclosure, the following technical solutions are also provided:
a text search apparatus comprising:
a keyword receiving module, configured to receive an input query keyword of a first text type;
a word segmentation module, configured to perform query logic marking on the query keyword according to the current search scenario, and to perform word segmentation on the marked query keyword; and
a search module, configured to search for text of a second text type corresponding to the query keyword according to the query logic and the word segmentation result, and to return a search result.
Further, the word segmentation module comprises:
a marking unit, configured to add characters corresponding to the query logic to the query keyword according to the current search scenario to form a query string; and
a word segmentation unit, configured to perform word segmentation on the query string.
Further, the word segmentation unit is specifically configured to: input the query string into a tokenizer, and obtain at least one word segment from the tokenizer.
Further, the word segmentation unit is specifically configured to: create and open a first tokenizer, input the query string into the first tokenizer, determine through the first tokenizer that the current operation is the query logic, and return a first word segment; perform SQL keyword detection on the remaining query string, release the first tokenizer, create and open a second tokenizer, input the remaining query string into the second tokenizer, determine through the second tokenizer that the current operation is the query logic, and return a second word segment; and perform SQL keyword detection on the remaining query string, release the second tokenizer, and repeat similar operations until the query string has been fully processed, thereby obtaining a plurality of word segments.
Further, the search module is specifically configured to: construct a query condition from the plurality of word segments and the SQL keywords; and search for the text of the second text type corresponding to the query keyword according to the query condition, and return a search result.
Further, the query logic is initial-letter word segmentation logic.
Further, the word segmentation module is further configured to: perform word segmentation on a keyword of the second text type to obtain the query keyword of the first text type.
Further, the word segmentation module is specifically configured to: segment the keyword of the second text type into its characters to obtain the initial letter of each character; and form the query keyword of the first text type from the initial letters of the characters.
Further, the word segmentation module is specifically configured to: if a character is a polyphone, establish a polyphone table, wherein the polyphone table comprises the correspondence between the character and the initial letter of each of its readings; and look up the initial letter of the character in the polyphone table, wherein the initial letters of the characters form at least one group of query keywords of the first text type.
In order to achieve the above object, according to still another aspect of the present disclosure, the following technical solutions are also provided:
an electronic device, comprising:
a memory for storing non-transitory computer-readable instructions; and
a processor for executing the computer-readable instructions, such that the processor, when executing them, implements the steps of any of the text search methods described above.
In order to achieve the above object, according to still another aspect of the present disclosure, the following technical solutions are also provided:
a computer readable storage medium storing non-transitory computer readable instructions which, when executed by a computer, cause the computer to perform the steps recited in any of the text search method aspects above.
In order to achieve the above object, according to still another aspect of the present disclosure, the following technical solutions are also provided:
a text search terminal, comprising any one of the text search apparatuses described above.
According to the embodiments of the present disclosure, query logic marking is performed on the query keyword of the first text type according to the current search scenario, word segmentation is performed on the marked query keyword, text of the second text type corresponding to the query keyword is searched for according to the query logic and the word segmentation result, and a search result is returned. This solves the technical problem in the prior art that only same-type text search can be performed and previously stored text cannot be matched when a different type of text is used for the search.
The foregoing is only a summary of the technical solutions of the present disclosure, given to promote a clear understanding of its technical means; the present disclosure may also be embodied in other specific forms without departing from its spirit or essential attributes.
Detailed Description
The embodiments of the present disclosure are described below with specific examples, and other advantages and effects of the present disclosure will be readily apparent to those skilled in the art from the content of this specification. It is to be understood that the described embodiments are only some, and not all, of the embodiments of the disclosure. The disclosure may also be implemented or applied through other specific embodiments, and the details in this specification may be modified or changed without departing from the spirit of the disclosure. It is to be noted that the features in the following embodiments and examples may be combined with each other where no conflict arises. All other embodiments that a person skilled in the art can derive from the embodiments disclosed herein without creative effort shall fall within the protection scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the appended claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the disclosure, one skilled in the art should appreciate that one aspect described herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. Additionally, such an apparatus may be implemented and/or such a method may be practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present disclosure, and the drawings only show the components related to the present disclosure rather than the number, shape and size of the components in actual implementation, and the type, amount and ratio of the components in actual implementation may be changed arbitrarily, and the layout of the components may be more complicated.
In addition, in the following description, specific details are provided to facilitate a thorough understanding of the examples. However, it will be understood by those skilled in the art that the aspects may be practiced without these specific details.
In order to solve the technical problem in the prior art that only same-type text search can be performed and previously stored text cannot be matched when a different type of text is used for the search, an embodiment of the present disclosure provides a text search method. As shown in FIG. 1, the text search method mainly includes the following steps S1 to S3:
Step S1: receiving an input query keyword of a first text type.
To distinguish between the different text types herein, the text type that occurs first is defined as the first text type, and the text type that occurs subsequently is defined as the second text type.
The first text type may be pinyin characters, for example the full pinyin, or a combination of the pinyin initials, of the text to be searched.
Step S2: performing query logic marking on the query keyword according to the current search scenario, and performing word segmentation on the marked query keyword.
The query logic marking indicates that the current scenario is a search scenario. The corresponding query logic may be initial-letter logic, that is, the text contained in the query keyword consists of the initial letters of the text to be searched.
Specifically, the text input when the index is created and the query keyword are of different types: the text input at index creation is a name in Chinese characters, whereas the query keyword consists of pinyin initials. The query keyword therefore needs to be marked so that it can be determined whether the current scenario is a search scenario.
Step S3: searching for text of the second text type corresponding to the query keyword according to the query logic and the word segmentation result, and returning a search result.
The second text type may be characters, for example Chinese characters.
In this embodiment, query logic marking is performed on the query keyword of the first text type according to the current search scenario, word segmentation is performed on the marked query keyword, text of the second text type corresponding to the query keyword is searched for according to the query logic and the word segmentation result, and a search result is returned. This solves the technical problem in the prior art that only same-type text search can be performed and previously stored text cannot be matched when a different type of text is used for the search.
In an optional embodiment, step S2 specifically includes:
Step S21: adding characters corresponding to the query logic to the query keyword according to the current search scenario to form a query string.
Specifically, a magic string indicating the search scenario may be appended to the query keyword, so that a search can be distinguished from other operations. For example, &QUERY may be used as the magic string. If the query keyword is xq, the resulting query string is xq&QUERY.
Step S22: performing word segmentation on the query string.
In an optional embodiment, step S22 specifically includes:
inputting the query string into a tokenizer, and obtaining at least one word segment from the tokenizer.
For example, if the query string is xq&QUERY, the word segments obtained from the tokenizer are x and q.
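The marking and segmentation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the marker is written as `&QUERY`, the function names `mark_query` and `segment` are hypothetical, and a string without the marker is simply returned as one segment.

```python
from typing import List

QUERY_MARKER = "&QUERY"  # magic string from the example; exact spelling is an assumption


def mark_query(keyword: str, is_search: bool = True) -> str:
    """Append the marker when the current scenario is a search."""
    return keyword + QUERY_MARKER if is_search else keyword


def segment(query_string: str) -> List[str]:
    """Split a marked query string into single-letter word segments."""
    if not query_string.endswith(QUERY_MARKER):
        # No marker: not a search scenario, keep the string as one segment.
        return [query_string] if query_string else []
    body = query_string[: -len(QUERY_MARKER)]
    return list(body)  # initial-letter logic: one letter per segment


query_string = mark_query("xq")
print(query_string)           # xq&QUERY
print(segment(query_string))  # ['x', 'q']
```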
Further, inputting the query string into a tokenizer and obtaining at least one word segment from the tokenizer includes:
creating and opening a first tokenizer, inputting the query string into the first tokenizer, determining through the first tokenizer that the current operation is the query logic, and returning a first word segment;
performing SQL keyword detection on the remaining query string, releasing the first tokenizer, creating and opening a second tokenizer, inputting the remaining query string into the second tokenizer, determining through the second tokenizer that the current operation is the query logic, and returning a second word segment; and
performing SQL keyword detection on the remaining query string, releasing the second tokenizer, and repeating similar operations until the query string has been fully processed, thereby obtaining a plurality of word segments.
An SQL keyword may be a word indicating a connection relation between terms, such as OR, AND, NEAR, or the like.
Specifically, in this embodiment, the tokenizer may be implemented by writing five function pointers, as follows:
Create: creates a tokenizer instance.
Destroy: destroys a previously created tokenizer instance.
Open: called when word segmentation of a query string begins, to perform the corresponding preparation work.
Next: returns the next word segment of the current query string, and records inside the tokenizer the position reached.
Close: ends word segmentation of the current query string.
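The five operations can be pictured as the methods of a small class. The sketch below is illustrative only: the class name `InitialTokenizer`, the marker spelling `&QUERY`, and the fallback behaviour for unmarked strings are assumptions, and a single instance is used here to emit every segment for brevity.

```python
from typing import List, Optional

QUERY_MARKER = "&QUERY"  # assumed spelling of the search marker


class InitialTokenizer:
    """Hypothetical tokenizer exposing Create, Destroy, Open, Next and Close."""

    def __init__(self) -> None:
        self._text = ""
        self._pos = 0
        self._is_search = False

    @classmethod
    def create(cls) -> "InitialTokenizer":
        # Create: build a new tokenizer instance.
        return cls()

    def destroy(self) -> None:
        # Destroy: drop all state held by the instance.
        self.close()

    def open(self, query_string: str) -> None:
        # Open: prepare to segment one query string.
        self._is_search = query_string.endswith(QUERY_MARKER)
        if self._is_search:
            self._text = query_string[: -len(QUERY_MARKER)]
        else:
            self._text = query_string
        self._pos = 0

    def next(self) -> Optional[str]:
        # Next: return the next word segment and record the position reached.
        if self._pos >= len(self._text):
            return None
        if self._is_search:
            token = self._text[self._pos]   # initial-letter logic: one letter
            self._pos += 1
        else:
            token = self._text[self._pos:]  # otherwise the rest as one segment
            self._pos = len(self._text)
        return token

    def close(self) -> None:
        # Close: end segmentation of the current query string.
        self._text = ""
        self._pos = 0
        self._is_search = False


tokenizer = InitialTokenizer.create()
tokenizer.open("xq&QUERY")
tokens: List[str] = []
token = tokenizer.next()
while token is not None:
    tokens.append(token)
    token = tokenizer.next()
tokenizer.close()
tokenizer.destroy()
print(tokens)  # ['x', 'q']
```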
For example, if the query string is xq&QUERY, the tokenizer sees &QUERY, treats the current scenario as a search scenario, and applies the initial-letter word segmentation logic. The key events in the search are as follows:
Create and Open a new tokenizer and pass in xq&QUERY; Next is called, the tokenizer finds that this is a search, applies the initial-letter word segmentation logic, and returns x; because a word segment has been returned, the query string is truncated at the position of the word segment just returned, leaving q&QUERY; SQL keyword detection is performed;
Close the previous tokenizer, that is, release it;
Create and Open a new tokenizer and pass in q&QUERY;
Next is called and returns q; because a word segment (token) has been returned, the query string is truncated at the position of the token just returned, leaving &QUERY; SQL keyword detection is performed;
Close the previous tokenizer, that is, release it;
Create and Open a new tokenizer and pass in &QUERY;
Next is called and no word segment is returned;
processing of the query string is complete, and the word segments finally returned are x and q.
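The sequence of events above amounts to a loop that takes one word segment per tokenizer, truncates the query string, and checks for an SQL keyword before continuing. The sketch below imitates that loop under assumptions: `next_segment` stands in for a fresh Create, Open, Next and Close cycle, the marker is written `&QUERY`, and the way a detected keyword is skipped is a guess rather than something the text specifies.

```python
from typing import List, Optional, Tuple

QUERY_MARKER = "&QUERY"               # assumed spelling of the marker
SQL_KEYWORDS = ("OR", "AND", "NEAR")  # connectives named in the text


def next_segment(remaining: str) -> Optional[str]:
    """Stand-in for Create + Open + Next + Close on a fresh tokenizer."""
    if not remaining.endswith(QUERY_MARKER):
        return None  # this sketch only handles the search scenario
    body = remaining[: -len(QUERY_MARKER)]
    return body[0] if body else None  # initial-letter logic: one letter per call


def detect_sql_keyword(remaining: str) -> Optional[str]:
    """Return an SQL keyword found at the start of the remaining string, if any."""
    for keyword in SQL_KEYWORDS:
        if remaining.startswith(keyword + " "):
            return keyword
    return None


def segment_query(query_string: str) -> Tuple[List[str], List[str]]:
    """Return (word segments, SQL keywords) for a marked query string."""
    segments: List[str] = []
    keywords: List[str] = []
    remaining = query_string
    while True:
        segment = next_segment(remaining)
        if segment is None:
            break  # no word segment returned: processing is complete
        segments.append(segment)
        remaining = remaining[len(segment):]  # truncate past the returned segment
        keyword = detect_sql_keyword(remaining)
        if keyword is not None:
            keywords.append(keyword)
            remaining = remaining[len(keyword):].lstrip()
    return segments, keywords


print(segment_query("xq&QUERY"))  # (['x', 'q'], [])
```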
In an optional embodiment, step S3 specifically includes:
Step S31: constructing a query condition from the plurality of word segments and the SQL keywords.
Step S32: searching for the text of the second text type corresponding to the query keyword according to the query condition, and returning a search result.
Specifically, taking the query string xq&QUERY as an example, the query condition in effect becomes: records containing both the word segment x and the word segment q. In this way, the corresponding text of the second text type, such as Chinese characters, can be found.
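One way to picture the query condition is a containment test against an index of initials. The following sketch is an assumption-laden illustration: the in-memory `INDEX`, its sample names, and the choice of AND as the joining keyword are hypothetical and are not taken from the disclosure.

```python
from typing import Dict, List, Set

# Hypothetical index built when records are stored: each Chinese name is
# mapped to the set of pinyin initials of its characters.
INDEX: Dict[str, Set[str]] = {
    "小琴": {"x", "q"},
    "小明": {"x", "m"},
    "王琴": {"w", "q"},
}


def build_condition(segments: List[str], sql_keyword: str = "AND") -> str:
    """Join the word segments with an SQL keyword, for example 'x AND q'."""
    return f" {sql_keyword} ".join(segments)


def search(segments: List[str]) -> List[str]:
    """Return the records whose indexed initials contain every word segment."""
    wanted = set(segments)
    return [name for name, initials in INDEX.items() if wanted <= initials]


print(build_condition(["x", "q"]))  # x AND q
print(search(["x", "q"]))           # ['小琴']
```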
In an optional embodiment, the method of the present disclosure further includes:
performing word segmentation on a keyword of the second text type to obtain the query keyword of the first text type.
The second text type is characters, for example Chinese characters, and the first text type is the corresponding pinyin, for example the initial letters or the full pinyin of the Chinese characters.
In an optional embodiment, performing word segmentation on the keyword of the second text type to obtain the query keyword of the first text type includes:
segmenting the keyword of the second text type into its characters to obtain the initial letter of each character; and
forming the query keyword of the first text type from the initial letters of the characters.
For example, if the keyword of the second text type is 小明 (xiaoming), the query keyword of the first text type obtained after word segmentation is xm.
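As a rough illustration of this step, the initials can be read off a character-to-pinyin table. The table `PINYIN` below holds only the two characters needed for the example and is a hypothetical stand-in for whatever pinyin data an implementation would actually use.

```python
from typing import Dict

# Hypothetical table mapping a few Chinese characters to their full pinyin.
PINYIN: Dict[str, str] = {"小": "xiao", "明": "ming"}


def initials_keyword(text: str) -> str:
    """Concatenate the first pinyin letter of each character."""
    return "".join(PINYIN[ch][0] for ch in text)


print(initials_keyword("小明"))  # xm
```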
In an optional embodiment, forming the query keyword of the first text type from the initial letters of the characters includes:
if a character is a polyphone, establishing a polyphone table, wherein the polyphone table includes the correspondence between the character and the initial letter of each of its readings; and
looking up the initial letter of the character in the polyphone table, wherein the initial letters of the characters form at least one group of query keywords of the first text type.
For example, if the keyword of the second text type is 小茜 and the character 茜 (madder) is a polyphone that may be read qian or xi, the polyphone table constructed for 茜 contains the initials q and x, and by looking up the polyphone table the query keywords of the first text type corresponding to 小茜 are obtained as xq or xx.
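The polyphone handling can be sketched as a Cartesian product over the candidate initials of each character. The two tables below are hypothetical and contain only what the example needs; the readings qian and xi for 茜 are the ones named in the text.

```python
from itertools import product
from typing import Dict, List

# Hypothetical polyphone table: character -> initial of each reading.
POLYPHONE_INITIALS: Dict[str, List[str]] = {"茜": ["q", "x"]}  # qian / xi
# Hypothetical table for characters with a single reading.
SINGLE_INITIALS: Dict[str, str] = {"小": "x"}


def initial_keywords(text: str) -> List[str]:
    """Return every initials keyword that the readings of `text` can form."""
    choices = [
        POLYPHONE_INITIALS[ch] if ch in POLYPHONE_INITIALS else [SINGLE_INITIALS[ch]]
        for ch in text
    ]
    return ["".join(combo) for combo in product(*choices)]


print(initial_keywords("小茜"))  # ['xq', 'xx']
```

Indexing a record under every keyword returned here means that either reading typed by the user can match it.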
It will be appreciated by those skilled in the art that obvious modifications (e.g., combinations of the enumerated modes) or equivalents may be made to the above-described embodiments.
Although the steps in the text search method embodiments are described above in the given order, it should be clear to those skilled in the art that the steps in the embodiments of the present disclosure are not necessarily performed in that order, and may also be performed in reverse, in parallel, interleaved, or in another order. Moreover, those skilled in the art may add other steps on the basis of the above steps. Such obvious modifications or equivalents also fall within the protection scope of the present disclosure and are not described again here.
For convenience of description, only the parts relevant to the embodiments of the present disclosure are shown in the following apparatus embodiments; for specific technical details that are not disclosed, refer to the method embodiments of the present disclosure.
In order to solve the technical problem in the prior art that only same-type text search can be performed and previously stored text cannot be matched when a different type of text is used for the search, an embodiment of the present disclosure provides a text search apparatus. The apparatus may perform the steps of the text search method embodiments described above. As shown in FIG. 2, the apparatus mainly includes a keyword receiving module 21, a word segmentation module 22, and a search module 23, wherein:
the keyword receiving module 21 is configured to receive an input query keyword of a first text type;
the word segmentation module 22 is configured to perform query logic marking on the query keyword according to the current search scenario, and to perform word segmentation on the marked query keyword; and
the search module 23 is configured to search for text of the second text type corresponding to the query keyword according to the query logic and the word segmentation result, and to return a search result.
Further, the word segmentation module 22 includes a marking unit 221 and a word segmentation unit 222, wherein:
the marking unit 221 is configured to add characters corresponding to the query logic to the query keyword according to the current search scenario to form a query string; and
the word segmentation unit 222 is configured to perform word segmentation on the query string.
Further, the word segmentation unit 222 is specifically configured to: input the query string into a tokenizer, and obtain at least one word segment from the tokenizer.
Further, the word segmentation unit 222 is specifically configured to: create and open a first tokenizer, input the query string into the first tokenizer, determine through the first tokenizer that the current operation is the query logic, and return a first word segment; perform SQL keyword detection on the remaining query string, release the first tokenizer, create and open a second tokenizer, input the remaining query string into the second tokenizer, determine through the second tokenizer that the current operation is the query logic, and return a second word segment; and perform SQL keyword detection on the remaining query string, release the second tokenizer, and repeat similar operations until the query string has been fully processed, thereby obtaining a plurality of word segments.
Further, the search module 23 is specifically configured to: construct a query condition from the plurality of word segments and the SQL keywords; and search for the text of the second text type corresponding to the query keyword according to the query condition, and return a search result.
Further, the query logic is initial-letter word segmentation logic.
Further, the word segmentation module 22 is further configured to: perform word segmentation on a keyword of the second text type to obtain the query keyword of the first text type.
Further, the word segmentation module 22 is specifically configured to: segment the keyword of the second text type into its characters to obtain the initial letter of each character; and form the query keyword of the first text type from the initial letters of the characters.
Further, the word segmentation module 22 is specifically configured to: if a character is a polyphone, establish a polyphone table, wherein the polyphone table includes the correspondence between the character and the initial letter of each of its readings; and look up the initial letter of the character in the polyphone table, wherein the initial letters of the characters form at least one group of query keywords of the first text type.
For detailed descriptions of the working principle, the realized technical effect, and the like of the embodiment of the text search device, reference may be made to the related descriptions in the foregoing text search method embodiment, and further description is omitted here.
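The module structure of the apparatus can be pictured as three cooperating classes. The sketch below is only one possible arrangement under assumptions: the class names mirror modules 21 to 23 and units 221 and 222, while the marker spelling `&QUERY`, the in-memory index, and the wrapper class `TextSearchApparatus` are hypothetical.

```python
from typing import Dict, List, Set

QUERY_MARKER = "&QUERY"  # assumed spelling of the search marker


class KeywordReceivingModule:  # module 21
    def receive(self, raw_input: str) -> str:
        return raw_input.strip()


class WordSegmentationModule:  # module 22
    def mark(self, keyword: str) -> str:  # marking unit 221
        return keyword + QUERY_MARKER

    def segment(self, query_string: str) -> List[str]:  # word segmentation unit 222
        if not query_string.endswith(QUERY_MARKER):
            return [query_string] if query_string else []
        return list(query_string[: -len(QUERY_MARKER)])


class SearchModule:  # module 23
    def __init__(self, index: Dict[str, Set[str]]) -> None:
        self.index = index  # hypothetical: record -> set of pinyin initials

    def search(self, segments: List[str]) -> List[str]:
        wanted = set(segments)
        return [text for text, initials in self.index.items() if wanted <= initials]


class TextSearchApparatus:
    """Hypothetical wrapper wiring the three modules together."""

    def __init__(self, index: Dict[str, Set[str]]) -> None:
        self.receiver = KeywordReceivingModule()
        self.segmenter = WordSegmentationModule()
        self.searcher = SearchModule(index)

    def run(self, raw_input: str) -> List[str]:
        keyword = self.receiver.receive(raw_input)
        segments = self.segmenter.segment(self.segmenter.mark(keyword))
        return self.searcher.search(segments)


apparatus = TextSearchApparatus({"小琴": {"x", "q"}, "小明": {"x", "m"}})
print(apparatus.run("xq"))  # ['小琴']
```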
Referring now to FIG. 3, shown is a schematic diagram of an electronic device suitable for use in implementing embodiments of the present disclosure. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., car navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 3 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 3, the electronic device may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 301 that may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)302 or a program loaded from a storage device 308 into a Random Access Memory (RAM) 303. In the RAM 303, various programs and data necessary for the operation of the electronic apparatus are also stored. The processing device 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to bus 304.
Generally, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touch pad, keyboard, mouse, image sensor, microphone, accelerometer, gyroscope, etc.; output devices 307 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage devices 308 including, for example, magnetic tape, hard disk, etc.; and a communication device 309. The communication device 309 may allow the electronic device to communicate wirelessly or by wire with other devices to exchange data. While FIG. 3 illustrates an electronic device having various devices, it is to be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 309, or installed from the storage device 308, or installed from the ROM 302. When executed by the processing device 301, the computer program performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receiving an input query keyword of a first text type; carrying out query logic marking processing on the query keywords according to the current search scene, and carrying out word segmentation processing on the marked query keywords; and searching the text of the second text type corresponding to the query key words according to the query logic and the word segmentation result, and returning the search result.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages or any combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases the name of a unit does not constitute a limitation on the unit itself; for example, the marking unit may also be described as "a unit for adding characters corresponding to the query logic to the query keyword to form a query string".
The foregoing description is merely a description of preferred embodiments of the present disclosure and of the technical principles employed. Those skilled in the art will appreciate that the scope of the disclosure is not limited to technical solutions formed by the particular combination of the features described above, and also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the concept of the disclosure, for example, technical solutions formed by replacing the above features with (but not limited to) features having similar functions disclosed in the present disclosure.