GB2572542A - System and method for providing suggestions for completing user query - Google Patents

Info

Publication number
GB2572542A
Authority
GB
United Kingdom
Prior art keywords
entity
name
suggestion
user query
entity type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1804910.6A
Other versions
GB201804910D0 (en
Inventor
Agarwal Vatsal
Bolla Abhilash
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Innoplexus AG
Original Assignee
Innoplexus AG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Innoplexus AG
Priority to GB1804910.6A
Publication of GB201804910D0
Publication of GB2572542A
Legal status: Withdrawn

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/9032Query formulation
    • G06F16/90324Query formulation using system suggestions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/9032Query formulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A system and a method for providing at least one suggestion for completing a user query. The system comprises a database arrangement operable to store a name corpus and an ontology, and a processing module communicably coupled to the database arrangement. The processing module is operable to: receive the user query having at least one entity unit; identify an entity type of the at least one entity unit based on the ontology, wherein an entity unit having a predefined signature is identified as a name entity type; determine the at least one suggestion for the entity unit having the name entity type, using the name corpus, wherein the name corpus comprises a plurality of names arranged in the form of an ordered tree data structure; and provide the at least one suggestion for completing the user query. The predefined signature may comprise at least one predefined character within the at least one entity unit. The name corpus may comprise names of authors. The processing module may be further operable to provide affiliations related to the at least one suggestion for the entity units associated with the name entity type.

Description

SYSTEM AND METHOD FOR PROVIDING SUGGESTIONS FOR COMPLETING USER QUERY
TECHNICAL FIELD
The present disclosure relates generally to data processing; and more specifically, to systems that provide suggestions for completing user queries. Furthermore, the present disclosure relates to methods for providing suggestions for completing user queries. Moreover, the present disclosure also relates to a computer readable medium containing program instructions for execution on a computer system, which when executed by a computer, cause the computer to perform method steps of providing suggestions for completing a user query.
BACKGROUND
In recent decades, technology has advanced exponentially. In particular, technology directly or indirectly related to the Internet has shown exceptional progress in recent times. Typically, a user uses the Internet to extract information. In such a case, the information is stored in a plurality of databases consisting of billions of terabytes of digital data. Moreover, with advancements in technology, information retrieval has become more streamlined and systematic. Accordingly, the user uses a search engine to extract information from the Internet. Typically, the user feeds a user query into the search engine for extracting the relevant information therefrom. In this regard, the user query consists of one or more strings of characters in the form of keywords that are fed to the search engine. The keywords provided to the search engine are analysed to display search results to the user.
Conventionally, the search engine is optimized to provide the user with at least one autocomplete suggestion for completing the user query. In an example, the search engine may analyse the user's browser history to identify previous browsing data. In such an example, the search engine is further operable to provide the autocomplete suggestions based on the previous browsing data. However, in some instances, the user's browser history may not be accessible. For example, if the user is conducting a search via a private browsing window, the user's browser history may not be accessible for providing relevant autocomplete suggestions. Moreover, there is no provision in the search engine for providing an autocomplete suggestion based on a user query formed of a partial name (namely, for searching a name of an individual). In one instance, the user may proceed with the partial query and then has to browse through the entire search results to extract the relevant information. In another instance, the user may proceed by completing the partial name based query. In such an instance, manual effort is required from the user to complete the aforesaid partial name based query. Such manual effort is time-consuming and thereby makes the process of extracting the information cumbersome and inefficient. Moreover, if the user is not able to accurately recall the complete name of the individual, the search results may include irrelevant information.
Therefore, in light of the foregoing discussion, there exists a need to overcome the aforementioned drawbacks associated with the existing techniques of providing suggestions for completing the user query.
SUMMARY
The present disclosure seeks to provide a system that provides at least one suggestion for completing the user query. The present disclosure also seeks to provide a method of providing at least one suggestion for completing a user query. The present disclosure also seeks to provide a computer readable medium containing program instructions for execution on a computer system, which when executed by a computer, cause the computer to perform method steps for providing at least one suggestion for completing a user query. The present disclosure seeks to provide a solution to the existing problem of conventional autocomplete techniques that provide irrelevant and sub-optimal suggestions for completing user queries. An aim of the present disclosure is to provide a solution that overcomes at least partially the problems encountered in prior art, and provides an efficient and reliable system and method for providing at least one suggestion for completing user queries.
In one aspect, an embodiment of the present disclosure provides a system that provides at least one suggestion for completing a user query, wherein the system includes a computer system, characterized in that the system comprises:
- a database arrangement operable to store a name corpus and an ontology; and
- a processing module communicably coupled to the database arrangement, the processing module operable to:
- receive the user query having at least one entity unit;
- identify an entity type of the at least one entity unit based on the ontology, wherein an entity unit having a predefined signature is identified as a name entity type;
- determine the at least one suggestion for the entity unit, having the name entity type, using the name corpus, wherein the name corpus comprises a plurality of names arranged in the form of an ordered tree data structure; and
- provide the at least one suggestion for completing the user query.
In a second aspect, an embodiment of the present disclosure provides a method for providing at least one suggestion for completing a user query, wherein the method includes using a computer system, characterized in that the method comprises:
- receiving the user query having at least one entity unit;
- identifying an entity type of the at least one entity unit based on an ontology, wherein an entity unit having a predefined signature is identified as a name entity type;
- determining the at least one suggestion for the entity unit, having the name entity type, using a name corpus, wherein the name corpus comprises a plurality of names arranged in the form of an ordered tree data structure; and
- providing the at least one suggestion for completing the user query.
In a third aspect, an embodiment of the present disclosure provides a computer readable medium containing program instructions for execution on a computer system, which when executed by a computer, cause the computer to perform method steps for providing at least one suggestion for completing a user query, the method comprising the steps of:
- receiving the user query having at least one entity unit;
- identifying an entity type of the at least one entity unit based on an ontology, wherein an entity unit having a predefined signature is identified as a name entity type;
- determining the at least one suggestion for the entity unit, having the name entity type, using a name corpus, wherein the name corpus comprises a plurality of names arranged in the form of an ordered tree data structure; and
- providing the at least one suggestion for completing the user query.
Embodiments of the present disclosure substantially eliminate or at least partially address the aforementioned problems in the prior art, and enable an efficient, effective, seamless and optimal system for providing at least one relevant suggestion for completing a user query having the at least one entity unit with the name entity type.
Additional aspects, advantages, features and objects of the present disclosure would be made apparent from the drawings and the detailed description of the illustrative embodiments construed in conjunction with the appended claims that follow.
It will be appreciated that features of the present disclosure are susceptible to being combined in various combinations without departing from the scope of the present disclosure as defined by the appended claims.
BRIEF DESCRIPTION OF THE DRAWINGS
The summary above, as well as the following detailed description of illustrative embodiments, is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the present disclosure, exemplary constructions of the disclosure are shown in the drawings. However, the present disclosure is not limited to the specific methods and instrumentalities disclosed herein. Moreover, those skilled in the art will understand that the drawings are not to scale. Wherever possible, like elements have been indicated by identical numbers.
Embodiments of the present disclosure will now be described, by way of example only, with reference to the following diagrams wherein:
FIG. 1 is a block diagram of a system that provides at least one suggestion for completing a user query, in accordance with an embodiment of the present disclosure; and
FIG. 2 is an illustration of steps of a method for providing at least one suggestion for completing a user query, in accordance with an embodiment of the present disclosure.
In the accompanying drawings, an underlined number is employed to represent an item over which the underlined number is positioned or an item to which the underlined number is adjacent. A non-underlined number relates to an item identified by a line linking the non-underlined number to the item. When a number is non-underlined and accompanied by an associated arrow, the non-underlined number is used to identify a general item at which the arrow is pointing.
DETAILED DESCRIPTION OF EMBODIMENTS
In overview, embodiments of the present disclosure are concerned with providing suggestions related to the user query, and specifically with analysing the context of the user query and providing suggestions for completion thereof.
The following detailed description illustrates embodiments of the present disclosure and ways in which they can be implemented. Although some modes of carrying out the present disclosure have been disclosed, those skilled in the art would recognize that other embodiments for carrying out or practicing the present disclosure are also possible.
In one aspect, an embodiment of the present disclosure provides a system that provides at least one suggestion for completing a user query, wherein the system includes a computer system, the system comprising:
- a database arrangement operable to store a name corpus and an ontology; and
- a processing module communicably coupled to the database arrangement, the processing module operable to:
- receive the user query having at least one entity unit;
- identify an entity type of the at least one entity unit based on the ontology, wherein an entity unit having a predefined signature is identified as a name entity type;
- determine the at least one suggestion for the entity unit, having the name entity type, using the name corpus, wherein the name corpus comprises a plurality of names arranged in the form of an ordered tree data structure; and
- provide the at least one suggestion for completing the user query.
In another aspect, an embodiment of the present disclosure provides a method of providing at least one suggestion for completing a user query, wherein the method includes using a computer system, the method comprising:
- receiving the user query having at least one entity unit;
- identifying an entity type of the at least one entity unit based on an ontology, wherein an entity unit having a predefined signature is identified as a name entity type;
- determining the at least one suggestion for the entity unit, having the name entity type, using a name corpus, wherein the name corpus comprises a plurality of names arranged in the form of an ordered tree data structure; and
- providing the at least one suggestion for completing the user query.
The present disclosure provides the system and the method for providing suggestions for completing user queries. The system makes it possible to provide suggestions for user queries that are unrelated to information previously accessed by users. Thus, the system can provide relevant suggestions for completing the user query even when information associated with the user's browser history is unavailable. The described system reduces the effort exerted by the user by providing fast and relevant suggestions for completing the user queries. Additionally, the aforementioned system provides at least one suggestion for a name based user query (for searching a particular person). Beneficially, such at least one suggestion allows the user to extract information about the desired person in a quick and efficient manner. Furthermore, the system provides the at least one suggestion to the user even if the user is unaware of the complete name of the person, thereby saving the time the user would otherwise spend completing the query himself/herself. Therefore, extra manual effort is not required from the user. Beneficially, the described system enables an efficient and systematic approach to auto-suggestion for the incomplete user query, giving the most relevant information from the name corpus.
The computer system relates to at least one computing unit comprising a central storage system, processing units and various peripheral devices. Optionally, the computer system relates to an arrangement of interconnected computing units, wherein each computing unit in the computer system operates independently and may communicate with other external devices and other computing units in the computer system.
The term "system that provides" is used interchangeably with the term "system for providing", wherever appropriate, i.e. whenever one such term is used it also encompasses the other term.
Throughout the present disclosure, the term user query relates to a string of words (namely, one or more keywords, key-phrases, sentences and so forth) provided by a user in order to extract relevant information. Moreover, the relevant information pertains to a field of the user's interest. Furthermore, the user query may be partially complete. The user provides the user query, comprising the one or more keywords, to retrieve data associated with the domain of user-interest. In a first example, a user may have names of individuals related to the field of patents as the domain of user-interest. Subsequently, the user may provide @ABC as the user query. In such an instance, the user query @ABC is received, wherein the user query relates to the domain of user-interest associated with names of individuals related to the field of patents. Moreover, the processing module is operable to analyse the user query based on the context of the elements included therein. Specifically, the user query is a name based user query. More specifically, the user query provided by the user comprises the name of at least one individual. It will be appreciated that the name based user query is provided by the user in order to extract relevant information about a specific individual. Optionally, the name based user query may be partially complete and may need one or more words for completion thereof. Additionally, the user query is in text format. Optionally, the user query may be provided using a command prompt (cmd), a user interface (UI) and so forth.
As mentioned previously, the method comprises receiving the user query. Specifically, the user query is provided to the processing module via a communication module. The processing module is operable to receive the user query as an input and provide the at least one suggestion for completing the user query.
Throughout the present disclosure, the term at least one entity unit used herein relates to an element used to form the user query. Optionally, the at least one entity unit is typically written with a space on either side. In this regard, the at least one entity unit constitutes the user query.
Throughout the present disclosure, the term database arrangement as used herein relates to an organized body of digital information regardless of the manner in which data or an organized body thereof is represented. Optionally, the database arrangement may be hardware, software, firmware and/or any combination thereof. For example, the organized body of related data may be in the form of a table, a map, a grid, a packet, a datagram, a file, a document, a list or in any other form. The database arrangement includes any data storage software and systems, such as, for example, a relational database like IBM DB2 and Oracle 9. Furthermore, the database arrangement refers to the software program for creating and managing one or more databases. Optionally, the database arrangement is operable to support relational operations, regardless of whether it enforces strict adherence to the relational model, as understood by those of ordinary skill in the art. Additionally, optionally, the database arrangement is populated by data elements. Furthermore, the terms data elements, data records, bits of data and cells are used interchangeably herein and are all intended to mean information stored in cells of a database. Optionally, the database arrangement is operable to store user-specific content.
Furthermore, as mentioned previously, the system for providing at least one suggestion for completing the user query comprises the processing module communicably coupled to the database arrangement. The processing module is operable to receive the user query through the communication module.
Throughout the present disclosure, the term processing module used herein relates to a computational element that is operable to respond to and process instructions that carry out the method. Optionally, the processing module includes, but is not limited to, a microprocessor, a microcontroller, a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, or any other type of processing circuit. Furthermore, optionally, the processing module may refer to one or more individual processors, processing devices and various elements associated with a processing device that may be shared by other processing devices. Additionally, the one or more individual processors, processing devices and elements are arranged in various architectures for responding to and processing the instructions that drive the system. In such a case, the communication module enables an exchange of the user query.
Optionally, the processing module may be a computer-implemented module. More optionally, the user query may be provided by means of a graphical user interface (GUI), command line (cmd), drag and drop, and so forth.
As mentioned previously, the method for providing at least one suggestion for completing the user query comprises identifying the entity type of the at least one entity unit based on the ontology, wherein the entity unit having the predefined signature is identified as the name entity type. Specifically, the processing module is operable to identify the entity type of the at least one entity unit based on the ontology, wherein the entity unit having the predefined signature is identified as the name entity type.
Throughout the present disclosure, the term ontology relates to a set of concepts (namely, information, ideas, data, semantic associations and so forth) in a field (namely, subject area, domain and so forth) that comprises types and properties of the set of entities, concepts and semantic association thereof. Moreover, the ontology provides a structured, optimal and relevant set of concepts pertaining to the user's field of interest. Furthermore, the ontology may be used in scientific research, academic studies, market analysis and so forth. Optionally, the ontology may include concepts in form of text, image, audio, video, or any combination thereof. Additionally, the ontology may provide information on how a certain entity in a certain field may be associated with one or more entities in multiple fields.
Throughout the present disclosure, the term entity type used herein relates to a specific field with which the at least one entity unit is associated. Optionally, the at least one entity unit is classified into at least one entity type based on the contextual meaning thereof. The entity type comprises the name entity type. The name entity type is determined based on the predefined signature present within the at least one entity unit.
Throughout the present disclosure, the term predefined signature used herein relates to an arrangement of at least one element (such as special characters, words) within the at least one entity unit.
In an example, the processing module may use an algorithm, having the predefined signature stored therein, to identify the entity type of the at least one entity unit. In such an example, the processing module compares each entity unit with the predefined signature. In this regard, when an entity unit matches the predefined signature, the entity type of the entity unit is determined as the name entity type.
Optionally, the predefined signature comprises at least one predefined character within the at least one entity unit. Optionally, the predefined character includes a set of the plurality of special characters. Alternatively, optionally, the predefined character includes a single special character.
Examples of such special characters may include, but are not limited to, the character @. In one embodiment, the predefined signature includes the special character at a starting position of the at least one entity unit. In another embodiment, the predefined signature includes the special character at a last position of the at least one entity unit. In yet another embodiment, the predefined signature includes a special character present at any position within the at least one entity unit. In various other embodiments, the predefined signature includes multiple special characters present at different positions within the at least one entity unit. Furthermore, optionally, such multiple special characters can be arranged at predefined positions. Alternatively, optionally, such multiple special characters can be arranged at any random positions.
Throughout the present disclosure, the term name entity type used herein relates to a field comprising proper nouns associated with a person. In other words, the name entity type relates to names of individuals.
In an example, the processing module receives the user query @leo-smith from the communication module, wherein @leo-smith is an entity unit. In such an example, the predefined signature is defined as @XXX, wherein @ is a special character followed by the rest of the entity unit XXX. Subsequently, the processing module compares the entity unit @leo-smith with the predefined signature @XXX and determines that the entity unit starts with the special character @. Thereafter, the processing module identifies the entity type of the entity unit @leo-smith as the name entity type.
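The signature check in the example above can be sketched as follows. This is a minimal illustration only, assuming the signature is the character @ at the starting position followed by at least one non-space character; the names NAME_SIGNATURE, identify_entity_type and name_entity_units are hypothetical and do not come from the disclosure, and the ontology lookup for other entity types is not modelled.

```python
import re

# Assumed signature (hypothetical): "@" at the starting position followed by
# the rest of the entity unit, i.e. the "@XXX" form used in the example.
NAME_SIGNATURE = re.compile(r"^@\S+$")


def identify_entity_type(entity_unit):
    """Return "name" when the entity unit matches the assumed signature."""
    if NAME_SIGNATURE.match(entity_unit):
        return "name"
    # Other entity types would come from the ontology; not modelled here.
    return "unknown"


def name_entity_units(user_query):
    """Split the query on whitespace and keep units of the name entity type."""
    return [u for u in user_query.split() if identify_entity_type(u) == "name"]


print(identify_entity_type("@leo-smith"))
print(name_entity_units("patents by @leo-smith"))
```

A signature with the special character at the last position, or at any position, would only need a different pattern in place of NAME_SIGNATURE.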
Optionally, the method for providing at least one suggestion for completing the user query comprises generating a plurality of strings for the user query, wherein the plurality of strings comprises the at least one entity unit among the plurality of entity units. Specifically, the processing module is operable to generate the plurality of strings for the user query.
Throughout the present disclosure the term plurality of strings used herein relates to a set of strings, wherein each string comprises at least one entity unit. It will be appreciated that the plurality of strings may comprise one entity unit, two entity units, three entity units and so forth. Furthermore, the entity units are arranged in a sequential order within the corresponding plurality of strings.
Furthermore, optionally, the plurality of strings for the user query is generated based on an n-gram model. It will be appreciated that the n-gram model relates to a contiguous sequence of 'n' items from a given user query, wherein 'n' represents the number of entity units within each of the plurality of strings. In this regard, a string having one entity unit is referred to as a unigram or one-gram, a string having two entity units is referred to as a bigram or two-gram, and a string having three entity units is referred to as a trigram or three-gram. Similarly, based on the number of entity units, the strings are referred to as four-gram, five-gram, and so on. Thereafter, the plurality of strings are analysed to identify the entity type thereof, based on the ontology.
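The generation of strings from a user query can be sketched as below. This is an illustrative sketch, assuming entity units are separated by whitespace; the function name generate_strings and the max_n parameter are hypothetical and not part of the disclosure.

```python
def generate_strings(user_query, max_n=3):
    """Return every contiguous run of 1..max_n entity units, shortest first."""
    units = user_query.split()  # assumption: entity units are space-separated
    strings = []
    for n in range(1, min(max_n, len(units)) + 1):
        # Slide a window of n entity units over the query.
        for start in range(len(units) - n + 1):
            strings.append(" ".join(units[start:start + n]))
    return strings


print(generate_strings("patents by @leo-smith"))
```

For a query of three entity units this yields three unigrams, two bigrams and one trigram, each of which could then be checked against the ontology.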
As mentioned previously, the method for providing the at least one suggestion for completing the user query comprises determining the at least one suggestion for the entity unit, having the name entity type, using the name corpus, wherein the name corpus comprises the plurality of names arranged in form of the ordered tree data structure. Specifically, the processing module is operable to determine the at least one suggestion for the entity unit, having the name entity type, using the name corpus.
Throughout the present disclosure, the term name corpus used herein relates to a collection of a plurality of names of different persons. Optionally, the names are acquired from web crawling and/or other routines. Furthermore, the name corpus is updated at regular intervals. In an embodiment, the name corpus can have a fixed volume. In another embodiment, the name corpus can comprise streaming data. The plurality of names are arranged as structured data within the name corpus. Specifically, the plurality of names are organized in the form of a finite state transducer or a trie. In such a case, the finite state transducer comprises a root node and a corresponding plurality of child nodes for each of the plurality of names. Furthermore, the finite state transducer may comprise a plurality of root nodes based on the plurality of names stored in the name corpus. Furthermore, the root node is empty and a first child node represents the first letter of a name stored in the name corpus. Each subsequent child node is another letter of the name until the end of the name is reached. It will be appreciated that each of the child nodes may have further subsequent child nodes. In this regard, the child nodes near the bottom of the ordered tree data structure have fewer child nodes than the child nodes near the top of the ordered tree data structure.
Optionally, the name corpus comprises names of authors. In this regard, the name corpus is generated from data acquired from web crawling and/or other routines. In an example, the name corpus may include the names of authors related to patents, publications, clinical trials, books and so forth.
In an example, the names of two different persons are john smith and jack smith. In such an example, the names are stored in the form used by a finite state transducer. In this regard, the root node is empty and the first child node comprises the letter j, followed by a second child node, a third child node and a fourth child node representing the letters o, h and n respectively. Regarding the name jack, the names john and jack share the same root node and the same first child node, wherein the first child node represents the letter j. Thereafter, a second branch is generated at the second child node, which represents the letter a, followed by a third child node and a fourth child node of the second branch representing the letters c and k. Furthermore, the finite state transducer also comprises another root node and multiple child nodes for smith. In such a case, the first child node comprises the letter s, followed by a second child node, a third child node, a fourth child node and a fifth child node representing the letters m, i, t and h respectively.
In the abovementioned example, if the user query comprises @Jsmith, the processing module analyses the name corpus and identifies the names john smith and jack smith as the suggestions for completing the user query. However, if the user query is @jo-smith, the processing module analyses the name corpus and identifies the name john smith as the suggestion for completing the user query.
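The lookup in this example can be sketched with a small trie. This is a minimal sketch under stated assumptions, not the disclosed implementation: the trie is keyed by given name only, the surname is checked by a plain prefix comparison rather than a second tree, and the entity unit is assumed to have the form @given-surname with a hyphen separator. The names TrieNode, NameTrie and suggest are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # letter -> TrieNode
        self.names = []     # full names whose key ends at this node


class NameTrie:
    def __init__(self):
        self.root = TrieNode()  # the root node is empty

    def insert(self, key, full_name):
        node = self.root
        for letter in key.lower():
            node = node.children.setdefault(letter, TrieNode())
        node.names.append(full_name)

    def complete(self, prefix):
        """Return all full names stored at or below the node for prefix."""
        node = self.root
        for letter in prefix.lower():
            node = node.children.get(letter)
            if node is None:
                return []
        results, stack = [], [node]
        while stack:
            current = stack.pop()
            results.extend(current.names)
            # Push in reverse so letters are visited in alphabetical order.
            for letter in sorted(current.children, reverse=True):
                stack.append(current.children[letter])
        return results


def suggest(entity_unit, trie):
    # Assumed query shape (hypothetical): "@<given prefix>-<surname prefix>".
    given, _, surname = entity_unit.lstrip("@").partition("-")
    return [name for name in trie.complete(given)
            if name.split()[-1].lower().startswith(surname.lower())]


trie = NameTrie()
for full_name in ("john smith", "jack smith"):
    trie.insert(full_name.split()[0], full_name)

print(suggest("@J-smith", trie))
print(suggest("@jo-smith", trie))
```

Here john and jack share the empty root and the node for the letter j, and branch at the second letter, which is why the prefix j returns both names while jo returns only john smith.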
Optionally, different aliases of the plurality of names are stored in a plurality of combinations in the name corpus. It will be appreciated that an alias of a name is an alternate name that refers to the complete name of the person. In an example, if the name of the person is John Smith, the different aliases of the name may include, but are not limited to, J Smith, John S, Jo Smith, Joh Smit, Smith J, S John, Johnny. It will be appreciated that storing the plurality of names in the ordered tree structure allows for the generation of different aliases. Beneficially, such storage of the different aliases of the plurality of names allows for providing at least one relevant suggestion even if the user query is only partially complete.
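As a rough illustration of how such aliases might be enumerated, the sketch below combines every prefix of the first name with every prefix of the surname, in both orders. The function name generate_aliases is hypothetical, and the approach is an assumption: it covers aliases such as J Smith or Smith J but not nicknames such as Johnny, which would need a separate source.

```python
def generate_aliases(full_name):
    """Enumerate prefix-based aliases of a two-part name, in both orders."""
    first, last = full_name.split()
    first_prefixes = [first[:i] for i in range(1, len(first) + 1)]
    last_prefixes = [last[:i] for i in range(1, len(last) + 1)]

    aliases = set()
    for f in first_prefixes:
        for l in last_prefixes:
            aliases.add(f"{f} {l}")   # e.g. "Jo Smith"
            aliases.add(f"{l} {f}")   # e.g. "Smith J"
    return aliases


aliases = generate_aliases("John Smith")
print("Joh Smit" in aliases, "S John" in aliases, "Johnny" in aliases)
```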
Furthermore, the method comprises providing the at least one suggestion for completing the user query. For example, when the at least one suggestion associated with the name entity type is identified, the at least one suggestion is provided to the user for completing the partially complete user query. In an example, the at least one suggestion may be provided on a portion of a screen, below the user query. Additionally, the suggestion may have different font attributes (such as font color, font size, capitalization, font type and so forth) than the font used by the user for providing the user query.
Optionally, the method further comprises providing affiliations related to the at least one suggestion for the entity units associated with the name entity type. The plurality of names in the name corpus are stored along with information pertaining to a field related to each of the corresponding names. In other words, each of the plurality of names is tagged with the corresponding information pertaining to the field of work. Optionally, the information pertaining to the field includes at least one of: a name of the institution in which the person with the corresponding name is working; a field of specialization; a most famous work (for example, publications, reports, patents and so forth). In an example, the plurality of names in the name corpus may include the name john smith. In such an example, when the user inputs the user query @john-s, the processing module may provide the suggestion john smith along with the affiliation, wherein the affiliation may be Doctor at XXX hospital with specialization in dermatology.
Beneficially, providing such affiliations allows the user to distinguish between persons with the same name. In such a case, if there are two persons with the same name john smith, the processing module will provide both of the names separately along with their affiliations. The processing module, in such an example, may provide the following suggestions:
john smith - Doctor at XXX hospital with specialization in dermatology
john smith - Researcher at ABC university
It will be appreciated that the above-mentioned suggestions can be represented in various other manners as well.
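One possible way to attach affiliations to name suggestions is sketched below. The record layout (a list of dictionaries with name and affiliation keys), the entry without an affiliation, and the function name suggestions_with_affiliations are assumptions made for illustration only.

```python
# Hypothetical name corpus: each entry is a separate person, so two persons
# sharing a name remain distinguishable by their affiliation.
NAME_CORPUS = [
    {"name": "john smith",
     "affiliation": "Doctor at XXX hospital with specialization in dermatology"},
    {"name": "john smith",
     "affiliation": "Researcher at ABC university"},
    {"name": "jack smith", "affiliation": None},
]


def suggestions_with_affiliations(prefix, corpus):
    """Return 'name - affiliation' lines for every name starting with prefix."""
    prefix = prefix.lower()
    lines = []
    for record in corpus:
        if record["name"].startswith(prefix):
            if record["affiliation"]:
                lines.append(f'{record["name"]} - {record["affiliation"]}')
            else:
                lines.append(record["name"])
    return lines


for line in suggestions_with_affiliations("john", NAME_CORPUS):
    print(line)
```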
Optionally, the at least one entity type further comprises a concept entity type and an others entity type. In this regard, the at least one entity unit is classified into at least one entity type based on contextual meaning thereof. Furthermore, the at least one entity unit of the user query that corresponds to one or more concepts in the ontology is included in the concept entity type. Moreover, the at least one entity unit of the user query that does not correspond to the one or more concepts in the ontology is included in the others entity type. In an example, the at least one entity unit of the user query may be tagged or labelled with a type (such as the concept or others type). In another example, the at least one entity unit of the user query may be arranged in cells of a tabular arrangement in respective columns associated with the type thereof. In a further example, the user enters a user query "cell division and @leo-smith". In such an example, the entity unit "cell division" is a concept present in the ontology associated with life sciences and is thus classified as the concept entity type. Furthermore, the entity unit "and" does not correspond to any of the concepts in the ontology and is classified in the others entity type.
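The classification of entity units into the name, concept and others entity types might be sketched as below. Several points are assumptions rather than part of the disclosure: the predefined signature is taken to be a leading @ character, the ontology is reduced to a set of concept strings, and entity units are found by a greedy longest match over whitespace-separated tokens. The function name classify_units is hypothetical.

```python
def classify_units(query, ontology, signature="@"):
    """Split a query into entity units and tag each with an entity type."""
    tokens = query.lower().split()
    units = []
    i = 0
    while i < len(tokens):
        # A token carrying the predefined signature is a name entity unit.
        if tokens[i].startswith(signature):
            units.append((tokens[i], "name"))
            i += 1
            continue
        # Otherwise look for the longest run of tokens that is a known concept.
        for j in range(len(tokens), i, -1):
            phrase = " ".join(tokens[i:j])
            if phrase in ontology:
                units.append((phrase, "concept"))
                i = j
                break
        else:
            # No concept starts at this token: classify it as others.
            units.append((tokens[i], "others"))
            i += 1
    return units


ontology = {"cell division"}
print(classify_units("cell division and @leo-smith", ontology))
# [('cell division', 'concept'), ('and', 'others'), ('@leo-smith', 'name')]
```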
Optionally, in this regard, the method further comprises concatenating the at least one entity unit of the concept entity type occurring at the farthest position with each of the at least one entity unit occurring thereafter, to obtain a concatenated string of entity units. In such a case, the at least one entity unit of the concept entity type occurring at the farthest position within the user query is linked with the at least one entity unit occurring thereafter to form the concatenated string of entity units, such that the identified at least one entity unit of the concept entity type occurring at the farthest position within the user query appears at a first position in the resulting concatenated string of entity units.
Furthermore, optionally, the method comprises identifying at least one suggestion associated with the concatenated string of entity units using the ontology. For example, keywords, elements or concepts of the ontology that are associated with the concept of the concatenated string of entity units are identified as the at least one suggestion for the user query. In an example, an ontology related to the subject matter of life science may be developed and a user query "platelets, red blood cell and white blood cell" may be provided by a user. Such a user query may be used to obtain the concatenated string of entity units "red blood cell and white blood cell". Furthermore, the ontology may include keywords such as count, structure, range and so forth that co-occur in the ontology, wherein the keywords are associated with the concept of the concatenated string of entity units "red blood cell and white blood cell". It will be appreciated that the concepts "red blood cell" and "white blood cell" are common between the ontology and the user query. Therefore, the keywords, key-phrases and/or elements that are common for the concept in the ontology and the user query may be identified as suggestions for completion of the user query.
Optionally, identifying the at least one suggestion further comprises discarding the at least one entity unit occurring at a first position in the concatenated string of entity units. Furthermore, when the at least one suggestion for the concatenated string is not identified, the concatenated string may be traversed and the at least one entity unit at a first position thereof may be discarded. The discarded at least one entity unit may belong to the concept entity type or the others entity type. In an example, the ontology is developed related to a subject-matter of lung cancer. Furthermore, a concatenated string of entity units "non small cell and liver cancer" is obtained. Moreover, the entity units of the concatenated string are ordered: "non small cell" is assigned a first position, "and" is assigned a second position and "liver cancer" is assigned a third position. In such an instance, if no suggestions for the concatenated string of entity units can be identified, the entity unit "non small cell" occurring at the first position is discarded.
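The fallback of discarding the entity unit at the first position when no suggestion is found can be sketched as a simple loop. The ontology is stood in for by a hypothetical dictionary mapping a concatenated string to its co-occurring keywords, and the function name suggest_for_concatenated is an assumption; a real ontology lookup would be more involved.

```python
def suggest_for_concatenated(units, keywords_by_concept):
    """Try the full concatenated string, then drop leading units until a hit."""
    remaining = list(units)
    while remaining:
        concatenated = " ".join(remaining)
        keywords = keywords_by_concept.get(concatenated)
        if keywords:
            return concatenated, keywords
        remaining.pop(0)   # discard the entity unit at the first position
    return None, []


# Hypothetical stand-in for the ontology; the liver cancer keyword is a
# placeholder, not data from the disclosure.
keywords_by_concept = {
    "red blood cell and white blood cell": ["count", "structure", "range"],
    "liver cancer": ["placeholder keyword"],
}

print(suggest_for_concatenated(
    ["red blood cell", "and", "white blood cell"], keywords_by_concept))
print(suggest_for_concatenated(
    ["non small cell", "and", "liver cancer"], keywords_by_concept))
```

In the second call the strings "non small cell and liver cancer" and "and liver cancer" yield nothing, so two leading units are discarded before "liver cancer" matches.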
Optionally, the at least one suggestion for the at least one entity unit having the name entity type is identified based on the entity units having the concept entity type and the others entity type. In this regard, the processing module is operable to identify the at least one suggestion for the at least one entity unit having the name entity type based on the entity units having the concept entity type and the others entity type. In such a case, the concept entity type and the others entity type are linked with the name corpus. In other words, all the names related to the identified concept entity type and the others entity type are provided as the at least one suggestion for the at least one entity unit having the name entity type.
In one embodiment, when the user inputs the user query having both the concept entity type and the name entity type, the processing module is operable to determine the entity units that have the concept entity type and the others entity type based on the ontology. In an example, the user query is lung cancer @J smith. In such an example, the processing module identifies the entity unit lung cancer as the concept entity type and the entity unit @J smith as the name entity type. In such a case, the processing module may provide those suggestions for the entity unit with the name entity type that are related to the concept entity type lung cancer. For example, if the name corpus includes the names jack smith and john smith, the processing module would determine which of the above-mentioned names are related to lung cancer by analysing their affiliations. Thereafter, the processing module determines that jack smith has a research paper related to the field of lung cancer, whilst john smith is not linked with the field of lung cancer. In such a case, the processing module may provide a suggestion of jack smith.
Optionally, the plurality of names in the name corpus are arranged based on the concept entity type. In other words, the names among the plurality of names related to a specific field are stored under the same category. Furthermore, optionally, when a name among the plurality of names is related to more than one field, the name is stored under each of those categories. In an example, the name corpus comprises a plurality of categories, wherein each category is related to a concept entity type. In such an example, the name jack smith has research papers published for lung cancer and breast cancer. In such a case, the name corpus may include two categories with headings 'lung cancer' and 'breast cancer', wherein the name jack smith is stored under both of these categories.
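Arranging names by category and using the concept in the query to narrow the name suggestions might look like the sketch below. The input mapping fields_by_name, the empty field list for john smith (whose field is not stated), and the function names build_categories and suggest_names are assumptions for illustration.

```python
from collections import defaultdict


def build_categories(fields_by_name):
    """Index names by field; a name with several fields appears under each."""
    categories = defaultdict(set)
    for name, fields in fields_by_name.items():
        for field in fields:
            categories[field].add(name)
    return categories


def suggest_names(concept, name_prefix, categories):
    """Suggest only names stored under the category matching the concept."""
    prefix = name_prefix.lower()
    return sorted(
        name for name in categories.get(concept, ()) if name.startswith(prefix)
    )


fields_by_name = {
    "jack smith": ["lung cancer", "breast cancer"],
    "john smith": [],   # not linked with lung cancer in the example
}
categories = build_categories(fields_by_name)
print(suggest_names("lung cancer", "j", categories))  # ['jack smith']
```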
Optionally, the method for providing at least one suggestion for completing the user query comprises: developing the ontology using at least one curated database by: applying conceptual indexing to a plurality of entity units stored in the at least one curated database; identifying semantic associations, between the plurality of entity units, established in the at least one curated database; and identifying at least one class tagged with the plurality of entity units in the at least one curated database.
Throughout the present disclosure, the term class relates to a collection (namely, cluster, group and so forth) of contextually similar text, audio, video, image or a combination thereof. Furthermore, the class may include many synonyms, abbreviations, linguistic variations, morphological forms and/or derivational entities for a plurality of data units associated therewith. In an example, pain may be associated with a class containing similar entities like cramp, ache, discomfort, spasm and so forth.
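For textual entities, a class of this kind can be pictured as a mapping from each member to its class label. The sketch below is an assumption about representation only; the names CLASSES, build_lookup and class_of are hypothetical.

```python
# Hypothetical class table built from the pain example above.
CLASSES = {
    "pain": {"pain", "cramp", "ache", "discomfort", "spasm"},
}


def build_lookup(classes):
    """Invert the class table so that each member points to its class label."""
    return {
        member: label
        for label, members in classes.items()
        for member in members
    }


def class_of(term, lookup):
    """Return the class label for a term, or None if it is not in any class."""
    return lookup.get(term.lower())


lookup = build_lookup(CLASSES)
print(class_of("Ache", lookup))   # pain
print(class_of("fever", lookup))  # None
```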
Optionally, a field of the at least one curated database is related to the developed ontology. Specifically, the at least one curated database may comprise information providing details on associations between a plurality of concepts. Additionally, the ontology is developed to include relevant information extracted from the at least one curated database pertaining to the field of the user's interest. Optionally, the at least one curated database includes information in the form of text, image, audio, video, or any combination thereof.
In an example, the at least one curated database may comprise information related to biomedical entities, genes, proteins, drugs, diseases, species, pathways, biological processes, molecular functions, side effects, drug labels, clinical trial parameters, patient demographics and many other semantic types thereof. Furthermore, information may be extracted from the at least one curated database to build a Life Science ontology (including a custom dictionary and metathesaurus) containing synonyms and derivational and functional forms of different biomedical entities, as well as Medical Subject Headings (MeSH). Furthermore, optionally, the at least one curated database and the ontology extracted therefrom may include data about authors, geography and other biological and non-biological entities.
Furthermore, there is disclosed a computer readable medium containing program instructions for execution on a computer system, which when executed by a computer, cause the computer to perform method steps for providing at least one suggestion for completing a user query. The method comprises the steps of receiving the user query having at least one entity unit; identifying an entity type of the at least one entity unit based on an ontology, wherein an entity unit having a predefined signature is identified as a name entity type; determining the at least one suggestion for the entity unit, having the name entity type, using a name corpus, wherein the name corpus comprises a plurality of names arranged in the form of an ordered tree data structure; and providing the at least one suggestion for completing the user query.
Optionally, the computer readable medium comprises one of a floppy disk, a hard disk, a high capacity read only memory in the form of an optically read compact disk or CD-ROM, a DVD, a tape, a read only memory (ROM), and a random access memory (RAM).
DETAILED DESCRIPTION OF THE DRAWINGS
Referring to FIG. 1, illustrated is a block diagram of a system 100 that provides at least one suggestion for completing a user query, in accordance with an embodiment of the present disclosure. The system 100 comprises a database arrangement 102 operable to store a name corpus and an ontology; a processing module 104 communicably coupled to the database arrangement 102. The processing module 104 is operable to receive the user query and provide the at least one suggestion for completing the user query.
Referring to FIG. 2, illustrated are steps of a method 200 for providing at least one suggestion for completing a user query, in accordance with an embodiment of the present disclosure. At a step 202, the user query having at least one entity unit is received. At a step 204, the entity type of at least one entity unit is identified based on the ontology, wherein an entity unit having a predefined signature is identified as a name entity type. At a step 206, the at least one suggestion for the entity unit having the name entity type is determined using a name corpus, wherein the name corpus comprises plurality of names arranged in form of an ordered tree data structure. At a step 208, the at least one suggestion is provided for completing the user query.
Modifications to embodiments of the present disclosure described in the foregoing are possible without departing from the scope of the present disclosure as defined by the accompanying claims. Expressions such as including, comprising, incorporating, have, is used to describe and claim the present disclosure are intended to be construed in a nonexclusive manner, namely allowing for items, components or elements not explicitly described also to be present. Reference to the singular is also to be construed to relate to the plural.

Claims (13)

1. A system that provides at least one suggestion for completing a user query, wherein the system includes a computer system, characterized in that the system comprises:
- a database arrangement operable to store a name corpus and an ontology; and
- a processing module communicably coupled to the database arrangement, the processing module operable to:
- receive the user query having at least one entity unit;
- identify an entity type of the at least one entity unit based on the ontology, wherein an entity unit having a predefined signature is identified as a name entity type;
- determine the at least one suggestion for the entity unit, having the name entity type, using the name corpus, wherein the name corpus comprises plurality of names arranged in form of an ordered tree data structure; and
- provide the at least one suggestion for completing the user query.
2. A system of claim 1, characterized in that the predefined signature comprises at least one predefined character within the at least one entity unit.
3. A system of claim 1, characterized in that the name corpus comprises names of authors.
4. A system of claim 1, characterized in that the processing module is further operable to provide affiliations related to the at least one suggestion for the entity units associated with the name entity type.
5. A method for providing at least one suggestion for completing a user query, wherein the method includes using a computer system, characterized in that the method comprising:
- receiving the user query having at least one entity unit;
- identifying an entity type of the at least one entity unit based on an ontology, wherein an entity unit having a predefined signature is identified as a name entity type;
- determining the at least one suggestion for the entity unit, having the name entity type, using a name corpus, wherein the name corpus comprises plurality of names arranged in form of an ordered tree data structure; and
- providing the at least one suggestion for completing the user query.
6. A method of claim 5, characterized in that different aliases of the plurality of names are stored in plurality of combinations in the name corpus.
7. A method of claim 5, characterized in that the at least one entity type further comprises a concept entity type and an others entity type.
8. A method of claim 7, characterized in that the method further comprising:
- concatenating the at least one entity unit of the concept entity type occurring at the farthest position with each of the at least one entity unit occurring thereafter, to obtain a concatenated string of entity units; and
- identifying at least one suggestion associated with the concatenated string of entity units using the ontology.
9. A method of claim 8, characterized in that the at least one suggestion for the at least one entity unit having the name entity type is identified based on the entity units having the concept entity type and the others entity type.
10. A method of claim 5, characterized in that the name corpus comprises names of authors.
11. A method of claim 5, characterized in that the method further comprising providing affiliations related to the at least one suggestion for the entity units associated with the name entity type.
12. A method of claim 5, characterized in that the predefined signature comprises at least one predefined character within the at least one entity unit.
13. A computer readable medium containing program instructions for execution on a computer system, which when executed by a computer, cause the computer to perform method steps for providing at least one suggestion for completing a user query, the method comprising the steps of:
- receiving the user query having at least one entity unit;
- identifying an entity type of the at least one entity unit based on an ontology, wherein an entity unit having a predefined signature is identified as a name entity type;
- determining the at least one suggestion for the entity unit, having the name entity type, using a name corpus, wherein the name corpus comprises plurality of names arranged in form of an ordered tree data structure; and
- providing the at least one suggestion for completing the user query.
GB1804910.6A 2018-03-27 2018-03-27 System and method for providing suggestions for completing user query Withdrawn GB2572542A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1804910.6A GB2572542A (en) 2018-03-27 2018-03-27 System and method for providing suggestions for completing user query

Publications (2)

Publication Number Publication Date
GB201804910D0 GB201804910D0 (en) 2018-05-09
GB2572542A true GB2572542A (en) 2019-10-09

Family

ID=62068228

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1804910.6A Withdrawn GB2572542A (en) 2018-03-27 2018-03-27 System and method for providing suggestions for completing user query

Country Status (1)

Country Link
GB (1) GB2572542A (en)

Also Published As

Publication number Publication date
GB201804910D0 (en) 2018-05-09


Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)