CN110569421A - search method based on chemical industry - Google Patents

Info

Publication number
CN110569421A
Authority
CN
China
Prior art keywords
search
compound
commodity
attributes
brand
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910780588.XA
Other languages
Chinese (zh)
Inventor
曹磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Moku Data Technology Co Ltd
Original Assignee
Shanghai Moku Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Moku Data Technology Co Ltd filed Critical Shanghai Moku Data Technology Co Ltd
Priority to CN201910780588.XA priority Critical patent/CN110569421A/en
Publication of CN110569421A publication Critical patent/CN110569421A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9532 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Mathematical Physics (AREA)
  • Development Economics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a search method based on the chemical industry, comprising the following steps: first, preparing data; next, splitting the search description into a plurality of keywords; then performing a first search and a second search using the keywords, wherein the first search obtains brands and/or stores that successfully match the keywords, with the matched brands and/or stores and all their attributes taken as search results, and the second search, building on the first search, obtains commodities that match the keywords, with all attributes of the matched commodities taken as search results; and finally, feeding back the search results of the first search and the second search. The invention implements search for the chemical industry by supporting both compound CAS number search and non-CAS-number search, and clusters the data during data preparation, thereby effectively improving search speed and relieving storage pressure.

Description

Search method based on chemical industry
Technical Field
The invention relates to the technical field of data search, in particular to a search method based on the chemical industry.
Background
With the development of society, search technology has spread to all kinds of websites; dedicated search websites, e-commerce websites, and even internal company systems all depend on it. Search speed directly affects user experience. As services grow and time passes, the volume of back-end data keeps increasing, and when an initial two million records expand to tens of millions or even hundreds of millions, a traditional database based on disk storage can no longer respond to fuzzy searches in time.
Moreover, search in the chemical industry differs from ordinary data search. Chemical compounds have their own notations: a compound's molecular formula, molecular weight, InChI string, SMILES string, and so on may all take part in a search. Most existing general-purpose search methods are literal text searches, which differ greatly from chemical industry search. The prior art does not describe how to perform search for the chemical industry, nor does any related technology improve its search speed.
Therefore, it is necessary to provide a search method based on the chemical industry that implements chemical industry search, effectively improves search speed, and relieves storage pressure.
Disclosure of Invention
The invention aims to provide a search method based on the chemical industry that implements chemical industry search, effectively improves search speed, and relieves storage pressure.
In order to solve the problems in the prior art, the invention provides a searching method based on the chemical industry, which comprises the following steps:
Preparing data, and establishing a brand library, a shop library, a compound library and a commodity library which are stored in a cluster manner;
Receiving an input search description, and splitting the search description into a plurality of keywords;
Performing a first search for brands and/or stores in the brand repository and the store repository that match the plurality of keywords; if the first search is that at least one successfully matched brand and/or store exists, caching the ID of the successfully matched brand and/or the ID of the store, and taking the successfully matched brand and all attributes thereof and/or the store and all attributes thereof as search results;
Performing a second search, wherein the second search uses a compound CAS number search or a non-compound CAS number search to search the commodity library for commodities matched with the keywords; if the second search is that the commodities successfully matched with the keywords exist, caching the successfully matched commodities and compounds contained in the commodities, and taking all attributes of the successfully matched commodities as search results;
If the first search matches successfully, the cached brand ID and/or store ID is carried into the second search; if the second search then matches successfully, the feedback information obtained by sorting the second search results is fed back; if the second search is unsuccessful, the first search result is fed back, wherein the first search result comprises at least one successfully matched brand and all attributes thereof and/or at least one store and all attributes thereof;
If the first search is unsuccessful, the second search is performed; if the second search matches successfully, the feedback information obtained by sorting the second search results is fed back; and if the second search is unsuccessful, a no-result response is fed back.
Optionally, in the chemical industry-based search method, the brand library, the store library, the compound library, and the commodity library are stored in a cluster;
The brand library comprises a plurality of brands and attributes thereof, wherein the attributes of the brands comprise brand IDs, brand names, brand identifications and brand blacklists;
The store library comprises a plurality of stores and attributes thereof, wherein the attributes of the stores comprise store IDs, store names, store marketing data, store types, store points, store grades, areas where the stores are located, store contacts and store blacklists;
The compound library comprises a plurality of compounds and attributes thereof, wherein the attributes of the compounds comprise compound ID, compound name, compound alias, compound CAS number, compound molecular formula, compound molecular weight, compound InChI string, compound SMILES string, compound label, class to which the compound belongs, and attribute group of the compound;
The commodity library includes a plurality of commodities and attributes thereof, and the attributes of the commodities include a commodity ID, a commodity name, a commodity item number, commodity marketing data, an ID and search attribute of the brand to which the commodity belongs, an ID and search attribute of the store to which the commodity belongs, an ID and search attribute of the compound contained in the commodity, a commodity specification, a commodity price, a commodity purity, commodity points, and a commodity shelf life.
Optionally, in the chemical industry-based search method, during a search the plurality of keywords are matched against the search attributes in the brand library, the store library, the compound library, and the commodity library;
The search attribute of the brand includes a brand name;
The search attribute of the store comprises a store name;
The search attributes of the compound include a compound name, a compound alias, a compound CAS number, a compound molecular formula, a compound molecular weight, a compound InChI string, and a compound SMILES string;
The search attribute of the commodity includes a commodity name and a commodity item number.
Optionally, in the chemical industry-based search method, after splitting the search description into a plurality of keywords and before performing the first search, the method further includes the following step:
Escaping the keywords and applying a first filtering, wherein escaping converts a keyword into keywords that can be matched against the search attributes, and the first filtering removes keywords that cannot be matched against any search attribute.
Optionally, in the chemical industry-based search method, if no keyword that can match a search attribute remains after the first filtering, a no-result response is fed back.
Optionally, in the chemical industry-based search method, after the first search the keywords undergo a second filtering, which removes the keywords that were applied in the first search; if no keyword that can match a search attribute remains after the second filtering, the first search result is fed back, comprising at least one successfully matched brand and all attributes thereof and/or at least one store and all attributes thereof.
Optionally, in the chemical industry-based search method, if a brand and/or a store name includes a search attribute of a commodity or a compound, setting a blacklist for the brand and/or the store, where the blacklist includes the search attribute of the commodity or the compound included in the brand and/or the store name;
When the first search is performed, if any of the keywords matches a blacklist, the brand and/or store to which that blacklist belongs is excluded from the search results, and the keyword that matched the blacklist is not removed by the second filtering.
Optionally, in the chemical industry-based search method, when the first search is performed, if more than one brand and/or store is successfully matched with the plurality of keywords, the successfully matched brands and/or stores are sequentially displayed according to a descending order of matching scores.
Optionally, in the chemical industry-based search method, when performing the second search, the method includes the following steps:
If the keywords for the second search contain only correctly formatted compound CAS numbers, a compound CAS number search is used: if commodities containing the compounds corresponding to the CAS numbers are found, all of those commodities are cached, and all found commodities and all attributes thereof are taken as search results; if no compound is matched, the second search is unsuccessful;
If the keywords for the second search contain one or more of the compound name, compound alias, compound molecular formula, compound molecular weight, compound InChI string, compound SMILES string, commodity name, and commodity item number search attributes, a non-compound CAS number search is used: if a keyword is a search attribute of a commodity and matches a plurality of commodities, those commodities and the compounds they contain are cached, and all attributes of all those commodities are taken as search results; if a keyword is a search attribute of a compound and matches a plurality of compounds and all commodities to which those compounds belong, all those commodities are cached, and all found commodities and all attributes thereof are taken as search results; if neither commodities nor compounds are matched, the second search is unsuccessful;
And if the keywords for the second search contain the search attribute of the compound CAS number search and the search attribute of the non-compound CAS number search, performing the compound CAS number search and the non-compound CAS number search, and simultaneously feeding back results of the two searches.
Optionally, in the chemical industry-based search method, if the second search matches successfully, then after the search results are obtained and before information is fed back, the method further includes the following steps:
Sorting the search results and classifying all commodities by the ID of the compound they contain, wherein all commodities containing the compound with the same ID form a single piece of feedback information;
If the second search matches successfully, feeding back at least one piece of feedback information.
Optionally, in the chemical industry-based search method, after the user receives the feedback information, if the user requests all attributes of the compounds contained in the commodities in the second search result, those attributes are retrieved and fed back.
Optionally, in the chemical industry-based search method, storing the brand library, the store library, the compound library, and the commodity library in a cluster includes the following steps:
Acquiring data from a store system, a brand system, a commodity system, a compound system, and a marketing system through middleware;
The middleware transmits the data to the search engine;
The search engine integrates the received data according to a specified format;
Sending the integrated data to the brand library, the store library, the compound library, and the commodity library in the cluster.
In the searching method based on the chemical industry, the databases are stored in a cluster mode, so that the storage pressure is relieved, and the searching pressure of the databases is reduced and the searching speed is improved by adopting a two-step searching mode; in addition, compound CAS number search or non-compound CAS number search is mainly adopted in the second search, so that compounds or commodities matched with the keywords can be matched, and the search of the chemical industry is realized; the invention is used for searching the commodities, shops and brands related to the compound and providing comprehensive inquiry for various chemical suppliers and service providers.
Drawings
Fig. 1 is a flowchart of a chemical industry-based search method according to an embodiment of the present invention;
Fig. 2 is a search flow chart based on the chemical industry according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in more detail below with reference to the schematic drawings. The advantages and features of the present invention will become more apparent from the following description. Note that the drawings are in a very simplified form and not to precise scale; they serve only to help explain the embodiments of the present invention conveniently and clearly.
Hereinafter, if the method described herein comprises a series of steps, the order of such steps presented herein is not necessarily the only order in which such steps may be performed, and some of the described steps may be omitted and/or some other steps not described herein may be added to the method.
At present, the prior art does not describe how to search in the chemical industry, and no related technology improves search speed in the chemical industry. It is therefore necessary to provide a chemical industry-based search method. As shown in Fig. 1, a flowchart of a chemical industry-based search method according to an embodiment of the present invention, the method comprises the following steps:
S1: preparing data, and establishing a brand library, a store library, a compound library, and a commodity library which are stored in a cluster;
S2: receiving an input search description, and splitting the search description into a plurality of keywords;
S3: performing a first search for brands and/or stores in the brand repository and the store repository that match the plurality of keywords; if the first search is that at least one successfully matched brand and/or store exists, caching the ID of the successfully matched brand and/or the ID of the store, and taking the successfully matched brand and all attributes thereof and/or the store and all attributes thereof as search results;
S4: performing a second search, wherein the second search uses a compound CAS number search or a non-compound CAS number search to search the commodity library for commodities matching the keywords; if the second search finds commodities that successfully match the keywords, caching the matched commodities and the compounds they contain, and taking all attributes of the matched commodities as search results;
S5: if the first search matches successfully, the cached brand ID and/or store ID is carried into the second search; if the second search then matches successfully, the feedback information obtained by sorting the second search results is fed back; if the second search is unsuccessful, the first search result is fed back, wherein the first search result comprises at least one successfully matched brand and all attributes thereof and/or at least one store and all attributes thereof;
If the first search is unsuccessful, the second search is performed; if the second search matches successfully, the feedback information obtained by sorting the second search results is fed back; and if the second search is unsuccessful, a no-result response is fed back.
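
The fallback logic of steps S3 to S5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described by the patent: the function name `respond`, the `status` labels, and the two callables standing in for queries against the clustered libraries are all hypothetical.

```python
from typing import Callable

NO_RESULT = {"status": "no_result"}

def respond(
    keywords: list[str],
    first_search: Callable[[list[str]], list[dict]],
    second_search: Callable[[list[str], list[str]], list[dict]],
) -> dict:
    """Combine the first (brand/store) and second (commodity) searches."""
    # S3: match brands and/or stores and cache their IDs.
    brand_store_hits = first_search(keywords)
    cached_ids = [hit["id"] for hit in brand_store_hits]

    # S4: search commodities; the cached IDs (possibly empty) are carried in.
    commodity_hits = second_search(keywords, cached_ids)

    # S5: prefer second-search results, then first-search results, then no result.
    if commodity_hits:
        return {"status": "commodities", "results": commodity_hits}
    if brand_store_hits:
        return {"status": "brands_or_stores", "results": brand_store_hits}
    return NO_RESULT
```

Passing the search functions in as parameters keeps the sketch independent of any particular search engine; a real system would replace them with queries against the brand, store, and commodity libraries.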
The invention stores each database in a cluster, which relieves storage pressure, and adopts a two-step search, which reduces the search load on each database and improves search speed; in addition, the second search mainly uses compound CAS number search or non-compound CAS number search, so that compounds or commodities matching the keywords can be found, thereby implementing search for the chemical industry; the invention is used to search for commodities, stores, and brands related to compounds and provides comprehensive query for various chemical suppliers and service providers.
Further, before searching, data is prepared and stored on a server cluster, wherein storing the brand library, the store library, the compound library, and the commodity library in a cluster comprises the following steps:
Acquiring data from a store system, a brand system, a commodity system, a compound system, and a marketing system through middleware;
The middleware transmits the data to the search engine;
The search engine integrates the received data according to a specified format;
Sending the integrated data to the brand library, the store library, the compound library, and the commodity library in the cluster.
The store marketing data and the commodity marketing data are obtained through the marketing system.
During storage, the search engine integrates the data for the brand library, the store library, the compound library, and the commodity library according to the specified format, so that the libraries contain the following:
The brand library includes a plurality of brands and their attributes, including brand ID, brand name, brand identification (e.g., brand logo), and a brand blacklist;
The store library comprises a plurality of stores and attributes thereof, wherein the attributes of the stores comprise store IDs, store names, store marketing data, store types, store points, store grades, areas where the stores are located, store contacts, and store blacklists;
The compound library comprises a plurality of compounds and their attributes, including compound ID, compound name, compound alias, compound CAS number, compound molecular formula, compound molecular weight, compound InChI string, compound SMILES string, compound label, class to which the compound belongs, and attribute group of the compound (e.g., title group of compound attribute);
The commodity library includes a plurality of commodities and attributes thereof, and the attributes of the commodities include a commodity ID, a commodity name, a commodity number, commodity marketing data, an ID and search attribute of a brand to which the commodity belongs, an ID and search attribute of a store to which the commodity belongs, an ID and search attribute of a compound contained in the commodity, a commodity specification, a commodity price, a commodity purity, a commodity point, and a commodity shelf life.
The attributes of the brand, the store, the compound, and the commodity each include search attributes, which are matched against the plurality of keywords to implement the search. The search attributes are as follows: the search attribute of the brand includes the brand name; the search attribute of the store includes the store name; the search attributes of the compound include the compound name, compound alias, compound CAS number, compound molecular formula, compound molecular weight, compound InChI string, and compound SMILES string; the search attributes of the commodity include the commodity name and the commodity item number.
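
The split between all attributes and search attributes can be illustrated with a small record type for the compound library. This is a sketch under assumptions: the class name `Compound`, the field names, and the choice to store every value as a string are illustrative and do not come from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class Compound:
    compound_id: str
    name: str
    aliases: list[str] = field(default_factory=list)
    cas_number: str = ""
    molecular_formula: str = ""
    molecular_weight: str = ""
    inchi: str = ""
    smiles: str = ""
    label: str = ""     # stored attribute, but not a search attribute
    category: str = ""  # stored attribute, but not a search attribute

    def search_values(self) -> list[str]:
        """Return only the non-empty values that keywords are matched against."""
        values = [self.name, *self.aliases, self.cas_number,
                  self.molecular_formula, self.molecular_weight,
                  self.inchi, self.smiles]
        return [v for v in values if v]
```

For example, a compound labeled "solvent" is still returned with that label as part of its attributes, but a keyword "solvent" would not match it, because the label is not among the search values.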
Furthermore, because the attributes of a commodity include some attributes of its brand, store, and compounds, the stored content is updated step by step, with linked updates across the libraries. The storage update works as follows: if data in the store system is updated, the content stored in the store library is updated and the related content in the commodity library is updated as well; if data in the brand system is updated, the content stored in the brand library and the related content in the commodity library are updated; if data in the compound system is updated, the content stored in the compound library and the related content in the commodity library are updated; if data in the marketing system is updated, the content stored in the store library and the related content in the commodity library are updated; and if data in the commodity system is updated, the related content in the commodity library is updated.
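
The linkage rules above amount to a fixed mapping from source system to the libraries that must be refreshed. A minimal sketch, assuming hypothetical system and library names (`"store"`, `"store_library"`, and so on):

```python
# Libraries to refresh when a given source system changes,
# following the linkage rules stated in the text.
UPDATE_TARGETS = {
    "store": ["store_library", "commodity_library"],
    "brand": ["brand_library", "commodity_library"],
    "compound": ["compound_library", "commodity_library"],
    "marketing": ["store_library", "commodity_library"],
    "commodity": ["commodity_library"],
}

def libraries_to_update(changed_systems: list[str]) -> list[str]:
    """Return the de-duplicated libraries to refresh, in first-seen order."""
    targets: list[str] = []
    for system in changed_systems:
        for library in UPDATE_TARGETS[system]:
            if library not in targets:
                targets.append(library)
    return targets
```

Every entry includes the commodity library, which reflects the point made in the text: commodities embed brand, store, and compound attributes, so any upstream change has to be propagated to them.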
In the chemical industry-based search method provided by the present invention, the search method is shown in fig. 2, and fig. 2 is a chemical industry-based search flow chart provided by an embodiment of the present invention, and the specific search flow is as follows:
An input search description is received and split into a plurality of keywords. After the split, the keywords undergo escaping and a first filtering. Escaping converts a keyword into keywords that can be matched against the search attributes. For example, the keyword "used for washing clothes" can be escaped into "washing powder", "washing liquid", or other keywords that match the search attributes: "used for washing clothes" is a purpose and cannot match any search attribute, whereas "washing powder" and "washing liquid" are commodity names and can. The first filtering removes keywords that cannot match any search attribute, for example "priced at XX yuan" or "located in a certain province"; such keywords can be filtered out directly. Further, if no keyword that can match a search attribute remains after the first filtering, a no-result response is fed back directly.
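
Escaping and the first filtering can be sketched as a lookup table followed by a pattern filter. Both the escape table `ESCAPES` and the `UNMATCHABLE` patterns below are hypothetical examples built from the cases mentioned in the text; the patent does not specify how either is defined.

```python
import re

# Hypothetical escape table: descriptive phrases -> searchable commodity names.
ESCAPES = {"used for washing clothes": ["washing powder", "washing liquid"]}

# Hypothetical patterns for keywords that can never match a search attribute.
UNMATCHABLE = [re.compile(r"^priced at\b"), re.compile(r"^located in\b")]

def escape_and_filter(keywords: list[str]) -> list[str]:
    """Escape keywords, then drop those that cannot match a search attribute."""
    escaped: list[str] = []
    for kw in keywords:
        # Keywords without an escape entry pass through unchanged.
        escaped.extend(ESCAPES.get(kw, [kw]))
    return [kw for kw in escaped
            if not any(p.search(kw) for p in UNMATCHABLE)]
```

An empty return value corresponds to the case where the search is answered with "no result" without querying any library.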
Then, carrying out a first search for searching brands and/or shops matched with the keywords in the brand library and the shop library; if the first search is that at least one successfully matched brand and/or store exists, caching the ID of the successfully matched brand and/or the ID of the store, and taking the successfully matched brand and all attributes thereof and/or the store and all attributes thereof as search results; and if the brand and the shop which are successfully matched with the keywords are not found after the first search, the brand and the shop are not cached, and no search result is found.
Further, if more than one brand and/or store successfully matches the keywords, the matched brands and/or stores are displayed in descending order of matching score. The scoring rule is: exact match score > containing match score > deformed (variant) match score. The scoring mechanism may use the TF-IDF (term frequency-inverse document frequency) algorithm.
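
The tiered ordering (exact > containing > deformed) can be sketched with fixed scores per tier. This is an assumption-laden illustration: the numeric scores 3, 2, and 1 are arbitrary, "deformed" matching is approximated here by comparing case-folded text with punctuation removed, and TF-IDF weighting is not implemented.

```python
import re

def _normalize(text: str) -> str:
    """Lowercase and drop non-alphanumerics; a stand-in for 'deformed' matching."""
    return re.sub(r"[^a-z0-9]", "", text.lower())

def match_score(keyword: str, name: str) -> int:
    """Tiered score: exact (3) > containing (2) > deformed (1) > no match (0)."""
    if keyword == name:
        return 3
    if keyword in name:
        return 2
    if _normalize(keyword) and _normalize(keyword) in _normalize(name):
        return 1
    return 0

def rank(keyword: str, names: list[str]) -> list[str]:
    """Return the matching names in descending order of score."""
    scored = [(match_score(keyword, n), n) for n in names]
    return [n for s, n in sorted(scored, key=lambda p: -p[0]) if s > 0]
```

Note that `_normalize` only keeps ASCII letters and digits, so this sketch would need a different normalization for Chinese brand or store names.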
After the first search, the keywords undergo a second filtering, which removes the keywords that were applied in the first search. If no keyword that can match a search attribute remains after the second filtering, the first search result is fed back, comprising at least one successfully matched brand and all attributes thereof and/or at least one store and all attributes thereof.
Preferably, if a brand and/or a store name includes a search attribute of a commodity or a compound, the brand and/or the store is provided with a blacklist, and the blacklist includes the search attribute of the commodity or the compound included in the brand and/or the store name;
When the first search is performed, if any of the keywords matches a blacklist, the brand and/or store to which that blacklist belongs is excluded from the search results, and the keyword that matched the blacklist is not removed by the second filtering.
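
The interaction between the blacklist and the second filtering can be sketched as follows. The record layout (`name`, `blacklist`) and the simple substring match are assumptions made for illustration; only brands are shown, and stores would be handled the same way.

```python
def first_search_with_blacklist(
    keywords: list[str], brands: list[dict]
) -> tuple[list[dict], list[str]]:
    """Return (matched brands, keywords left for the second search)."""
    matched: list[dict] = []
    used: set[str] = set()
    for brand in brands:
        # A blacklist hit hides this brand for the current query.
        if any(kw in brand["blacklist"] for kw in keywords):
            continue
        hits = [kw for kw in keywords if kw in brand["name"].lower()]
        if hits:
            matched.append(brand)
            used.update(hits)
    # Second filtering: drop only keywords consumed by the first search, so a
    # keyword that merely hit a blacklist survives into the second search.
    remaining = [kw for kw in keywords if kw not in used]
    return matched, remaining
```

With a brand named "Ethanol Depot" whose blacklist contains "ethanol", the query "ethanol" is treated as a compound search rather than a brand search: the brand is skipped and the keyword is passed on to the second search.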
The second search is then performed, using either a compound CAS number search or a non-compound CAS number search, and comprises the following steps:
If the keywords for the second search contain only correctly formatted compound CAS numbers, a compound CAS number search is used: if commodities containing the compounds corresponding to the CAS numbers are found, all of those commodities are cached, and all found commodities and all attributes thereof are taken as search results; if no compound is matched, the second search is unsuccessful;
If the keywords for the second search contain one or more of the compound name, compound alias, compound molecular formula, compound molecular weight, compound InChI string, compound SMILES string, commodity name, and commodity item number search attributes, a non-compound CAS number search is used: if a keyword is a search attribute of a commodity and matches a plurality of commodities, those commodities and the compounds they contain are cached, and all attributes of all those commodities are taken as search results; if a keyword is a search attribute of a compound and matches a plurality of compounds and all commodities to which those compounds belong, all those commodities are cached, and all found commodities and all attributes thereof are taken as search results; if neither commodities nor compounds are matched, the second search is unsuccessful;
And if the keywords for the second search contain the search attribute of the compound CAS number search and the search attribute of the non-compound CAS number search, performing the compound CAS number search and the non-compound CAS number search, and simultaneously feeding back results of the two searches.
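
Choosing between the two branches requires recognizing a correctly formatted CAS number. The sketch below uses the standard CAS Registry Number layout (two to seven digits, two digits, one check digit, separated by hyphens) and its check-digit rule. Whether the patent's notion of "correct format" includes the check digit is an assumption, and the names `is_valid_cas` and `choose_search_modes` are hypothetical.

```python
import re

CAS_PATTERN = re.compile(r"(\d{2,7})-(\d{2})-(\d)")

def is_valid_cas(text: str) -> bool:
    """Check the CAS Registry Number layout and its check digit."""
    m = CAS_PATTERN.fullmatch(text)
    if not m:
        return False
    digits = m.group(1) + m.group(2)
    # Weight the digits 1, 2, 3, ... from right to left; sum mod 10 is the check digit.
    total = sum(i * int(d) for i, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(m.group(3))

def choose_search_modes(keywords: list[str]) -> set[str]:
    """Decide which second-search branches to run for the given keywords."""
    modes: set[str] = set()
    for kw in keywords:
        modes.add("cas" if is_valid_cas(kw) else "non_cas")
    return modes
```

A result of `{"cas"}` corresponds to the CAS-only branch, `{"non_cas"}` to the non-CAS branch, and `{"cas", "non_cas"}` to the mixed case in which both searches run and both results are fed back.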
Further, if the second search matches successfully, then after the search results are obtained and before information is fed back, the method further comprises the following steps:
Sorting the search results and classifying all commodities by the ID of the compound they contain, wherein all commodities containing the compound with the same ID form a single piece of feedback information;
If the second search matches successfully, feeding back at least one piece of feedback information.
When the information is fed back, it is displayed in descending order of compound matching score. The scoring rule is: exact match score > containing match score > deformed (variant) match score. The scoring mechanism may use the TF-IDF (term frequency-inverse document frequency) algorithm.
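
Grouping commodities into feedback items and ordering them by score can be sketched as below. The assumptions are that each commodity record carries a single `compound_id` and a precomputed numeric `score`, and that a group's score is the highest score among its commodities; none of these details are specified in the text.

```python
from collections import defaultdict

def group_by_compound(commodities: list[dict]) -> list[dict]:
    """Merge commodities sharing a compound ID into one feedback item."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for item in commodities:
        groups[item["compound_id"]].append(item)
    feedback = [
        {"compound_id": cid,
         "score": max(i["score"] for i in items),
         "commodities": items}
        for cid, items in groups.items()
    ]
    # Display in descending order of compound matching score.
    return sorted(feedback, key=lambda f: f["score"], reverse=True)
```

Each returned dictionary corresponds to one piece of feedback information, so a successful second search always yields a list with at least one element.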
Finally, the step of feeding back the search results of the first search and the second search comprises the following steps:
If the first search matches successfully, the cached brand ID and/or store ID is carried into the second search; if the second search then matches successfully, the feedback information obtained by sorting the second search results is fed back; if the second search is unsuccessful, the first search result is fed back, wherein the first search result comprises at least one successfully matched brand and all attributes thereof and/or at least one store and all attributes thereof;
If the first search is unsuccessful, the second search is performed; if the second search matches successfully, the feedback information obtained by sorting the second search results is fed back; and if the second search is unsuccessful, a no-result response is fed back.
Further, after the user receives the feedback information, if the user requests all the attributes of the compounds contained in the commodities in the second search result, those attributes are retrieved and fed back.
Preferably, in the present invention, when a search returns no result, the "no result" case is recorded and analyzed. If many similar "no result" cases occur and they all share the same cause, for example the search logic and/or the data, the search logic and/or the data are adjusted accordingly.
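One way to surface recurring causes from such records is a simple frequency count. This is only a sketch: the `cause` field, the function name `frequent_no_result_causes` and the threshold are assumptions, and the source does not say how causes are determined.

```python
from collections import Counter


def frequent_no_result_causes(records, threshold=3):
    """Return the causes that occur at least `threshold` times.

    records: list of dicts, each with a "cause" key (hypothetical field)
        assigned when a "no result" search was analyzed.
    Causes are returned most frequent first.
    """
    counts = Counter(record["cause"] for record in records)
    return [cause for cause, n in counts.most_common() if n >= threshold]
```

Causes returned by this function would be candidates for adjusting the search logic or the underlying data.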
In conclusion, in the search method based on the chemical industry, the databases are stored in a cluster, which relieves storage pressure, and the two-step search reduces the search load on the databases and increases search speed. In addition, the second search mainly uses a compound CAS number search or a non-compound CAS number search, so that compounds or commodities matching the keywords can be found, which realizes search for the chemical industry. The method searches for commodities, stores and brands related to a compound, provides comprehensive queries across chemical suppliers and service providers, and, by scoring the search results, returns the data the user is most likely to be interested in.
The above description is only a preferred embodiment of the present invention, and does not limit the present invention in any way. It will be understood by those skilled in the art that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (12)

1. A search method based on the chemical industry, characterized by comprising the following steps:
preparing data, and establishing a brand library, a store library, a compound library and a commodity library that are stored in a cluster;
receiving an input search description, and splitting the search description into a plurality of keywords;
performing a first search for brands and/or stores in the brand library and the store library that match the plurality of keywords; if the first search finds at least one successfully matched brand and/or store, caching the ID of the successfully matched brand and/or the ID of the store, and taking the successfully matched brand and all attributes thereof and/or the store and all attributes thereof as search results;
performing a second search, wherein the second search uses a compound CAS number search or a non-compound CAS number search to search the commodity library for commodities matching the keywords; if the second search finds commodities that successfully match the keywords, caching the successfully matched commodities and the compounds contained in the commodities, and taking all attributes of the successfully matched commodities as search results;
if the first search matches successfully, carrying the cached brand ID and/or store ID into the second search, and if the second search matches successfully, feeding back the feedback information obtained by sorting the second search results; if the second search is unsuccessful, feeding back the first search result, wherein the first search result comprises at least one successfully matched brand and all attributes thereof and/or at least one store and all attributes thereof;
if the first search is unsuccessful, performing the second search, and if the second search matches successfully, feeding back the feedback information obtained by sorting the second search results; and if the second search is unsuccessful, feeding back that the search has no result.
2. The chemical industry based search method of claim 1, wherein the brand library, the store library, the compound library, and the commodity library are stored in a cluster;
the brand library comprises a plurality of brands and attributes thereof, wherein the attributes of the brands comprise brand IDs, brand names, brand identifications and brand blacklists;
the store library comprises a plurality of stores and attributes thereof, wherein the attributes of the stores comprise store IDs, store names, store marketing data, store types, store points, store grades, areas where the stores are located, store contacts and store blacklists;
the compound library comprises a plurality of compounds and attributes thereof, wherein the attributes of the compounds comprise compound ID, compound name, compound alias, compound CAS number, compound molecular formula, compound molecular weight, compound InChI string, compound SMILES string, compound label, class to which the compound belongs, and attribute group of the compound;
the commodity library comprises a plurality of commodities and attributes thereof, wherein the attributes of the commodities comprise a commodity ID, a commodity name, a commodity item number, commodity marketing data, the ID and search attributes of the brand to which the commodity belongs, the ID and search attributes of the store to which the commodity belongs, the ID and search attributes of the compound contained in the commodity, a commodity specification, a commodity price, a commodity purity, commodity points, and a commodity shelf life.
3. The chemical industry based search method of claim 2, wherein during the search, the plurality of keywords are matched against search attributes in the brand library, the store library, the compound library, and the commodity library;
the search attributes of the brand comprise the brand name;
the search attributes of the store comprise the store name;
the search attributes of the compound comprise the compound name, compound alias, compound CAS number, compound molecular formula, compound molecular weight, InChI string of the compound, and SMILES string of the compound;
the search attributes of the commodity comprise the commodity name and the commodity item number.
4. The chemical industry based search method of claim 3, further comprising, after splitting the search description into a plurality of keywords and before performing the first search, the following step:
performing escaping and a first filtering on the keywords, wherein the escaping converts the keywords into forms that can be matched against the search attributes, and the first filtering removes the keywords that cannot be matched against the search attributes.
5. The chemical industry based search method of claim 4, wherein if no keyword matching the search attributes remains after the first filtering, it is fed back that the search has no result.
6. The chemical industry based search method of claim 3, wherein after the first search, a second filtering is performed on the keywords, the second filtering removing the keywords that were used in the first search; and if no keyword matching the search attributes remains after the second filtering, the first search result is fed back, the first search result comprising at least one successfully matched brand and all attributes thereof and/or at least one store and all attributes thereof.
7. The chemical industry based search method of claim 6, wherein if a brand name and/or a store name contains a search attribute of a commodity or a compound, the brand and/or the store is provided with a blacklist, and the blacklist comprises the search attribute of the commodity or the compound contained in the brand name and/or the store name;
when the first search is performed, if the blacklist matches any of the keywords, the brand and/or the store to which the matched blacklist belongs is not returned, and the keywords that matched the blacklist are not filtered out in the second filtering.
8. The chemical industry based search method according to claim 3, wherein when the first search is performed, if more than one brand and/or store is successfully matched with the plurality of keywords, the successfully matched brands and/or stores are sequentially displayed according to a descending order of matching scores.
9. The chemical industry based search method of claim 3, wherein the second search comprises the following steps:
if the keywords for the second search contain only correctly formatted compound CAS numbers, the compound CAS number search is used; if the commodities containing the compounds corresponding to the CAS numbers are found, all such commodities are cached, and all found commodities and all attributes thereof are taken as search results; if no compound is matched, the second search is unsuccessful;
if the keywords for the second search contain one or more of the following search attributes: compound name, compound alias, compound molecular formula, compound molecular weight, InChI string of the compound, SMILES string of the compound, commodity name, and commodity item number, the non-compound CAS number search is used; if a keyword is a search attribute of a commodity and matches a plurality of commodities, the commodities and the compounds contained in them are cached, and all attributes of all the commodities are taken as search results; if a keyword is a search attribute of a compound and matches a plurality of compounds and all the commodities containing those compounds, all such commodities are cached, and all found commodities and all attributes thereof are taken as search results; if neither a commodity nor a compound is matched, the second search is unsuccessful;
if the keywords for the second search contain both search attributes handled by the compound CAS number search and search attributes handled by the non-compound CAS number search, both searches are performed and the results of the two searches are fed back together.
10. The chemical industry based search method of claim 9, wherein if the second search matches successfully, then after the search results are obtained and before the information is fed back, the method further comprises the following steps:
sorting the search results and grouping all commodities by the ID of the compound they contain, wherein all commodities containing the same compound ID form one piece of feedback information;
if the second search matches successfully, feeding back at least one piece of feedback information.
11. The chemical industry based search method of claim 1, wherein after the user receives the feedback information, if the user requests all the attributes of the compounds contained in the commodities in the second search result, those attributes are retrieved and fed back.
12. The chemical industry based search method of claim 1, wherein storing the brand library, the store library, the compound library, and the commodity library in a cluster comprises the following steps:
acquiring data from a store system, a brand system, a commodity system, a compound system and a marketing system through middleware;
transmitting, by the middleware, the data to a search engine;
integrating, by the search engine, the received data according to a specified format;
sending the integrated data to the brand library, the store library, the compound library, and the commodity library in the cluster.
CN201910780588.XA 2019-08-22 2019-08-22 search method based on chemical industry Pending CN110569421A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910780588.XA CN110569421A (en) 2019-08-22 2019-08-22 search method based on chemical industry

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910780588.XA CN110569421A (en) 2019-08-22 2019-08-22 search method based on chemical industry

Publications (1)

Publication Number Publication Date
CN110569421A true CN110569421A (en) 2019-12-13

Family

ID=68775800

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910780588.XA Pending CN110569421A (en) 2019-08-22 2019-08-22 search method based on chemical industry

Country Status (1)

Country Link
CN (1) CN110569421A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117520686A (en) * 2023-11-20 2024-02-06 广州方舟信息科技有限公司 Search preloading method, device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040006559A1 (en) * 2002-05-29 2004-01-08 Gange David M. System, apparatus, and method for user tunable and selectable searching of a database using a weigthted quantized feature vector
CN102929907A (en) * 2012-08-17 2013-02-13 上海泰坦科技有限公司 Hand-drawn type chemical molecular structural formula searching method
CN103049542A (en) * 2012-12-27 2013-04-17 北京信息科技大学 Domain-oriented network information search method
WO2018103642A1 (en) * 2016-12-05 2018-06-14 Patsnap Systems, apparatuses, and methods for searching and displaying information available in large databases according to the similarity of chemical structures discussed in them

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
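张立坤 et al.: "Research on Web-based chemical structure search methods", 《计算机与应用化学》 *
李海波 et al.: "A unified retrieval method for multi-source MSDS on the Internet", 《计算机与应用化学》 *
蒋丽红 et al.: "Extraction of gas chromatography-mass spectrometry library search result sets based on random projection", 《安徽工业大学学报(自然科学版)》 *
陆真 et al.: "Design and implementation of an Internet chemical information resource query system", 《计算机与应用化学》 *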

Similar Documents

Publication Publication Date Title
US10032207B2 (en) Product placement engine and method
CN102402604B (en) Effective forward ordering of search engine
US20090327249A1 (en) Intellegent Data Search Engine
US8078601B1 (en) Determining unambiguous geographic references
US9280561B2 (en) Automatic learning of logos for visual recognition
JP4035685B2 (en) System and method for correcting spelling errors in search queries
US9483530B1 (en) Determining query terms of little significance
JP4593855B2 (en) System and method for personalized information filtering and alert generation
US20010044791A1 (en) Automated adaptive classification system for bayesian knowledge networks
US20130332441A1 (en) Systems and Methods for Identifying Terms Relevant to Web Pages Using Social Network Messages
US10296622B1 (en) Item attribute generation using query and item data
CN105701216A (en) Information pushing method and device
US20070033229A1 (en) System and method for indexing structured and unstructured audio content
JP2008544377A (en) A system for generating relevant search queries
JP2013531289A (en) Use of model information group in search
WO2001069428A1 (en) System and method for creating a semantic web and its applications in browsing, searching, profiling, personalization and advertising
JP2000348041A (en) Document retrieval method, device therefor and mechanically readable recording medium
US20120284283A1 (en) Information Processing Method, Apparatus, and Computer Program
CN103853802B (en) Device and method for indexing digital content
CN104424342A (en) Method for keyword matching, and device, server and system of method
JP2003173280A (en) Apparatus, method and program for generating database
US7949576B2 (en) Method of providing product database
JP2016509703A (en) System and method for retrieving labeled primarily non-text items
CN110569421A (en) search method based on chemical industry
CN116401459A (en) Internet information processing method, system and recording medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20230523