Summary of the Invention
The embodiments of this specification aim to provide a more effective scheme for building indexes and performing searches in a cloud search platform, so as to remedy deficiencies in the prior art.
To achieve the above object, one aspect of this specification provides a method for building an index in a cloud search platform. The cloud search platform includes a search instance for multiple tenants. The search instance includes a unified field definition table applicable to each of the multiple tenants, and assigns each of the multiple tenants a tenant identifier that uniquely identifies the corresponding tenant. The field definition table includes a description of a tenant identifier field, and the tenant identifier field is associated with the tenant identifier. The method is executed by the cloud search platform and includes the following steps: obtaining a tenant document, where the content of the tenant document includes a tenant identifier field line showing the tenant identifier of the tenant to which the tenant document belongs; obtaining the tenant dictionary of the tenant by means of the tenant identifier; segmenting the tenant document according to the tenant dictionary, so as to obtain a segmented document corresponding to the tenant document; and, in the search instance, building an index for the tenant document according to the field definition table and the segmented document, including establishing, according to the description of the tenant identifier field in the field definition table, an index relationship between the tenant identifier field and the tenant document.
In one embodiment of the above method for building an index in a cloud search platform, the cloud search platform further includes a dictionary unit. The dictionary unit is separate from the search instance and includes the respective tenant dictionaries of the multiple tenants. The method further includes, after obtaining the tenant document, sending the tenant document to the dictionary unit.
In one embodiment of the above method for building an index in a cloud search platform, segmenting the tenant document according to the tenant dictionary, so as to obtain the segmented document corresponding to the tenant document, includes: segmenting the tenant document in the dictionary unit according to the tenant dictionary, so as to generate the segmented document corresponding to the tenant document; and receiving the segmented document from the dictionary unit.
In one embodiment, the above method for building an index in a cloud search platform further includes, after the tenant document has been segmented according to the tenant dictionary to obtain the corresponding segmented document, sending the tenant document and its corresponding segmented document to the search instance.
In one embodiment of the above method for building an index in a cloud search platform, the cloud search platform further includes a unified service interface connected to the tenant platforms of the multiple tenants, and obtaining the tenant document includes: receiving a tenant original document from the tenant platform through the service interface, obtaining the tenant identifier according to the tenant platform, and adding the tenant identifier field line to the content of the tenant original document, so as to obtain the tenant document.
In one embodiment of the above method for building an index in a cloud search platform, the cloud search platform further includes a unified service interface connected to the tenant platforms of the multiple tenants, and the method is performed offline. The method further includes, before obtaining the tenant document: receiving a tenant original document from the tenant platform through the service interface, obtaining the tenant identifier according to the tenant platform, adding the tenant identifier field line to the content of the tenant original document so as to generate the tenant document, and storing the tenant document in the cloud search platform.
In one embodiment of the above method for building an index in a cloud search platform, the cloud search platform further includes a unified service interface connected to the tenant platforms of the multiple tenants, and the method further includes, before obtaining the tenant document: receiving a tenant dictionary from the tenant platform through the service interface, obtaining the tenant identifier according to the tenant platform, and storing the tenant dictionary in the dictionary unit in association with the tenant identifier.
Another aspect of this specification provides a device for building an index in a cloud search platform. The cloud search platform includes a search instance for multiple tenants. The search instance includes a unified field definition table applicable to each of the multiple tenants, and assigns each of the multiple tenants a tenant identifier that uniquely identifies the corresponding tenant. The field definition table includes a description of a tenant identifier field, and the tenant identifier field is associated with the tenant identifier. The device is implemented by the cloud search platform and includes the following units: a first acquisition unit, configured to obtain a tenant document, where the content of the tenant document includes a tenant identifier field line showing the tenant identifier of the tenant to which the tenant document belongs; a second acquisition unit, configured to obtain the tenant dictionary of the tenant by means of the tenant identifier; a segmentation unit, configured to segment the tenant document according to the tenant dictionary, so as to obtain a segmented document corresponding to the tenant document; and an establishing unit, configured to build, in the search instance, an index for the tenant document according to the field definition table and the segmented document, including establishing, according to the description of the tenant identifier field in the field definition table, an index relationship between the tenant identifier field and the tenant document.
Another aspect of this specification provides a method for searching in a cloud search platform. The cloud search platform includes a search instance for multiple tenants. The search instance includes a unified field definition table applicable to the multiple tenants, and assigns each of the multiple tenants a tenant identifier that uniquely identifies the corresponding tenant. The field definition table includes a description of a tenant identifier field, and the tenant identifier field is associated with the tenant identifier. The cloud search platform further includes a unified service interface connected to the tenant platforms of the multiple tenants. The method is executed by the cloud search platform and includes the following steps: receiving a search statement from a tenant platform; obtaining the tenant identifier of the tenant from the tenant platform; obtaining the tenant dictionary of the tenant by means of the tenant identifier; segmenting the search statement according to the tenant dictionary, so as to obtain a segmented statement corresponding to the search statement; retrieving on the tenant identifier field and the segmented statement in the search instance, so that the segmented statement is retrieved within the tenant documents of the tenant; locating the tenant platform according to the tenant identifier; and returning a retrieval result to the tenant platform according to the field definition table.
In one embodiment of the above method for searching in a cloud search platform, the cloud search platform further includes a dictionary unit. The dictionary unit is separate from the search instance and includes the respective tenant dictionaries of the multiple tenants. The method further includes, after obtaining the tenant identifier of the tenant according to the tenant platform, sending the search statement and the tenant identifier to the dictionary unit.
In one embodiment of the above method for searching in a cloud search platform, segmenting the search statement according to the tenant dictionary, so as to obtain the segmented statement corresponding to the search statement, includes: segmenting the search statement in the dictionary unit by means of the tenant dictionary, so as to generate the segmented statement; and receiving the segmented statement from the dictionary unit.
In one embodiment, the above method for searching in a cloud search platform further includes, after the search statement has been segmented according to the tenant dictionary to obtain the corresponding segmented statement, sending the segmented statement and the tenant identifier to the search instance.
Another aspect of this specification provides a device for searching in a cloud search platform. The cloud search platform includes a search instance for multiple tenants. The search instance includes a unified field definition table applicable to each of the multiple tenants, and assigns each of the multiple tenants a tenant identifier that uniquely identifies the corresponding tenant. The field definition table includes a description of a tenant identifier field, and the tenant identifier field is associated with the tenant identifier. The cloud search platform further includes a unified service interface connected to the tenant platforms of the multiple tenants. The device is implemented by the cloud search platform and includes the following units: a first receiving unit, configured to receive a search statement from a tenant platform; a first acquisition unit, configured to obtain the tenant identifier of the tenant from the tenant platform; a second acquisition unit, configured to obtain the tenant dictionary of the tenant by means of the tenant identifier; a segmentation unit, configured to segment the search statement according to the tenant dictionary, so as to obtain a segmented statement corresponding to the search statement; a retrieval unit, configured to retrieve on the tenant identifier field and the segmented statement in the search instance, so that the segmented statement is retrieved within the tenant documents of the tenant; a locating unit, configured to locate the tenant platform according to the tenant identifier; and a returning unit, configured to return a retrieval result to the tenant platform according to the field definition table.
In the above methods and devices for building an index and searching in a cloud search platform according to the embodiments of this specification, multiple tenants are handled according to unified logic within a single search instance, and the tenant dictionaries are split out into a service external to the search instance. This simplifies the overall architecture, reduces development cost, and can reduce the storage space the architecture requires. In addition, the scheme can meet the need for per-tenant custom dictionaries, and the number of supported tenants can grow linearly as computing resources such as servers are added; that is, the system is "linearly scalable".
Detailed Description
The embodiments of this specification are described below with reference to the accompanying drawings.
Fig. 1 schematically illustrates an application scenario of an embodiment of this specification. The scenario includes a cloud search platform 101 and the tenant platforms 102, 103, 104, etc. of multiple tenants. The cloud search platform 101 includes a unified service interface for connecting to each tenant platform; the interface is, for example, a RESTful service interface. For example, the cloud search platform 101 can receive a tenant's documents, the tenant's custom dictionary, and so on from tenant platform 103 through the service interface, and build an index according to the tenant's documents and custom dictionary. When a user of tenant platform 103 searches with the search engine on tenant platform 103, tenant platform 103 sends the user's search statement to the cloud search platform 101 through the service interface. The cloud search platform 101 distinguishes the tenant by means of a field line such as "tenant ID=xxx" (where xxx is the tenant identifier of the tenant), performs the retrieval, and returns the retrieval result to tenant platform 103.
The cloud search platform includes a search instance (cluster) that serves multiple tenants, and the search instance includes a unified field definition table (schema) applicable to each of the multiple tenants. That is, multiple tenants share one search instance and use the same field definition table. Here, a search instance is an independent application in the cloud search platform that implements the search function; search instances are logically isolated from one another.
The field definition table (schema) may, for example, take the form of a schema.xml configuration file. It contains all the fields (Field) that a document may include, together with all the information on how these fields are to be handled when building the document index and when querying.
For example, a document of original data (that is, a raw doc) is generated in the following format:
id=1
user_id=001
title=xxxxxxx yyyyy
content=...
origin_title=xxxxxyyyyy
Here id, user_id, title, and content are fields included in the document, and the text following "=" is the value of the corresponding field.
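By way of a non-limiting illustration, the "field=value" format described above could be read into a field mapping as sketched below. The function name parse_raw_doc is hypothetical, and treating everything after the first "=" as the value is an assumption; the specification does not prescribe a parser.

```python
def parse_raw_doc(text: str) -> dict:
    """Parse 'field=value' lines into a dict (hypothetical helper).

    Assumption: the value is everything after the first '=' on the line.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a field line: {line!r}")
        fields[name.strip()] = value.strip()
    return fields


raw = "id=1\nuser_id=001\ntitle=xxxxxxx yyyyy\n"
doc = parse_raw_doc(raw)
```

With this sketch, doc["user_id"] holds the tenant identifier carried by the document.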
Table 1 schematically illustrates the field definition table of a search instance.
Table 1
Table 1 lists the fields included in the documents handled by the search instance: title (the segmented title), content (body text), cat_id (category ID), user_id (tenant ID), and origin_title (original title). Table 1 also records how these fields are to be handled when building the document index and when querying. For example, in the title row, "inverted, forward" in the engine field column indicates that an inverted index and a forward index are to be built for title, and the "needs segmentation" column indicates that title is segmented on spaces during indexing. As another example, in the content row, "summary field" in the engine field column indicates that the body text is displayed as a summary when query results are shown. Of course, the field definition table in Table 1 is only an example intended to illustrate the concept; a field definition table in a practical application may contain more fields and different field definitions.
When building an index for the documents of each tenant, the search instance reads the above field definition table and builds the index according to the descriptions in the table. In addition, when returning search results in response to a tenant's search request, the search instance also presents each field according to the descriptions in the field definition table.
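As a non-limiting sketch, a field definition table of this kind could be held in memory as a list of per-field records that the indexer consults. The FieldDef type and its attribute names are hypothetical. Only the settings that the description states explicitly are filled in (title: inverted and forward index, segmented on spaces; content: summary field; cat_id and user_id: inverted and forward index); the remaining settings are left at defaults and are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FieldDef:
    """One row of a field definition table (hypothetical representation)."""
    name: str
    inverted: bool = False               # build an inverted index for the field
    forward: bool = False                # build a forward index for the field
    separator: Optional[str] = None      # split the value on this separator, if set
    summary: bool = False                # show the field as a summary in results


SCHEMA: List[FieldDef] = [
    FieldDef("title", inverted=True, forward=True, separator=" "),
    FieldDef("content", summary=True),
    FieldDef("cat_id", inverted=True, forward=True),
    FieldDef("user_id", inverted=True, forward=True),   # tenant identifier field
    FieldDef("origin_title"),                           # settings not stated; defaults assumed
]


def inverted_fields(schema: List[FieldDef]) -> List[str]:
    """Names of the fields for which an inverted index is to be built."""
    return [f.name for f in schema if f.inverted]


names = inverted_fields(SCHEMA)
```

Because every tenant shares the same SCHEMA, the indexer needs no per-tenant branching; the tenant identifier is simply one more indexed field.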
The search instance assigns each of its multiple tenants a tenant identifier that uniquely identifies the corresponding tenant. Specifically, the search instance distinguishes different tenants by one field in the field definition table (for example, user_id); for example, "user_id=001" denotes the first tenant and "user_id=002" denotes the second tenant. The tenant identifier can also be used to distinguish a tenant's documents, a tenant's dictionary, a tenant's platform, and so on. For example, by associating a tenant's dictionary with the tenant identifier, the tenant dictionary can be obtained by means of the tenant identifier whenever it is needed. When a search is performed in the search instance, the requests of different tenants are distinguished by including "user_id=xxx" in the search request, so that tenants do not interfere with one another. For example, a specific search command (query) may be: query=title:weather AND user_id:xxx. This search statement means searching for index entries whose title contains "weather", subject to the additional condition that user_id is xxx, which is equivalent to filtering along the tenant dimension.
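As a minimal sketch of the tenant filter described above, a platform-side helper might append the tenant condition to whatever the user typed. The function name tenant_query is hypothetical; the "field:value AND field:value" syntax follows the example query in the text, and wrapping the user part in parentheses is an added assumption so that the tenant condition applies to the whole user query.

```python
def tenant_query(user_query: str, tenant_id: str, tenant_field: str = "user_id") -> str:
    """Restrict a query to one tenant by AND-ing in the tenant identifier field.

    Hypothetical helper; the parentheses are an assumption, not part of the
    example query given in the description.
    """
    if not tenant_id:
        raise ValueError("a tenant identifier is required")
    return f"({user_query}) AND {tenant_field}:{tenant_id}"


query = tenant_query("title:weather", "001")
```

Adding the condition on the platform side, rather than trusting the tenant platform to send it, keeps the filter under the control of the cloud search platform.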
As can be seen from the above description, the multiple tenants in one search instance use the same field definition table, but the dictionaries of different tenants differ. For example, some sports articles require a dedicated sports dictionary, while some medical articles require a medical dictionary. When building the index of a tenant, the tenant's original documents are segmented according to that tenant's own dictionary for use in building the index. In one embodiment of this specification, the tenant-specific segmentation logic is implemented outside the search instance, so that the search instance performs only general functions such as storage, indexing, and retrieval, without having to handle complex per-tenant dictionary customization.
Specifically, as shown in Fig. 1, the cloud search platform 101 further includes a dictionary unit 12, a service agent unit 13, and a storage unit 14. The dictionary unit 12 is separate from the search instance 11 and includes the respective tenant dictionaries of the multiple tenants; it can provide a word segmentation service to each tenant outside the search instance. Specifically, when the service of the dictionary unit 12 is called for a document from tenant platform 102, the dictionary unit segments the tenant's document using the tenant dictionary of that tenant.
The service agent unit 13 acts as a proxy for the business of the cloud search platform 101 and exposes an HTTP service upward. It is connected to the tenant platforms (102, 103, 104, etc.), the search instance 11, the dictionary unit 12, and the storage unit 14, relays data among them, and performs appropriate data preprocessing. For example, when the index is built offline, the service agent unit 13 receives a tenant's original document from tenant platform 103, adds the tenant identifier to the original document to obtain the tenant document, and stores the tenant document in the storage unit 14. When building the index, the service agent unit 13 obtains the tenant document from the storage unit 14, sends the document to the dictionary unit 12 for segmentation, and receives the segmented document from the dictionary unit 12. The service agent unit 13 then sends the tenant document and the segmented document to the search instance 11, and the search instance 11 builds an index for the tenant document according to the field definition table and the tenant's segmented document, where the index includes a field index on the tenant ID.
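The offline flow just described can be sketched, purely for illustration, as a small coordinator that relays documents between stand-ins for the other units. The class ServiceAgent and its method names are hypothetical; the segmenter and the indexer are injected as plain callables, and segmenting only the title field is an assumption made to keep the sketch short.

```python
from typing import Callable, Dict, List

Doc = Dict[str, str]


class ServiceAgent:
    """Hypothetical stand-in for service agent unit 13."""

    def __init__(self,
                 segment: Callable[[str, str], str],
                 index: Callable[[Doc, Doc], None]) -> None:
        self.storage: List[Doc] = []   # stand-in for storage unit 14
        self.segment = segment         # stand-in for dictionary unit 12
        self.index = index             # stand-in for search instance 11

    def receive(self, tenant_id: str, raw_doc: Doc) -> None:
        # Add the tenant identifier, then store the resulting tenant document.
        self.storage.append({**raw_doc, "user_id": tenant_id})

    def build_offline(self) -> int:
        # Drain storage: segment each document, then pass both to the indexer.
        built = 0
        while self.storage:
            doc = self.storage.pop(0)
            segmented = {"title": self.segment(doc["user_id"], doc["title"])}
            self.index(doc, segmented)
            built += 1
        return built


indexed = []
agent = ServiceAgent(
    segment=lambda tenant_id, text: " ".join(text.split()),  # placeholder segmenter
    index=lambda doc, seg: indexed.append((doc, seg)),
)
agent.receive("001", {"id": "1", "title": "xxxxxxx  yyyyy"})
built = agent.build_offline()
```

Injecting the segmenter and indexer mirrors the separation in Fig. 1: the agent only relays and preprocesses data, and either dependency can be replaced without touching the other.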
The composition of the cloud search platform 101 shown in Fig. 1 is one embodiment of this specification and does not limit the embodiments of this specification. In another embodiment, the dictionary-based segmentation function is placed inside the search instance of the cloud search platform. In that case, only the tenant document is sent to the search instance, and the search instance segments the tenant document according to the tenant dictionary. In yet another embodiment, the search instance calls the service of the external dictionary unit directly (that is, not through the service agent unit), obtaining the segmented document and the tenant document directly from the dictionary unit for use in building the index.
The method and device for building an index in a cloud search platform according to one embodiment of this specification are described below. Fig. 2 shows the method for building an index in a cloud search platform according to one embodiment of this specification.
As shown in Fig. 2, in step S21 a tenant document is obtained. The content of the tenant document includes a tenant identifier field line, which shows the tenant identifier of the tenant to which the tenant document belongs. As described above, the cloud search platform includes a unified service interface connected to the tenant platforms of the multiple tenants, and the tenant original document is received from the tenant platform through the service interface.
For example, when the index is built in real time, the tenant original document is received from the tenant platform through the service interface, the tenant identifier "xxx" is obtained according to the tenant platform, and the field line "tenant ID=xxx" is added to the content of the tenant original document, so as to obtain the tenant document. For example, when the cloud search platform receives a tenant original document from the tenant platform of tenant 001, it can receive a parameter such as type=user_001 from the tenant platform at the same time, and thereby obtain the tenant identifier "xxx" of the tenant, such as "001". When the index is built offline, the tenant original document is received from the tenant platform through the service interface, the tenant identifier "xxx" is obtained according to the tenant platform, the field line "tenant ID=xxx" is added to the content of the tenant original document to generate the tenant document, and the tenant document is stored in the cloud search platform. Then, when the index is built offline, the tenant document is obtained from the storage unit of the cloud search platform.
In one embodiment, as shown in Fig. 1, the cloud search platform includes a service agent unit that executes the business logic within the platform on its behalf. The above unified service interface is provided on the service agent unit and is connected to the tenant platforms of the multiple tenants. The service agent unit is also connected to the storage unit of the cloud search platform. For example, when the index is built in real time, the service agent unit receives the tenant original document from the tenant platform through the service interface, obtains the tenant identifier "xxx" according to the tenant platform, and adds the field line "tenant ID=xxx" to the content of the tenant original document, so as to obtain the tenant document. When the index is built offline, the service agent unit receives the tenant original document from the tenant platform through the service interface, obtains the tenant identifier "xxx" according to the tenant platform, adds the field line "tenant ID=xxx" to the content of the tenant original document to generate the tenant document, and stores the tenant document in the storage unit 14 of the cloud search platform. Then, when the index is built offline, the service agent unit obtains the tenant document from the storage unit 14.
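The step of deriving the tenant identifier and adding the tenant identifier field line can be illustrated with the following sketch. The function names are hypothetical. Deriving the identifier from a parameter of the form type=user_001 follows the example given above; writing the field line as user_id=... and dropping any user_id line already present in the original document are assumptions of this sketch.

```python
def tenant_id_from_type(type_param: str, prefix: str = "user_") -> str:
    """Derive the tenant identifier from a parameter such as 'user_001'."""
    if not type_param.startswith(prefix) or len(type_param) == len(prefix):
        raise ValueError(f"unexpected type parameter: {type_param!r}")
    return type_param[len(prefix):]


def add_tenant_line(raw_doc: str, tenant_id: str, field: str = "user_id") -> str:
    """Return the tenant document: the original lines plus the tenant field line.

    Assumption: a tenant field line supplied in the original document is
    discarded, so the value set by the platform is the only one present.
    """
    lines = [ln for ln in raw_doc.splitlines() if not ln.startswith(field + "=")]
    lines.append(f"{field}={tenant_id}")
    return "\n".join(lines) + "\n"


tenant_doc = add_tenant_line("id=1\ntitle=xxxxxxx yyyyy\n", tenant_id_from_type("user_001"))
```

In the real-time case the resulting tenant document would go straight on to segmentation; in the offline case it would first be written to the storage unit.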
In step S22, the tenant dictionary of the tenant is obtained by means of the tenant identifier. The cloud search platform also receives the tenant dictionary from the tenant platform through the service interface, obtains the tenant identifier according to the tenant platform, and stores the tenant dictionary in the dictionary unit in association with the tenant identifier. Thus, when an index is built for a tenant document, the tenant dictionary associated with the tenant identifier can be looked up within the platform by means of the tenant identifier, and the tenant dictionary is thereby obtained.
In step S23, the tenant document is segmented according to the tenant dictionary, so as to obtain the segmented document corresponding to the tenant document. For example, suppose the title of the original document is an unsegmented string meaning roughly "Beijing group-buying website head". The tenant dictionary of tenant 001 includes the entries "group-buying" and "website", so segmenting the title according to the tenant dictionary of tenant 001 yields "Beijing", "group-buying", "website", "head". As another example, the tenant dictionary of tenant 002 includes the entries "Beijing group-buying net" and "webmaster", so segmenting the same title according to the tenant dictionary of tenant 002 yields "Beijing group-buying net", "webmaster". In these segmentation results a space serves as the delimiter between segments. This is merely an example; other characters can also be used as the delimiter, or the segments can be represented in other forms, for example as a structured list of segments.
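To illustrate how the same text can be segmented differently under different tenant dictionaries, the sketch below uses forward maximum matching. This algorithm is a swapped-in, commonly used technique chosen for brevity; the specification does not state which segmentation algorithm the dictionary unit uses. The unspaced ASCII string and the dictionary entries are hypothetical stand-ins for the title in the example above.

```python
from typing import List, Set


def segment(text: str, dictionary: Set[str]) -> List[str]:
    """Forward maximum matching: at each position take the longest dictionary entry.

    Characters not covered by any entry are grouped into one 'unknown' segment.
    """
    max_len = max((len(w) for w in dictionary), default=0)
    tokens: List[str] = []
    unknown = ""
    i = 0
    while i < len(text):
        match = ""
        for size in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + size] in dictionary:
                match = text[i:i + size]
                break
        if match:
            if unknown:            # flush the uncovered run before the match
                tokens.append(unknown)
                unknown = ""
            tokens.append(match)
            i += len(match)
        else:
            unknown += text[i]
            i += 1
    if unknown:
        tokens.append(unknown)
    return tokens


title = "beijinggroupbuywebsitemaster"
tenant_dicts = {
    "001": {"groupbuy", "website"},
    "002": {"beijinggroupbuyweb", "sitemaster"},
}
result_001 = " ".join(segment(title, tenant_dicts["001"]))
result_002 = " ".join(segment(title, tenant_dicts["002"]))
```

Here result_001 splits the title around the entries of tenant 001, while result_002 groups the same characters into the two entries of tenant 002, with a space as the delimiter in both cases.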
In one embodiment, a default dictionary may additionally be applied in the dictionary unit to segment the tenant document further. In this case, the entries in the tenant dictionary take precedence.
In one embodiment, as shown in Fig. 1, the cloud search platform further includes a dictionary unit, which is an application on the platform that is separate from the search instance and provides a word segmentation service.
The service agent unit receives the tenant dictionary from the tenant platform through the service interface, obtains the tenant identifier according to the tenant platform, and stores the tenant dictionary in the dictionary unit in association with the tenant identifier. For example, a startup configuration file can be set up for the tenant dictionaries, in which one option has the form key:dict_path, for example user_id_001:/home/admin/local_dict_1.txt, so that the configuration file associates the tenant dictionary with the tenant identifier.
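A configuration of the key:dict_path kind described above could be read as sketched below. The function name is hypothetical, and the handling of blank lines and "#" comments is an assumption; only the first entry in the sample text comes from the description, and the second path is made up for illustration.

```python
from typing import Dict


def parse_dict_config(text: str) -> Dict[str, str]:
    """Map each configuration key (e.g. 'user_id_001') to its dictionary path.

    Assumptions: one 'key:dict_path' option per line; blank lines and lines
    starting with '#' are ignored; the key itself contains no ':'.
    """
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, path = line.partition(":")
        if not sep or not key.strip() or not path.strip():
            raise ValueError(f"malformed option: {line!r}")
        mapping[key.strip()] = path.strip()
    return mapping


config_text = (
    "user_id_001:/home/admin/local_dict_1.txt\n"
    "user_id_002:/home/admin/local_dict_2.txt\n"   # hypothetical second entry
)
dict_paths = parse_dict_config(config_text)
```

At startup the dictionary unit could load each listed file once and keep the entries in memory under the corresponding tenant identifier.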
Thus, when an index is built for a tenant, the word segmentation service of the dictionary unit can be called through the service agent unit. For example, after obtaining the tenant document, the service agent unit sends the tenant document to the dictionary unit. The dictionary unit obtains the tenant identifier from the field line "tenant ID=xxx" in the tenant document, and uses the tenant identifier to look up, within itself, the tenant dictionary stored in association with that identifier, thereby obtaining the tenant dictionary. The dictionary unit then segments the tenant document according to the tenant dictionary, so as to generate the segmented document. Afterwards, the dictionary unit sends the tenant document and the segmented document to the service agent unit.
In step S24, an index is built for the tenant document in the search instance according to the field definition table and the segmented document, including establishing, according to the description of the tenant identifier field in the field definition table, an index relationship between the tenant identifier field and the tenant document. For example, the field definition table in the search instance is the table shown in Table 1, which indicates that an inverted index and a forward index are to be built for the segmented title (title), the category ID (cat_id), and the tenant ID (user_id). Taking the tenant ID as an example, according to the description of the tenant ID field in the field definition table, the indexer in the search instance generates index relationship tables between the tenant ID and the tenant documents, including an inverted table and a forward table.
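For illustration only, the inverted and forward tables for the tenant ID and the segmented title can be sketched with in-memory dictionaries. The class TinyIndex is hypothetical and far simpler than a real indexer; it assumes the segmented title arrives delimited by spaces.

```python
from collections import defaultdict
from typing import List


class TinyIndex:
    """Hypothetical miniature of the index relationship tables."""

    def __init__(self) -> None:
        self.forward = {}                  # forward table: doc id -> field values
        self.inverted = defaultdict(set)   # inverted table: (field, term) -> doc ids

    def add(self, doc_id: str, user_id: str, segmented_title: str) -> None:
        self.forward[doc_id] = {"user_id": user_id, "title": segmented_title}
        self.inverted[("user_id", user_id)].add(doc_id)
        # The title is already segmented, so splitting on the space suffices.
        for term in segmented_title.split(" "):
            if term:
                self.inverted[("title", term)].add(doc_id)

    def search(self, term: str, user_id: str) -> List[str]:
        # Intersect the postings for the term with the postings for the tenant.
        by_term = self.inverted.get(("title", term), set())
        by_tenant = self.inverted.get(("user_id", user_id), set())
        return sorted(by_term & by_tenant)


index = TinyIndex()
index.add("d1", "001", "weather report")
index.add("d2", "002", "weather alert")
hits = index.search("weather", "001")
```

Both documents contain "weather", yet the lookup for tenant 001 returns only that tenant's document, because the tenant ID is indexed like any other field and intersected with the term postings.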
As shown in Fig. 1, in one embodiment, after the dictionary unit 12 has segmented the tenant document according to the tenant dictionary and sent the tenant document and the segmented document to the service agent unit 13, the service agent unit 13 sends the tenant document and the segmented document to the search instance 11. The segmented content can be delimited by a separator agreed in advance (such as a space). If the dictionary unit did not delimit the document with that separator, the service agent unit can further process the segmented document so that it is delimited with the agreed separator. In this case, after receiving the segmented document, the search instance splits it according to the already segmented fields, for example on spaces, without performing any additional segmentation processing.
Thus, the documents of all tenants can be handled according to unified logic in the search instance, without distinguishing between tenants. That is, the index of the search instance contains the documents of all tenants, and at search time the requests of different tenants are distinguished by adding a condition such as user_id="xxx", so that tenants are isolated from one another.
Fig. 3 shows a device 300 for building an index in a cloud search platform according to an embodiment of this specification. The cloud search platform includes a search instance for multiple tenants. The search instance includes a unified field definition table applicable to the multiple tenants, and assigns each of the multiple tenants a tenant identifier that uniquely identifies the corresponding tenant. The field definition table includes a description of a tenant identifier field, and the tenant identifier field is associated with the tenant identifier.
As shown in Fig. 3, the device 300 for building an index in a cloud search platform is implemented by the cloud search platform and includes the following units: a first acquisition unit 31, configured to obtain a tenant document, where the content of the tenant document includes a tenant identifier field line showing the tenant identifier of the tenant to which the tenant document belongs; a second acquisition unit 32, configured to obtain the tenant dictionary by means of the tenant identifier; a segmentation unit 33, configured to segment the tenant document according to the tenant dictionary, so as to obtain a segmented document corresponding to the tenant document; and an establishing unit 34, configured to build, in the search instance, an index for the tenant document according to the field definition table and the segmented document, including establishing, according to the description of the tenant identifier field in the field definition table, an index relationship between the tenant identifier field and the tenant document.
In one embodiment, the cloud search platform further includes a dictionary unit that is separate from the search instance and includes the respective tenant dictionaries of the multiple tenants, and the device 300 for building an index in a cloud search platform further includes a first sending unit, configured to send the tenant document to the dictionary unit after the tenant document is obtained.
In one embodiment, segmenting the tenant document according to the tenant dictionary, so as to obtain the segmented document corresponding to the tenant document, includes: segmenting the tenant document in the dictionary unit according to the tenant dictionary, so as to generate the segmented document corresponding to the tenant document; and receiving the segmented document from the dictionary unit.
In one embodiment, the device 300 for building an index in a cloud search platform further includes a
second sending unit, configured to send the tenant document and its corresponding segmented document to
the search instance after the tenant document has been segmented according to the tenant dictionary and
the corresponding segmented document has been obtained.
In one embodiment, in the device 300 for building an index in a cloud search platform, the cloud search
platform further includes a unified service interface connected to the tenant platforms of the multiple
tenants, and obtaining the tenant document includes: receiving a tenant original document from the
tenant platform through the service interface, obtaining the tenant identification according to the
tenant platform, and adding the tenant identification field row to the content of the tenant original
document, thereby obtaining the tenant document.
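The step of turning a tenant original document into a tenant document can be sketched as follows. This
is a minimal illustration rather than the platform's implementation: the function name
`attach_tenant_id`, the field name `tenant_id`, and the platform-to-tenant mapping are assumptions
introduced only for the example.

```python
def attach_tenant_id(original_doc, platform_id, platform_to_tenant):
    """Return a copy of the original document with a tenant identification field added."""
    # The tenant identification is derived from the platform that sent the document.
    tenant_id = platform_to_tenant[platform_id]
    tenant_doc = dict(original_doc)  # copy, so the original document stays unchanged
    tenant_doc["tenant_id"] = tenant_id
    return tenant_doc


# Hypothetical mapping from tenant platforms to tenant identifications.
PLATFORM_TO_TENANT = {"platform-a": "001", "platform-b": "002"}

original = {"title": "Beijing group buying websites are long"}
tenant_doc = attach_tenant_id(original, "platform-a", PLATFORM_TO_TENANT)
```

Copying the document before adding the field keeps the tenant original document available unchanged,
which is convenient if it is also stored for offline processing.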
In one embodiment, the device 300 for building an index in a cloud search platform is implemented
offline, and the device 300 further includes a first storage unit, configured to, before the tenant
document is obtained: receive a tenant original document from the tenant platform through the service
interface, obtain the tenant identification according to the tenant platform, add the tenant
identification field row to the content of the tenant original document to generate the tenant
document, and store the tenant document in the cloud search platform.
In one embodiment, the device 300 for building an index in a cloud search platform further includes a
second storage unit, configured to, before the tenant document is obtained: receive the tenant
dictionary from the tenant platform through the service interface, obtain the tenant identification
according to the tenant platform, and store the tenant dictionary in the dictionary unit in association
with the tenant identification.
Fig. 4 shows a method of searching in a cloud search platform according to an embodiment of this
specification. The cloud search platform includes a search instance for multiple tenants. The search
instance includes a unified field definition table applicable to the multiple tenants and assigns each
of the multiple tenants a tenant identification that uniquely identifies the corresponding tenant. The
field definition table includes a description of a tenant identification field, and the tenant
identification field is associated with the tenant identification. The cloud search platform also
includes a unified service interface connected to the tenant platforms of the multiple tenants.
As shown in Fig. 4, in step S41, a search statement is received from a tenant platform. The cloud search
platform receives the tenant's search request from the tenant platform through the above unified service
interface. For example, the search request is the search statement "Beijing group buying".
In one embodiment, as shown in Fig. 1, the cloud search platform 101 includes a service agent unit 13,
which includes the service interface and is thereby connected to the tenant platform. The service agent
unit 13 receives the tenant's search request from the tenant platform through the above unified service
interface.
In step S42, the tenant identification of the tenant is obtained from the tenant platform. For example,
each tenant platform may include, in the request string it sends, a tenant identification parameter
corresponding to that tenant platform. For example, the tenant platform of tenant 001 sends the cloud
search platform a parameter such as type=user_001. The tenant identification "xxx" of the tenant, for
example "001", can thus be obtained from the parameter contained in the request string from the tenant
platform. In one embodiment, as shown in Fig. 1, the tenant identification "xxx" of the tenant is
obtained from the tenant platform by the service agent unit 13.
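Extracting the tenant identification from a request string can be sketched as follows. The sketch
assumes that the request string is URL-style query text and that the identification follows a `user_`
prefix in the `type` parameter, as in the type=user_001 example; the function name and the extra `q`
parameter are hypothetical.

```python
from urllib.parse import parse_qs


def extract_tenant_id(request_string, prefix="user_"):
    """Read the tenant identification from the 'type' parameter of a request string."""
    values = parse_qs(request_string).get("type")
    if not values or not values[0].startswith(prefix):
        raise ValueError("request string carries no tenant identification parameter")
    # Strip the assumed prefix so that 'user_001' becomes '001'.
    return values[0][len(prefix):]


tenant_id = extract_tenant_id("type=user_001&q=example")
```

Rejecting requests without a recognizable parameter, instead of falling back to a default tenant,
avoids searching the wrong tenant's documents by accident.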
In step S43, the tenant dictionary of the tenant is obtained by the tenant identification. As described
above, tenant dictionaries are stored in the cloud search platform in association with tenant
identifications, so the tenant dictionary associated with a given tenant identification can be retrieved
from the platform by that tenant identification.
In one embodiment, as shown in Fig. 1, the cloud search platform 101 further includes a dictionary unit
12. After obtaining the tenant identification "xxx" of the tenant from the tenant platform, the service
agent unit 13 sends the search statement and the tenant identification to the dictionary unit 12. The
dictionary unit 12 then uses the tenant identification to retrieve, from its own storage, the tenant
dictionary associated with that tenant identification.
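A dictionary unit that stores tenant dictionaries in association with tenant identifications and looks
them up on request might look like the following sketch. The class and method names are assumptions,
and an in-memory mapping stands in for whatever storage the platform actually uses.

```python
class DictionaryUnit:
    """Keeps one dictionary (a set of entries) per tenant identification."""

    def __init__(self):
        self._dictionaries = {}

    def store(self, tenant_id, entries):
        # Associate the tenant dictionary with its tenant identification.
        self._dictionaries[tenant_id] = frozenset(entries)

    def fetch(self, tenant_id):
        # Look the dictionary up by tenant identification only.
        if tenant_id not in self._dictionaries:
            raise KeyError(f"no dictionary stored for tenant {tenant_id!r}")
        return self._dictionaries[tenant_id]


unit = DictionaryUnit()
unit.store("001", ["group buying", "website"])
unit.store("002", ["Beijing group buying net", "head of a station"])
dictionary_001 = unit.fetch("001")
```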
In step S44, the search statement is segmented according to the tenant dictionary to obtain a segmented
statement corresponding to the search statement. For example, when the cloud search platform receives
the search statement "Beijing group buying" from the tenant platform of tenant 001, and the tenant
dictionary of tenant 001 includes the entries "group buying" and "website", segmenting the statement
according to the tenant dictionary of tenant 001 yields "Beijing" and "group buying". As another
example, when the cloud search platform receives the search statement "Beijing group buying" from the
tenant platform of tenant 002, and the tenant dictionary of tenant 002 includes the entry "Beijing group
buying net", segmenting the statement according to the tenant dictionary of tenant 002 still yields
"Beijing group buying", that is, the statement is kept as a single term.
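The specification does not name a segmentation algorithm. One simple technique that behaves as in the
example is greedy longest-match segmentation in which characters not covered by any dictionary entry are
grouped into a single term. The sketch below illustrates that technique under stated assumptions: the
unspaced Latin strings are hypothetical stand-ins for text written without word separators, and the
function name `segment` is likewise an assumption.

```python
def segment(text, dictionary):
    """Greedy longest-match segmentation; unmatched characters are grouped into one term."""
    longest = max((len(entry) for entry in dictionary), default=0)
    tokens, pending, pos = [], "", 0
    while pos < len(text):
        match = None
        # Try the longest candidate first so that longer entries win.
        for size in range(min(longest, len(text) - pos), 0, -1):
            candidate = text[pos:pos + size]
            if candidate in dictionary:
                match = candidate
                break
        if match is None:
            pending += text[pos]  # no entry starts here; extend the unmatched run
            pos += 1
            continue
        if pending:
            tokens.append(pending)  # flush the unmatched run as a single term
            pending = ""
        tokens.append(match)
        pos += len(match)
    if pending:
        tokens.append(pending)
    return tokens


# Hypothetical stand-ins for the two tenant dictionaries in the example.
DICT_001 = {"groupbuy", "netstation"}
DICT_002 = {"beijinggroupbuynet", "stationhead"}

query_001 = segment("beijinggroupbuy", DICT_001)  # ['beijing', 'groupbuy']
query_002 = segment("beijinggroupbuy", DICT_002)  # ['beijinggroupbuy']
```

The same statement is split into two terms for one tenant and left whole for the other, which is the
effect that per-tenant dictionaries are meant to provide.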
In one embodiment, as shown in Fig. 1, the search statement is segmented in the dictionary unit 12 by
means of the tenant dictionary to generate the segmented statement. The dictionary unit 12 then sends
the segmented statement and the tenant identification back to the service agent unit 13, and the
service agent unit 13 in turn sends the segmented statement and the tenant identification to the search
instance 11.
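The message flow between the service agent unit, the dictionary unit and the search instance can be
sketched as plain function calls. Everything below is illustrative: the function names are assumptions,
the two stubs are placeholders for the real units, and a whitespace split stands in for dictionary-based
segmentation.

```python
def handle_search(statement, tenant_id, dictionary_unit, search_instance):
    """Service agent role: route the statement through segmentation, then retrieval."""
    # Service agent -> dictionary unit: search statement plus tenant identification.
    segmented, returned_id = dictionary_unit(statement, tenant_id)
    # Service agent -> search instance: segmented statement plus tenant identification.
    return search_instance(segmented, returned_id)


def dictionary_unit_stub(statement, tenant_id):
    # Placeholder: a whitespace split stands in for dictionary-based segmentation.
    return statement.split(), tenant_id


def search_instance_stub(terms, tenant_id):
    # Placeholder: echo what a real search instance would receive.
    return {"tenant_id": tenant_id, "terms": terms}


result = handle_search("Beijing groupbuy", "001", dictionary_unit_stub, search_instance_stub)
```

Passing the tenant identification along with every message means the search instance never has to
infer which tenant a segmented statement belongs to.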
In step S45, the tenant identification field and the segmented statement are retrieved in the search
instance, so that the segmented statement is retrieved only within the tenant documents of the tenant.
For example, the documents of tenant 001 (that is, tenant ID=001) include a document titled "Beijing
group buying websites are long", and the tenant dictionary of tenant 001 includes the entries "group
buying" and "website". Segmenting the title according to the tenant dictionary of tenant 001 yields
"Beijing", "group buying", "website" and "long"; that is, the index entries corresponding to this
document of tenant 001 are "Beijing", "group buying", "website" and "long". When the search statement
"Beijing group buying" is received from the tenant platform of tenant 001, segmenting it with the tenant
dictionary of tenant 001 yields the segmented statement "Beijing" and "group buying". In the index, both
"Beijing" and "group buying" are associated with the document titled "Beijing group buying websites are
long", and the document is also associated in the index with "tenant ID=001" (that is, the document
belongs to tenant 001). In this case, retrieving the search statement "Beijing group buying" in the
search instance returns the document titled "Beijing group buying websites are long".
As another example, the documents of tenant 002 (that is, tenant ID=002) include a document titled
"Beijing group buying websites are long", and the tenant dictionary of tenant 002 includes the entries
"Beijing group buying net" and "head of a station". Segmenting the title according to the tenant
dictionary of tenant 002 yields "Beijing group buying net" and "head of a station"; that is, the index
entries corresponding to this document of tenant 002 are "Beijing group buying net" and "head of a
station". When the search statement "Beijing group buying" is received from the tenant platform of
tenant 002, segmenting it with the tenant dictionary of tenant 002 still yields the single term "Beijing
group buying". In the index, "Beijing group buying" is not associated with the document titled "Beijing
group buying websites are long". In this case, retrieving the search statement "Beijing group buying" in
the search instance does not return the document titled "Beijing group buying websites are long".
In one embodiment, as shown in Fig. 1, after retrieval has been performed in the search instance 11
according to the field row "tenant ID=xxx" and the segmented statement, the search instance 11 sends the
retrieval result, presented according to the field definition table, together with the tenant
identification to the service agent unit 13.
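Retrieval restricted by the tenant identification field can be sketched with a small inverted index.
This is an illustration under assumptions, not the search instance's actual design: the class and method
names are hypothetical, the index is an in-memory mapping, and all terms of the segmented statement are
required to match.

```python
from collections import defaultdict


class SearchInstance:
    """Toy inverted index keyed by (field, term), shared by all tenants."""

    def __init__(self):
        self._postings = defaultdict(set)  # (field, term) -> document ids
        self._titles = {}

    def index(self, doc_id, tenant_id, title, terms):
        self._titles[doc_id] = title
        # The tenant identification is indexed like any other field.
        self._postings[("tenant_id", tenant_id)].add(doc_id)
        for term in terms:
            self._postings[("title", term)].add(doc_id)

    def search(self, tenant_id, terms):
        if not terms:
            return []
        # Start from the tenant's own documents, then require every term (assumed AND).
        hits = set(self._postings.get(("tenant_id", tenant_id), set()))
        for term in terms:
            hits &= self._postings.get(("title", term), set())
        return sorted(self._titles[doc_id] for doc_id in hits)


TITLE = "Beijing group buying websites are long"
instance = SearchInstance()
instance.index("doc-1", "001", TITLE, ["Beijing", "group buying", "website", "long"])
instance.index("doc-2", "002", TITLE, ["Beijing group buying net", "head of a station"])

hits_001 = instance.search("001", ["Beijing", "group buying"])
hits_002 = instance.search("002", ["Beijing group buying"])
```

With these inputs, tenant 001 gets the document back and tenant 002 gets nothing, and a tenant can never
match another tenant's document because every query is intersected with its own tenant postings first.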
In step S46, the tenant platform is located according to the tenant identification. In the embodiments
of this specification, the cloud search platform is connected to the tenant platforms of multiple
tenants through the unified service interface, so when returning a result the platform can use the
tenant identification to locate the tenant platform that previously sent the search request.
In one embodiment, as shown in Fig. 1, the service agent unit 13 uses the tenant identification received
from the search instance 11 to locate the tenant platform that previously sent the search request.
Finally, in step S47, the retrieval result is returned to the tenant platform according to the field
definition table. For example, referring to the field definition table shown in Table 1 above, the
"engine field" column of the content (text) row is defined as "abstract field"; that is, when a search
result is returned, content is displayed in the abstract field. It can also be seen from the table that
cat_id (category ID) and origin_title (original title) are also displayed in the abstract.
In one embodiment, as shown in Fig. 1, after locating the tenant platform, the service agent unit 13
returns to the tenant platform the retrieval result that it received from the search instance 11 and
that is presented according to the field definition table.
Fig. 5 shows a device 500 for searching in a cloud search platform according to an embodiment of this
specification. The cloud search platform includes a search instance for multiple tenants. The search
instance includes a unified field definition table applicable to each of the multiple tenants and
assigns each of the multiple tenants a tenant identification that uniquely identifies the corresponding
tenant. The field definition table includes a description of a tenant identification field, and the
tenant identification field is associated with the tenant identification. The cloud search platform
further includes a unified service interface connected to the tenant platforms of the multiple tenants.
As shown in Fig. 5, the device 500 for searching in a cloud search platform is implemented by the cloud
search platform and includes the following units: a first receiving unit 51, configured to receive a
search statement from a tenant platform; a first acquisition unit 52, configured to obtain the tenant
identification of the tenant from the tenant platform; a second acquisition unit 53, configured to
obtain the tenant dictionary of the tenant by the tenant identification; a segmentation unit 54,
configured to segment the search statement according to the tenant dictionary, thereby obtaining a
segmented statement corresponding to the search statement; a retrieval unit 55, configured to retrieve
the tenant identification field and the segmented statement in the search instance, so that the
segmented statement is retrieved within the tenant documents of the tenant; a positioning unit 56,
configured to locate the tenant platform according to the tenant identification; and a returning unit
57, configured to return the retrieval result to the tenant platform according to the field definition
table.
In one embodiment, the cloud search platform further includes a dictionary unit that is separate from
the search instance and contains the respective tenant dictionaries of the multiple tenants, and the
device 500 for searching in a cloud search platform further includes a first sending unit, configured to
send the search statement and the tenant identification to the dictionary unit after the tenant
identification of the tenant has been obtained according to the tenant platform.
In one embodiment, in the device 500 for searching in a cloud search platform, segmenting the search
statement according to the tenant dictionary, thereby obtaining a segmented statement corresponding to
the search statement, includes: segmenting, in the dictionary unit, the search statement by means of the
tenant dictionary to generate the segmented statement; and receiving the segmented statement from the
dictionary unit.
In one embodiment, the device 500 for searching in a cloud search platform further includes a second
sending unit, configured to send the segmented statement and the tenant identification to the search
instance after the tenant identification is received from the dictionary unit.
The embodiments of this specification further include a computer-readable storage medium on which
instruction code is stored. When executed in a computer, the instruction code causes the computer to
perform the methods of building an index and of searching in a cloud search platform according to the
embodiments of this specification.
In the above methods and devices for building an index and searching in a cloud search platform
according to the embodiments of this specification, multiple tenants are handled by unified logic in a
single search instance, and the tenant dictionaries are separated out as a service external to the
search instance. This simplifies the overall architecture, reduces development cost, and can reduce the
storage space that the architecture requires. In addition, the scheme meets the need for per-tenant
custom dictionaries, and the number of supported tenants can grow linearly as computing resources such
as servers are added; that is, the system is "linearly scalable".
Those of ordinary skill in the art should further appreciate that the units and algorithm steps of the
examples described in connection with the embodiments disclosed herein can be implemented in electronic
hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability
of hardware and software, the composition and steps of each example have been described above generally
in terms of function. Whether these functions are executed in hardware or in software depends on the
specific application and design constraints of the technical solution. Those of ordinary skill in the
art may use different methods to implement the described functions for each specific application, but
such implementations should not be considered beyond the scope of the present application.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein can
be implemented in hardware, in a software module executed by a processor, or in a combination of the
two. The software module may reside in random access memory (RAM), internal memory, read-only memory
(ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a
removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The specific embodiments described above further explain in detail the purpose, technical solutions and
beneficial effects of the present invention. It should be understood that the foregoing is merely
specific embodiments of the present invention and is not intended to limit its scope of protection. Any
modification, equivalent replacement, improvement and the like made within the spirit and principles of
the present invention shall fall within the scope of protection of the present invention.