CN108304412A - Cross-language search method and apparatus, and device for cross-language search - Google Patents
Cross-language search method and apparatus, and device for cross-language search
- Publication number
- CN108304412A (application number CN201710025472.6A)
- Authority
- CN
- China
- Prior art keywords
- search result
- translation
- default
- exposition
- translation model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Machine Translation (AREA)
Abstract
Embodiments of the present invention provide a cross-language search method and apparatus, and a device for cross-language search. The method specifically includes: obtaining a search term in a first language; obtaining search results in a second language according to the search term; and, for each search result in the second language, performing the following steps: determining a target translation model corresponding to each preset display part of the search result; obtaining, using the target translation model, the translated search result corresponding to each preset display part of the search result; and displaying to the user the translated search result corresponding to each preset display part of the search result. Embodiments of the present invention can improve the accuracy of translated search results.
Description
Technical field
The present invention relates to the field of information search technology, and in particular to a cross-language search method and apparatus, and a device for cross-language search.
Background
With the continuous growth of Internet information, people place higher demands on information search: they are no longer satisfied with searching a database in a single language and want to obtain data in multiple languages. For example, if the search term (query) input by the user is the Chinese term for "Donald Trump", a search in a Chinese database may not satisfy the user's needs to the greatest extent, because an English database built from European and American websites may contain better and more numerous search results.
Cross-language search technology combines information retrieval with machine translation. An existing cross-language search scheme may be implemented as follows: first, the source-language search term is converted into a target-language search term by machine translation; then, information retrieval is carried out in the corresponding monolingual databases according to the source-language search term and the target-language search term respectively, to obtain multilingual search results. The multilingual search results may include search results in the source language and search results in the target language.
To meet the needs of users who cannot read the target language, or whose ability to read it is limited, existing schemes can use a translation model to translate the target-language search results and obtain translated search results in the source language.
In implementing the embodiments of the present invention, the inventor found that existing schemes have at least the following problem: they generally use a general-purpose translation model to translate target-language search results, and the limitations of such a model tend to reduce the accuracy of the translated search results. In other words, the translated search results obtained by existing schemes have relatively low accuracy.
Summary of the invention
In view of the above problems, embodiments of the present invention are proposed to provide a cross-language search method, a cross-language search apparatus, and a device for cross-language search that overcome the above problems or at least partially solve them. Embodiments of the present invention can improve the accuracy of translated search results.
To solve the above problems, the present invention discloses a cross-language search method, including:
obtaining a search term in a first language;
obtaining search results in a second language according to the search term;
for each search result in the second language, performing the following steps:
determining a target translation model corresponding to each preset display part of the search result;
obtaining, using the target translation model, the translated search result corresponding to each preset display part of the search result;
displaying to the user the translated search result corresponding to each preset display part of the search result.
Optionally, the step of determining a target translation model corresponding to each preset display part of the search result includes:
determining the display type corresponding to each preset display part included in the search result;
obtaining, according to the display type, the target translation model corresponding to each preset display part.
Optionally, if the display type corresponding to the preset display part is the title class, obtaining the target translation model corresponding to each preset display part includes: obtaining a title translation model, the title translation model being trained on a title corpus;
and/or
if the display type corresponding to the preset display part is the abstract class, obtaining the target translation model corresponding to each preset display part includes: obtaining an abstract translation model, the abstract translation model being trained on an abstract corpus;
and/or
if the display type corresponding to the preset display part is the page content class, obtaining the target translation model corresponding to each preset display part includes: obtaining a page content translation model, the page content translation model being trained on a preset page content corpus.
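The selection of a translation model by display type can be pictured as a simple lookup. The following is a minimal sketch, assuming three display types named `title`, `abstract`, and `page_content`; the stub translators and all identifiers are hypothetical stand-ins for models trained on the corresponding corpora, not part of the patent text.

```python
from typing import Callable, Dict

# A translator maps second-language text to first-language text.
Translator = Callable[[str], str]


def make_stub(tag: str) -> Translator:
    """Return a placeholder translator that only labels its input (assumption)."""
    return lambda text: f"[{tag}] {text}"


# Hypothetical registry: one model per display type.
MODELS: Dict[str, Translator] = {
    "title": make_stub("title-model"),
    "abstract": make_stub("abstract-model"),
    "page_content": make_stub("page-content-model"),
}


def select_model(display_type: str) -> Translator:
    """Look up the target translation model for a given display type."""
    try:
        return MODELS[display_type]
    except KeyError:
        raise ValueError(f"no translation model registered for {display_type!r}")
```

A registry keyed by display type keeps the choice of model separate from the translation call itself, so further display types could be added without touching the caller.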
Optionally, if the preset display part is the title part, the step of obtaining, using the target translation model, the translated search result corresponding to each preset display part of the search result includes:
identifying the preset symbols contained in the title part;
dividing the title part into multiple semantic units according to the preset symbols;
translating each semantic unit obtained by the division using the first target translation model corresponding to the title part, to obtain the translation result corresponding to each semantic unit;
combining the translation results corresponding to the semantic units according to the preset symbols, to obtain the first translated search result corresponding to the title part; the first translated search result includes the preset symbols.
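The split, translate, and recombine procedure for titles can be sketched as follows. The set of preset symbols is not given in this passage, so the separator pattern (a `|`, `_`, or `-` surrounded by whitespace) is an assumption, as are the function names; `translate_unit` stands in for the first target translation model.

```python
import re
from typing import Callable

# Assumed preset symbols: "|", "_" or "-" with whitespace on both sides.
# The capturing group makes re.split keep the separators in its output.
SEPARATOR = re.compile(r"(\s[|_-]\s)")


def translate_title(title: str, translate_unit: Callable[[str], str]) -> str:
    """Translate each semantic unit of a title and keep the preset symbols."""
    parts = SEPARATOR.split(title)
    out = []
    for index, part in enumerate(parts):
        # Even positions hold semantic units, odd positions hold separators.
        if index % 2 == 0 and part:
            out.append(translate_unit(part))
        else:
            out.append(part)
    return "".join(out)
```

Requiring whitespace around the symbol avoids splitting hyphenated words such as "cross-language"; a real deployment would tune this rule to the titles it sees.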
Optionally, the step of translating each semantic unit obtained by the division using the first target translation model corresponding to the title part includes:
inputting each semantic unit together with its corresponding context into the first target translation model, to obtain the translation result for each semantic unit output by the first target translation model.
Optionally, if the preset display part is the abstract part, the step of obtaining, using the target translation model, the translated search result corresponding to each preset display part of the search result includes:
extracting, from the abstract part, the target content located at a preset position;
translating the target content using the second target translation model corresponding to the preset position, to obtain the corresponding second translated search result.
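Position-specific translation of the abstract can be sketched as below. This passage does not say which positions are preset, so the example assumes one hypothetical position, a leading date such as "Jan 5, 2017 - ", with the rest treated as the body; the names `leading_date`, `body`, and both functions are illustrative only.

```python
import re
from typing import Callable, Dict

# Assumed preset position: a leading "Mon D, YYYY - " date prefix.
LEADING_DATE = re.compile(
    r"^(?P<date>[A-Z][a-z]{2} \d{1,2}, \d{4})\s-\s(?P<body>.*)$", re.S
)


def split_by_position(abstract: str) -> Dict[str, str]:
    """Extract the content found at each assumed preset position."""
    match = LEADING_DATE.match(abstract)
    if match:
        return {"leading_date": match.group("date"), "body": match.group("body")}
    return {"body": abstract}


def translate_abstract(
    abstract: str, models: Dict[str, Callable[[str], str]]
) -> Dict[str, str]:
    """Translate each extracted piece with the model registered for its position."""
    pieces = split_by_position(abstract)
    return {position: models[position](text) for position, text in pieces.items()}
```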
Optionally, the method further includes: determining the target category to which the search result belongs;
obtaining, according to the display type, the target translation model corresponding to each preset display part then includes:
obtaining the target translation model corresponding to each preset display part by combining the target category to which the search result belongs with the display type corresponding to each preset display part.
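Combining the category with the display type amounts to a two-part lookup key. A minimal sketch follows, in which the category names, the model identifiers, and the fallback to a category-independent model are all assumptions made for illustration; the text does not describe what happens when no category-specific model exists.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table: (category, display type) -> model identifier.
# A key whose category is None holds the assumed category-independent default.
MODEL_TABLE: Dict[Tuple[Optional[str], str], str] = {
    ("news", "title"): "news-title-model",
    ("news", "abstract"): "news-abstract-model",
    (None, "title"): "generic-title-model",
    (None, "abstract"): "generic-abstract-model",
}


def pick_model(category: Optional[str], display_type: str) -> str:
    """Prefer a model specific to the category, else use the per-type default."""
    key = (category, display_type)
    if key in MODEL_TABLE:
        return MODEL_TABLE[key]
    return MODEL_TABLE[(None, display_type)]
```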
Optionally, the step of determining the target category to which the search result belongs includes:
matching the content included in the search result against the dictionary of each preset category, to obtain the matching rate corresponding to each preset category;
taking the preset category with the highest matching rate among all preset categories as the target category to which the search result belongs.
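Dictionary-based category selection can be sketched as follows. The text does not define how the matching rate is computed, so this example assumes it is the fraction of whitespace-separated tokens found in a category's dictionary; the dictionaries and function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def match_rate(tokens: List[str], dictionary: Set[str]) -> float:
    """Assumed definition: share of tokens that appear in the dictionary."""
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in dictionary)
    return hits / len(tokens)


def classify(
    content: str, dictionaries: Dict[str, Set[str]]
) -> Tuple[str, Dict[str, float]]:
    """Return the category with the highest matching rate, plus all rates."""
    tokens = content.lower().split()
    rates = {name: match_rate(tokens, words) for name, words in dictionaries.items()}
    best = max(rates, key=lambda name: rates[name])
    return best, rates
```

When several categories tie, `max` returns the first one in dictionary order; the text does not say how ties should be resolved.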
Optionally, the step of determining the target category to which the search result belongs includes:
inputting the content included in the search result into a classifier, and taking the classification result output by the classifier as the target category to which the search result belongs; wherein the classifier is trained on search result samples of each preset category.
In another aspect, the present invention discloses a cross-language search apparatus, including:
a search term acquisition module, configured to obtain a search term in a first language;
a search result acquisition module, configured to obtain search results in a second language according to the search term;
a search result processing module, configured to process each search result in the second language;
the search result processing module includes: a translation model determining module, a translated search result acquisition module, and a translated search result display module;
the translation model determining module is configured to determine, for each search result in the second language, a target translation model corresponding to each preset display part of the search result;
the translated search result acquisition module is configured to obtain, using the target translation model, the translated search result corresponding to each preset display part of the search result; and
the translated search result display module is configured to display to the user the translated search result corresponding to each preset display part of the search result.
Optionally, the translation model determining module includes: a display type determining submodule and a translation model acquisition submodule;
wherein the display type determining submodule is configured to determine the display type corresponding to each preset display part included in the search result;
the translation model acquisition submodule is configured to obtain, according to the display type, the target translation model corresponding to each preset display part.
Optionally, if the display type corresponding to the preset display part is the title class, the translation model acquisition submodule includes: a first translation model acquiring unit;
the first translation model acquiring unit is configured to obtain a title translation model, the title translation model being trained on a title corpus;
and/or
if the display type corresponding to the preset display part is the abstract class, the translation model acquisition submodule includes: a second translation model acquiring unit;
the second translation model acquiring unit is configured to obtain an abstract translation model, the abstract translation model being trained on an abstract corpus;
and/or
if the display type corresponding to the preset display part is the page content class, the translation model acquisition submodule includes: a third translation model acquiring unit;
the third translation model acquiring unit is configured to obtain a page content translation model, the page content translation model being trained on a preset page content corpus.
Optionally, if the preset display part is the title part, the translated search result acquisition module includes: an identification submodule, a segmentation submodule, a first translation submodule, and a combination submodule;
wherein the identification submodule is configured to identify the preset symbols contained in the title part;
the segmentation submodule is configured to divide the title part into multiple semantic units according to the preset symbols;
the first translation submodule is configured to translate each semantic unit obtained by the division using the first target translation model corresponding to the title part, to obtain the translation result corresponding to each semantic unit;
the combination submodule is configured to combine the translation results corresponding to the semantic units according to the preset symbols, to obtain the first translated search result corresponding to the title part; the first translated search result includes the preset symbols.
Optionally, the first translation submodule includes: a translation unit;
the translation unit is configured to input each semantic unit together with its corresponding context into the first target translation model, to obtain the translation result for each semantic unit output by the first target translation model.
Optionally, if the preset display part is the abstract part, the translated search result acquisition module includes: an extraction submodule and a second translation submodule;
the extraction submodule is configured to extract, from the abstract part, the target content located at a preset position;
the second translation submodule is configured to translate the target content using the second target translation model corresponding to the preset position, to obtain the corresponding second translated search result.
Optionally, the apparatus further includes: a category determination module;
the category determination module is configured to determine the target category to which the search result belongs;
the translation model acquisition submodule includes: a model acquiring unit;
the model acquiring unit is configured to obtain the target translation model corresponding to each preset display part by combining the target category to which the search result belongs with the display type corresponding to each preset display part.
Optionally, the category determination module includes: a matching submodule and a determination submodule;
the matching submodule is configured to match the content included in the search result against the dictionary of each preset category, to obtain the matching rate corresponding to each preset category;
the determination submodule is configured to take the preset category with the highest matching rate among all preset categories as the target category to which the search result belongs.
Optionally, the category determination module includes: a classification submodule;
the classification submodule is configured to input the content included in the search result into a classifier, and to take the classification result output by the classifier as the target category to which the search result belongs; wherein the classifier is trained on search result samples of each preset category.
In yet another aspect, the present invention discloses a device for cross-language search, including a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
obtaining a search term in a first language;
obtaining search results in a second language according to the search term;
for each search result in the second language, performing the following steps:
determining a target translation model corresponding to each preset display part of the search result;
obtaining, using the target translation model, the translated search result corresponding to each preset display part of the search result;
displaying to the user the translated search result corresponding to each preset display part of the search result.
Embodiments of the present invention have the following advantages:
When translating the second-language search results of a cross-language search, embodiments of the present invention first determine a target translation model corresponding to each preset display part of the search result, and then use that target translation model to obtain the translated search result corresponding to each preset display part. The target translation model can thus be a translation model adapted to each preset display part; that is, it can translate from the second language into the first language according to the characteristics of each preset display part, which improves the accuracy of the translated search results.
Brief description of the drawings
Fig. 1 is a schematic diagram of an application environment of a cross-language search method of the present invention;
Fig. 2 is a flow chart of the steps of embodiment one of a cross-language search method of the present invention;
Fig. 3 is a structural block diagram of an embodiment of a cross-language search apparatus of the present invention;
Fig. 4 is a block diagram of a device 900 for cross-language search of the present invention when implemented as a terminal; and
Fig. 5 is a schematic structural diagram of a device for cross-language search of the present invention when implemented as a server.
Detailed description
To make the above objects, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
In the embodiments of the present invention, machine translation is regarded as an information transmission process and is explained with a channel model. Under this view, translating a source-language sentence into a target-language sentence is a probability problem: any target-language sentence may be the translation of any source-language sentence, only with a different probability, and the task of machine translation is to find the sentence with the highest probability. Concretely, translation is treated as a decoding process in which the original text is converted into the translation through the model. Translation modelling can therefore be divided into three problems: the model problem, the training problem, and the decoding problem. The model problem is to establish a translation model that describes probabilities for machine translation, that is, to define how the translation probability from a source-language sentence to a target-language sentence is computed. The training problem is to obtain all parameters of this model from a corpus. The decoding problem is, given a known translation model and its parameters, to search for the translation with the highest probability for any input source-language sentence.
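The channel-model view is commonly written in the standard noisy-channel form. The text does not state the formula itself, so the following is the textbook formulation rather than one taken from the patent, with $s$ denoting the source-language sentence and $t$ a candidate target-language sentence (both symbols are introduced here for illustration):

```latex
\hat{t} \;=\; \arg\max_{t} P(t \mid s)
        \;=\; \arg\max_{t} \frac{P(s \mid t)\, P(t)}{P(s)}
        \;=\; \arg\max_{t} P(s \mid t)\, P(t)
```

In these terms, the model problem is the definition of the probabilities, the training problem is the estimation of their parameters from a corpus, and the decoding problem is the search for $\hat{t}$. The denominator $P(s)$ can be dropped because it does not depend on $t$.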
In implementing the embodiments of the present invention, the inventor found that existing schemes generally use a general-purpose translation model to translate target-language search results, and such a model produces the same translated search result whenever the input text is the same. However, different types of search results usually have their own characteristics, so translating all types of search results with a general-purpose translation model tends to reduce the accuracy of the translated search results. In other words, the translated search results obtained by existing schemes have relatively low accuracy.
To address the technical problem of low accuracy of translated search results in existing schemes, embodiments of the present invention provide a cross-language search scheme. The scheme can obtain a search term in a first language; obtain search results in a second language according to the search term; for each search result in the second language, determine a target translation model corresponding to each preset display part of the search result; obtain, using the target translation model, the translated search result corresponding to each preset display part of the search result; and then display those translated search results to the user. Because the embodiments first determine a target translation model corresponding to each preset display part of the search result and then use it to obtain the corresponding translated search result, the target translation model can be a translation model adapted to each preset display part. That is, it can translate from the second language into the first language according to the characteristics of each preset display part, which improves the accuracy of the translated search results.
In the embodiments of the present invention, the first-language search term may first be translated into a second-language search term, and retrieval is then performed in a second-language database according to the second-language search term, to obtain second-language search results. A second-language search result therefore denotes a search result corresponding to the second-language search term, and a translated search result denotes the first-language result obtained by translating a second-language search result. A second-language search result and its first-language translated search result may correspond to the same underlying result (such as a web page, video, picture, or music); one difference between the two is the language form.
In one application example of the present invention, if the first-language search term is the Chinese term for "Donald Trump", the corresponding second-language search term is "Trump". Retrieval can then be performed in an English database according to "Trump" to obtain English search results, and each preset display part is translated using the target translation model corresponding to that preset display part of the search result, to obtain the corresponding translated search result.
Embodiments of the present invention can be applied in platform environments that have a cross-language search function, such as a search APP or a search website (for example, a search engine). They can provide the user not only with search results from multilingual databases but also with more accurate translated search results, to meet the needs of users who cannot read the target language or whose ability to read it is limited. The cross-language search method is described here mainly using a search APP as the example; for the method on other platforms such as search websites, the descriptions can be cross-referenced.
The cross-language search method provided by the embodiments of the present invention can be applied in the application environment shown in Fig. 1. As shown in Fig. 1, client 100 and server 200 are located in a wired or wireless network, through which client 100 and server 200 exchange data.
The cross-language search flow of the embodiments of the present invention can be executed by either client 100 or server 200, or by the two in combination:
For example, client 100 can receive the first-language search term input by the user and send it to server 200. After receiving the first-language search term, server 200 can obtain second-language search results according to the search term; for each second-language search result, determine a target translation model corresponding to each preset display part of the search result; obtain, using the target translation model, the translated search result corresponding to each preset display part; and send those translated search results to client 100, so that client 100 displays to the user the translated search result corresponding to each preset display part of the search result.
When the acquisition of the second-language search results and/or the translated search results is performed by server 200, the server's abundant computing resources can be exploited to improve both the efficiency and the accuracy of that acquisition. For example, a cloud server may be deployed with numerous highly configured computing devices; using these devices to obtain the second-language search results and/or the translated search results improves the efficiency and accuracy of the acquisition. At the same time, it saves computing resources on the client 100 side and improves the performance of the intelligent terminal on which client 100 runs.
Of course, the acquisition of the second-language search results and/or the translated search results can also be performed by client 100. The embodiments of the present invention do not limit which entity performs this acquisition.
Optionally, client 100 may run on an intelligent terminal. Such intelligent terminals include but are not limited to: smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, in-vehicle computers, desktop computers, set-top boxes, smart televisions, wearable devices, and so on.
Method embodiment one
Referring to Fig. 2, a flow chart of the steps of embodiment one of a cross-language search method of the present invention is shown, which may specifically include the following steps:
Step 201: obtain a search term in a first language;
Step 202: obtain search results in a second language according to the search term;
For each search result in the second language, perform the following steps:
Step 203: determine a target translation model corresponding to each preset display part of the search result;
Step 204: obtain, using the target translation model, the translated search result corresponding to each preset display part of the search result;
Step 205: display to the user the translated search result corresponding to each preset display part of the search result.
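Steps 201 to 205 can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the query translator, the search backend, and the per-type models are passed in as plain functions, and `SearchResult`, `cross_language_search`, and the display type names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Translator = Callable[[str], str]


@dataclass
class SearchResult:
    # Maps a display type (e.g. "title", "abstract") to second-language text.
    parts: Dict[str, str]


def cross_language_search(
    query: str,
    translate_query: Translator,
    search: Callable[[str], List[SearchResult]],
    models: Dict[str, Translator],
) -> List[Dict[str, str]]:
    # Step 201: the first-language search term arrives as `query`.
    # Step 202: translate the term, then retrieve second-language results.
    results = search(translate_query(query))
    translated: List[Dict[str, str]] = []
    for result in results:
        shown: Dict[str, str] = {}
        for display_type, text in result.parts.items():
            model = models[display_type]       # step 203: choose the model
            shown[display_type] = model(text)  # step 204: translate the part
        translated.append(shown)
    # Step 205: the caller displays the returned parts to the user.
    return translated
```

Passing the collaborators in as functions keeps the sketch independent of whether the work runs on the client, the server, or both.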
In the embodiments of the present invention, the first-language search term may be input by the user in the first language. In practice, the client of a search APP or search website can provide a UI (user interface), and the user can submit the first-language search term to the client through the search box, a voice interface, or other means on that UI. Whichever way the user submits it, the client can display the received first-language search term in the search box. Therefore, in the embodiments of the present invention, the first-language search term input by the user may include a first-language search term submitted to the client by any means. It can be understood that the embodiments of the present invention do not limit the specific way in which the first-language search term is obtained.
In the embodiments of the present invention, the first language and the second language denote two different languages. They can be preset by the user, or they can be obtained by the search APP or search website by analysing the user's search behavior and/or browsing behavior. Optionally, the search APP or search website can take the language the user uses most often as the first language and another language the user has used as the second language. For example, if the user's search behavior shows that the search terms previously used are Chinese, the source language (the first language) can be determined to be Chinese; if the user's browsing behavior also shows that the user visited a translation website and translated between Chinese and English there, the second language can be determined to be English. It can be understood that there may be one or more second languages. For example, for a user whose native language is Chinese, the first language can be Chinese and the second language can be one or a combination of English, Japanese, Korean, German, and French. The cross-language search method is described here mainly with Chinese as the first language and English as the second language; for other first and second languages, the descriptions can be cross-referenced.
In practical applications, in step 202 the client or server can translate the first-language search term into a second-language search term, and then retrieve from a second-language database according to the second-language search term to obtain second-language search results. Taking English as the second language as an example, the English database can store data from European and American websites. It can be understood that embodiments of the present invention do not limit the specific manner of obtaining the second-language search results according to the search term.
Optionally, translating the first-language search term into a second-language search term may yield several different translation results. In that case, the translation result with the highest confidence can be selected from them, and the second-language search results can be obtained by searching with that highest-confidence translation result. Alternatively, searches can be run separately with one or more of the different translation results, and the results of those searches taken as the second-language search results. In one application example of the present invention, if the first-language search term is the Chinese term for "Donald Trump", the second-language search term can be "Trump".
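The selection among candidate translations described above can be sketched as follows. This is a minimal illustration only: the candidate list, the confidence scores, and the function name `pick_query_translations` are assumptions made for the example and do not come from the embodiment.

```python
from typing import List, Tuple


def pick_query_translations(
    candidates: List[Tuple[str, float]], top_k: int = 1
) -> List[str]:
    """Return the top_k candidate translations, highest confidence first."""
    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
    return [text for text, _confidence in ranked[:top_k]]


# Hypothetical candidates for one first-language search term.
candidates = [("trump card", 0.41), ("Trump", 0.92), ("Donald", 0.17)]

# Option 1: search only with the highest-confidence translation.
best = pick_query_translations(candidates)

# Option 2: search with several translations and merge the results later.
several = pick_query_translations(candidates, top_k=2)
```

With `top_k=1` the sketch corresponds to searching with the highest-confidence translation only; a larger `top_k` corresponds to searching with several translation results.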
In step 203, for each search result obtained in step 202, the target translation model corresponding to each preset display portion of the search result can be determined. In step 204, the target translation model obtained in step 203 can be used to obtain the translated search result corresponding to each preset display portion of the search result. In this way, the target translation model can be a translation model suited to each preset display portion; that is, it can translate from the second language into the first language according to the characteristics of each preset display portion, which can improve the accuracy of the translated search results.
In an optional embodiment of the present invention, the step of determining the target translation model corresponding to each preset display portion of the search result may include: determining the display type corresponding to each preset display portion included in the search result; and obtaining, according to the display type, the target translation model corresponding to each preset display portion. The display type reflects the characteristics of the preset display portion, so a target translation model that fits each preset display portion can be obtained from its display type, which in turn can improve the accuracy of the translated search results.
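The two sub-steps above (determine the display type, then obtain the model for that type) amount to a lookup. The sketch below shows one possible shape under stated assumptions: the portion names, the type names, and the stub models that merely tag their input are all hypothetical placeholders for trained translation models.

```python
from typing import Callable, Dict

Translator = Callable[[str], str]


def make_stub_model(tag: str) -> Translator:
    """Build a placeholder translator that only tags its input (hypothetical)."""
    return lambda text: f"[{tag}] {text}"


# Hypothetical mapping: preset display portion -> display type.
DISPLAY_TYPE_OF_PORTION: Dict[str, str] = {
    "title_portion": "title",
    "abstract_portion": "abstract",
    "page_content_portion": "page_content",
}

# Hypothetical registry: display type -> target translation model.
MODELS_BY_DISPLAY_TYPE: Dict[str, Translator] = {
    "title": make_stub_model("title-model"),
    "abstract": make_stub_model("abstract-model"),
    "page_content": make_stub_model("page-content-model"),
}


def translate_search_result(result: Dict[str, str]) -> Dict[str, str]:
    """Translate each preset display portion with the model for its display type."""
    translated = {}
    for portion, text in result.items():
        display_type = DISPLAY_TYPE_OF_PORTION[portion]
        target_model = MODELS_BY_DISPLAY_TYPE[display_type]
        translated[portion] = target_model(text)
    return translated


result = {
    "title_portion": "Donald Trump - Wikipedia",
    "abstract_portion": "The Kremlin on Wednesday denied the report.",
}
translated = translate_search_result(result)
```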
In embodiments of the present invention, a preset display portion denotes display content preset for a search result. For the preset display portions that a search result includes and their corresponding display types, embodiments of the present invention can provide the following acquisition schemes for obtaining the target translation model corresponding to each preset display portion:
Acquisition scheme 1
In acquisition scheme 1, the preset display portion may include a title portion; the display type corresponding to the title portion may include a title type; the target translation model corresponding to the title portion may then include a title translation model, which can be trained on a title corpus.
The title portion of a search result usually has characteristics of its own: for example, it usually takes the form of a short sentence, phrase, or word group, and it often contains special preset symbols such as "-", "|", and "...". Embodiments of the present invention can therefore obtain a title corpus from search results in advance. Optionally, the title corpus can be a bilingual corpus or an aligned corpus (that is, one in which words that are translations of each other in a bilingual sentence pair are matched). The title translation model is then trained on the title corpus. Because the title corpus shares the characteristics of the title portion, a title translation model trained on it can take into account the short-sentence, phrase, or word-group form and the presence of preset symbols, and can therefore produce more accurate translated search results for the title portion.
Acquisition scheme 2
In acquisition scheme 2, the preset display portion may include an abstract portion; the display type corresponding to the abstract portion may include an abstract type; the target translation model corresponding to the abstract portion may then include an abstract translation model, which can be trained on an abstract corpus.
The abstract portion of a search result usually has characteristics of its own: for example, it usually takes the form of long sentences, or it contains a particular type of content at a particular position (relatively fixed content, such as a time or an information source, appears at the beginning of the abstract). Embodiments of the present invention can therefore obtain an abstract corpus from search results in advance; optionally, the abstract corpus can be a bilingual corpus or an aligned corpus. The abstract translation model is then trained on the abstract corpus. Because the abstract corpus shares the characteristics of the abstract portion, an abstract translation model trained on it can take into account the long-sentence form and the presence of a particular type of content at a particular position, and can therefore produce more accurate translated search results for the abstract portion.
Acquisition scheme 3
In acquisition scheme 3, the preset display portion may include a page content portion; the display type corresponding to the page content portion may include a page content type; the target translation model corresponding to the page content portion may then include a page content translation model, which is trained on a preset page content corpus.
Besides the title portion and the abstract portion, some websites also place a page content portion in the search result, so that the user can obtain more precise information about the website from it. For example, an e-commerce website can place a page content portion in the search result to show promotional campaigns and thereby attract users' attention. As another example, a news website can place a page content portion in the search result to show hot news events and thereby attract users' attention.
The page content portion set by a website generally has characteristics related to the website itself: the page content of an e-commerce website is usually related to goods, and the page content of a news website is usually related to news. Embodiments of the present invention can therefore obtain a preset page content corpus in advance, where the preset page content corpus is taken from search results; optionally, it can be a bilingual corpus or an aligned corpus. The page content translation model is then trained on the preset page content corpus. Because the preset page content corpus shares the characteristics of the page content portion, a page content translation model trained on it can take those characteristics into account, and can therefore produce more accurate translated search results for the page content portion.
Acquisition schemes 1 to 3 above describe in detail the process of obtaining the target translation model corresponding to each preset display portion. It can be understood that those skilled in the art can, according to practical application requirements, use any one or a combination of acquisition schemes 1 to 3, or use other acquisition schemes for other preset display portions. Embodiments of the present invention do not limit the specific process of obtaining the target translation model corresponding to a preset display portion.
In an optional embodiment of the present invention, the title translation model, abstract translation model, and page content translation model of acquisition schemes 1 to 3 can also be optimized according to the target category to which the search result belongs, to further improve the accuracy of the translated search results. Correspondingly, the method can also include: determining the target category to which the search result belongs. The aforementioned step of obtaining, according to the display type corresponding to each preset display portion, the target translation model corresponding to each preset display portion may then include: obtaining the target translation model corresponding to each preset display portion by combining the target category to which the search result belongs with the display type corresponding to each preset display portion.
Specifically, obtaining the target translation model corresponding to each preset display portion by combining the target category to which the search result belongs with the display type corresponding to each preset display portion may include: if the preset display portion is a title portion and the corresponding display type is the title type, the title translation model corresponding to the title portion may include the title translation model corresponding to the target category, where the title translation model corresponding to the target category is trained on the title corpus within the target category;
and/or
if the preset display portion is an abstract portion and the corresponding display type is the abstract type, the abstract translation model corresponding to the abstract portion may include the abstract translation model corresponding to the target category, where the abstract translation model corresponding to the target category is trained on the abstract corpus within the target category;
and/or
if the preset display portion is a page content portion and the corresponding display type is the page content type, the page content translation model corresponding to the page content portion may include the page content translation model corresponding to the target category, where the page content translation model corresponding to the target category is trained on the preset page content corpus within the target category.
Optionally, the target category may include e-commerce, forum, news, novel, video, and so on. The title corpus, abstract corpus, and preset page content corpus within a target category can then be collected from the search results of that target category.
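Combining the target category with the display type can be pictured as a lookup keyed on both values. The sketch below is one hedged way to express it; the registry contents and stub models are hypothetical, and the fallback to a category-independent model when no category-specific model exists is an assumption of this sketch that the embodiment does not state.

```python
from typing import Callable, Dict, Tuple

Translator = Callable[[str], str]


def stub(tag: str) -> Translator:
    """Placeholder for a trained translation model (hypothetical)."""
    return lambda text: f"[{tag}] {text}"


# Category-independent models from acquisition schemes 1 to 3 (stubs).
GENERIC_MODELS: Dict[str, Translator] = {
    "title": stub("title"),
    "abstract": stub("abstract"),
    "page_content": stub("page_content"),
}

# Models trained on corpora within one target category (stubs).
CATEGORY_MODELS: Dict[Tuple[str, str], Translator] = {
    ("news", "title"): stub("news/title"),
    ("news", "abstract"): stub("news/abstract"),
    ("forum", "abstract"): stub("forum/abstract"),
}


def get_target_model(category: str, display_type: str) -> Translator:
    """Look up the model for (target category, display type)."""
    model = CATEGORY_MODELS.get((category, display_type))
    if model is None:
        # Assumption of this sketch: fall back to the generic model.
        model = GENERIC_MODELS[display_type]
    return model


news_title = get_target_model("news", "title")("Kremlin denies report")
video_title = get_target_model("video", "title")("Funny cats")
```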
In an optional embodiment of the present invention, the step of determining the target category to which the search result belongs may include: matching the content included in the search result against the dictionary of each preset category to obtain the matching rate corresponding to each preset category; and taking the preset category with the largest matching rate among all preset categories as the target category to which the search result belongs. Here, the content included in the search result can be the content of the web page corresponding to the search result (that is, the web page content), or the content included in the preset display portions of the search result.
Optionally, the process of obtaining the matching rate may include: segmenting the preset display portions and/or web page content included in the search result into words, counting the number N of all words and the number M of words that appear in the dictionary of the preset category, and taking the ratio of M to N as the matching rate. It can be understood that embodiments of the present invention do not limit the specific manner of obtaining the matching rate.
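The matching rate M/N and the choice of the category with the largest rate can be sketched as follows. The category dictionaries are toy data invented for the example, and splitting on whitespace stands in for a real word segmenter; both are assumptions.

```python
from typing import Dict, List, Set, Tuple


def matching_rate(tokens: List[str], dictionary: Set[str]) -> float:
    """Ratio M / N: tokens found in the category dictionary over all tokens."""
    if not tokens:
        return 0.0
    matched = sum(1 for token in tokens if token in dictionary)  # M
    return matched / len(tokens)                                 # M / N


def classify_by_dictionary(
    text: str, dictionaries: Dict[str, Set[str]]
) -> Tuple[str, Dict[str, float]]:
    """Return the category with the largest matching rate, plus all rates."""
    tokens = text.lower().split()  # stand-in for real word segmentation
    rates = {name: matching_rate(tokens, words) for name, words in dictionaries.items()}
    best_category = max(rates, key=lambda name: rates[name])
    return best_category, rates


# Hypothetical dictionaries for two preset categories.
dictionaries = {
    "news": {"news", "report", "election", "minister"},
    "e-commerce": {"price", "discount", "cart", "shipping"},
}

category, rates = classify_by_dictionary(
    "Breaking news report election discount", dictionaries
)
```

In this toy input N is 5, three tokens match the news dictionary and one matches the e-commerce dictionary, so the news category has the larger rate.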
In an optional embodiment of the present invention, the step of determining the target category to which the search result belongs may include: inputting the content included in the search result into a classifier, and taking the classification result output by the classifier as the target category to which the search result belongs, where the classifier is trained on search result samples of each preset category. The classifier can be used to determine which preset category a search result belongs to; that is, the result output by the classifier is the target category to which the search result belongs.
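As a hedged illustration of the classifier-based alternative, the sketch below trains a very small multinomial naive Bayes classifier with add-one smoothing in plain Python. The class name `TinyNaiveBayes`, the three training samples, and the whitespace tokenization are assumptions for the example; a practical classifier would be trained on many search result samples per preset category.

```python
import math
from collections import Counter, defaultdict
from typing import Iterable, Tuple


class TinyNaiveBayes:
    """Minimal multinomial naive Bayes with add-one smoothing (illustrative)."""

    def fit(self, samples: Iterable[Tuple[str, str]]) -> "TinyNaiveBayes":
        self.word_counts = defaultdict(Counter)  # category -> word frequencies
        self.doc_counts = Counter()              # category -> number of samples
        self.vocabulary = set()
        for text, category in samples:
            tokens = text.lower().split()
            self.word_counts[category].update(tokens)
            self.doc_counts[category] += 1
            self.vocabulary.update(tokens)
        return self

    def predict(self, text: str) -> str:
        tokens = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = "", -math.inf
        for category, docs in self.doc_counts.items():
            score = math.log(docs / total_docs)  # log prior
            total_words = sum(self.word_counts[category].values())
            for token in tokens:
                count = self.word_counts[category][token]
                score += math.log((count + 1) / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Hypothetical search result samples, one per preset category.
samples = [
    ("minister election report news", "news"),
    ("price discount cart shipping", "e-commerce"),
    ("thread reply posted forum", "forum"),
]
classifier = TinyNaiveBayes().fit(samples)
target_category = classifier.predict("election report today")
```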
It should be noted that the various translation models and classifiers of embodiments of the present invention can be trained by machine learning methods. In addition, embodiments of the present invention do not limit the specific types of the translation models or classifiers. For example, the types of translation model may include neural machine translation (NMT, Neural Machine Translation) and statistical machine translation (SMT, Statistical Machine Translation); the types of classifier may include SVM (support vector machine, Support Vector Machine), Bayes, and so on.
In step 204, the target translation model obtained in step 203 can be used to obtain the translated search result corresponding to each preset display portion of the search result. In an optional embodiment of the present invention, a corresponding translation rule can also be preset according to the characteristics of the preset display portion, and the translation model can be applied intelligently according to that translation rule to obtain more accurate translated search results.
Embodiments of the present invention can provide the following translation schemes for using the target translation model to obtain the translated search result corresponding to each preset display portion of the search result:
Translation scheme 1
In translation scheme 1, the preset display portion may include a title portion, and the step of using the target translation model to obtain the translated search result corresponding to each preset display portion of the search result may include:
identifying the preset symbols contained in the title portion;
dividing the title portion into multiple semantic units according to the preset symbols;
translating each semantic unit obtained by the division with the first target translation model corresponding to the title portion, to obtain the translation result corresponding to each semantic unit;
combining the translation results corresponding to the semantic units according to the preset symbols, to obtain the first translated search result corresponding to the title portion, where the first translated search result includes the preset symbols.
Here, the first target translation model corresponding to the title portion can be the aforementioned title translation model, or another translation model corresponding to the title portion. A semantic unit can be any of a character, a word, a phrase, a word group, a short sentence, and so on.
In practical applications, the title portion often contains special preset symbols such as "-", "|", and "...". Embodiments of the present invention can therefore preset a corresponding translation rule for the preset symbols of the title portion and apply the translation model intelligently according to that rule to obtain more accurate translated search results. Specifically, when translating with the first target translation model corresponding to the title portion, the semantic units on either side of these preset symbols are translated separately, and the translation results of the semantic units are then combined. The combined first translated search result retains the preset symbols and the relative positions of the semantic units on either side of them, which can improve the accuracy of the first translated search result corresponding to the title portion.
In an optional embodiment of the present invention, to prevent phrases or sentences from becoming fragmented because they are translated separately, the step of translating each semantic unit with the first target translation model corresponding to the title portion may include: inputting each semantic unit together with its corresponding context into the first target translation model, to obtain the translation result that the first target translation model outputs for each semantic unit. Because the corresponding context is taken into account while the semantic units are translated separately, the integrity and overall coherence of the first translated search result can be ensured.
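Translation scheme 1, including the optional context input, can be sketched as follows. The regular expression, the function names, and the stub translator are assumptions for illustration: the stub merely wraps each unit in angle brackets, and the "context" passed to it is simply the whole title, which is only one possible reading of "corresponding context".

```python
import re
from typing import Callable, List

# Preset symbols: "-" or "|" surrounded by whitespace, or "..." anywhere.
# Requiring whitespace keeps hyphenated words such as "President-elect" intact.
PRESET_SYMBOLS = re.compile(r"(\s+[-|]\s+|\.\.\.)")


def translate_title(title: str, translate_unit: Callable[[str, str], str]) -> str:
    """Split on preset symbols, translate each semantic unit, then recombine."""
    # With one capturing group, re.split returns units at even indexes
    # and the matched preset symbols at odd indexes.
    parts = PRESET_SYMBOLS.split(title)
    pieces: List[str] = []
    for index, part in enumerate(parts):
        is_symbol = index % 2 == 1
        if is_symbol or not part.strip():
            pieces.append(part)  # keep symbols and empty edges unchanged
        else:
            pieces.append(translate_unit(part, title))  # unit plus its context
    return "".join(pieces)


seen_contexts: List[str] = []


def stub_translate(unit: str, context: str) -> str:
    """Hypothetical first target translation model: tags the unit only."""
    seen_contexts.append(context)
    return f"<{unit}>"


title = "Donald Trump - Wikipedia | News..."
translated_title = translate_title(title, stub_translate)
```

The preset symbols and the relative order of the units on either side of them are preserved in the output, as the scheme requires.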
Translation scheme 2
In translation scheme 2, the preset display portion may include an abstract portion, and the step of using the target translation model to obtain the translated search result corresponding to each preset display portion of the search result may include:
extracting the target content located at a preset position from the abstract portion;
translating the target content with the second target translation model corresponding to the preset position, to obtain the corresponding second translated search result.
Embodiments of the present invention rely on the following characteristic of the abstract portion: a particular type of content appears at a particular position. For example, relatively fixed content, such as a time or an information source, appears at the beginning of the abstract. Examples of abstract portions are given below:
Example 1: 44 replies - Posted: April 15, 2014
Example 2: 28 minutes ago - MOSCOW, Jan.11 (Xinhua) -- The Kremlin on Wednesday denied that it has compromising materials on U.S.President-elect Donald Trump
Example 1 is the abstract portion of a search result in the forum category. The "44 replies" and "Posted: April 15, 2014" that appear at its beginning indicate, respectively, the number of replies to and the posting time of the forum-type search result. The reply count and posting time are characteristic of the abstract portion of search results in the forum category.
Example 2 is the abstract portion of a search result in the news category. The "28 minutes ago" and "MOSCOW, Jan.11 (Xinhua)" that appear at its beginning indicate, respectively, the difference between the publication time of the news-type search result and the current time, and the publication date and information source of the news-type search result. The time elapsed since publication, the publication date, and the information source are characteristic of the abstract portion of search results in the news category.
It can be understood that examples 1 and 2 above are merely examples of search results in the forum category and the news category; in fact, the abstract portions of search results in other categories also have the characteristic that a particular type of content appears at a particular position. Embodiments of the present invention can therefore exploit this characteristic and train a corresponding second target translation model for the preset position. During translation, the target content located at the preset position can then be extracted from the abstract portion and translated with the second target translation model corresponding to the preset position, to obtain the corresponding second translated search result. The second target translation model can be trained on the preset corpus corresponding to the preset position and is thus suited to the characteristics of that corpus, so more accurate translated search results can be obtained for the target content located at the preset position.
It should be noted that the second target translation model of embodiments of the present invention can be a translation model corresponding to both the target category and the preset position. In that case, the second target translation model can be trained on the preset corpus corresponding to the preset position within the target category.
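Translation scheme 2 can be illustrated with the news-style abstract of example 2. In the sketch below, the regular expression for the leading "N minutes ago" content, the rule-based `translate_age` standing in for a trained second target translation model, and the decision to send the remaining text to a separate abstract model are all assumptions made for the example.

```python
import re
from typing import Callable

# Hypothetical pattern for target content at the beginning of a news abstract.
LEADING_AGE = re.compile(
    r"^(?P<age>\d+\s+(?:minute|hour|day)s?\s+ago)(?P<sep>\s+-\s+)(?P<rest>.*)$",
    re.DOTALL,
)

UNIT_ZH = {"minute": "分钟", "hour": "小时", "day": "天"}


def translate_age(age: str) -> str:
    """Stand-in for a position-specific model: '28 minutes ago' -> '28分钟前'."""
    number, unit, _ago = age.split()
    return f"{number}{UNIT_ZH[unit.rstrip('s')]}前"


def translate_abstract(
    abstract: str,
    position_model: Callable[[str], str],
    abstract_model: Callable[[str], str],
) -> str:
    """Translate leading preset-position content separately from the rest."""
    match = LEADING_AGE.match(abstract)
    if match is None:
        return abstract_model(abstract)
    return (
        position_model(match.group("age"))
        + match.group("sep")
        + abstract_model(match.group("rest"))
    )


def stub_abstract_model(text: str) -> str:
    """Hypothetical abstract translation model: tags its input only."""
    return f"[abstract] {text}"


abstract = "28 minutes ago - MOSCOW, Jan.11 (Xinhua) -- The Kremlin denied it."
translated_abstract = translate_abstract(abstract, translate_age, stub_abstract_model)
```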
After step 204 uses the target translation model to obtain the translated search result corresponding to each preset display portion of the search result, step 205 can show the user the translated search result corresponding to each preset display portion of the search result. The client can display the translated search results corresponding to one or more preset display portions of the search result.
In summary, when translating the second-language search results of a cross-language search, the cross-language search method of embodiments of the present invention first determines the target translation model corresponding to each preset display portion of the search result, and then uses that target translation model to obtain the translated search result corresponding to each preset display portion. In this way, the target translation model can be a translation model suited to each preset display portion; that is, it can translate from the second language into the first language according to the characteristics of each preset display portion, which can improve the accuracy of the translated search results.
It should be noted that, for simplicity of description, the method embodiments are expressed as a series of combined actions, but those skilled in the art should understand that embodiments of the present invention are not limited by the described order of actions, because according to embodiments of the present invention certain steps can be performed in other orders or simultaneously. In addition, those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and the actions involved are not necessarily required by embodiments of the present invention.
Device embodiment
Referring to Fig. 3, a structural block diagram of an embodiment of a cross-language search apparatus of the present invention is shown, which may specifically include: a search term acquisition module 301, a search result acquisition module 302, and a search result processing module 303;
The search term acquisition module 301 is configured to obtain a search term in the first language;
The search result acquisition module 302 is configured to obtain search results in the second language according to the search term;
The search result processing module 303 is configured to process each second-language search result;
The search result processing module 303 may include: a translation model determining module 3031, a translated search result acquisition module 3032, and a translated search result display module 3033;
The translation model determining module 3031 is configured to determine the target translation model corresponding to each preset display portion of the search result;
The translated search result acquisition module 3032 is configured to use the target translation model to obtain the translated search result corresponding to each preset display portion of the search result; and
The translated search result display module 3033 is configured to show the user the translated search result corresponding to each preset display portion of the search result.
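The module structure above can be pictured as a small composition of objects. The sketch below is only an illustration of how modules 301 to 303 and sub-modules 3031 to 3033 might be wired together; the class names, the callable-based interfaces, and the stub behaviors are assumptions, not the apparatus itself.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

SearchResult = Dict[str, str]      # preset display portion -> text
Translator = Callable[[str], str]


@dataclass
class SearchResultProcessor:
    """Stands in for search result processing module 303."""
    determine_model: Callable[[str], Translator]  # role of module 3031
    show: Callable[[SearchResult], None]          # role of module 3033

    def process(self, result: SearchResult) -> SearchResult:
        # Role of module 3032: translate each portion with its target model.
        translated = {
            portion: self.determine_model(portion)(text)
            for portion, text in result.items()
        }
        self.show(translated)
        return translated


@dataclass
class CrossLanguageSearchDevice:
    get_query: Callable[[], str]                  # role of module 301
    search: Callable[[str], List[SearchResult]]   # role of module 302
    processor: SearchResultProcessor              # module 303

    def run(self) -> List[SearchResult]:
        query = self.get_query()
        return [self.processor.process(result) for result in self.search(query)]


shown: List[SearchResult] = []
device = CrossLanguageSearchDevice(
    get_query=lambda: "Trump",
    search=lambda query: [{"title": f"{query} - Wikipedia"}],
    processor=SearchResultProcessor(
        determine_model=lambda portion: (lambda text: f"[{portion}] {text}"),
        show=shown.append,
    ),
)
results = device.run()
```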
Optionally, the translation model determining module 3031 may include: a display type determining submodule and a translation model acquisition submodule;
The display type determining submodule is configured to determine the display type corresponding to each preset display portion included in the search result;
The translation model acquisition submodule is configured to obtain, according to the display type, the target translation model corresponding to each preset display portion.
Optionally, if the display type corresponding to the preset display portion is the title type, the translation model acquisition submodule may include: a first translation model acquisition unit;
The first translation model acquisition unit is configured to obtain a title translation model, where the title translation model is trained on a title corpus;
and/or
if the display type corresponding to the preset display portion is the abstract type, the translation model acquisition submodule may include: a second translation model acquisition unit;
The second translation model acquisition unit is configured to obtain an abstract translation model, where the abstract translation model is trained on an abstract corpus;
and/or
if the display type corresponding to the preset display portion is the page content type, the translation model acquisition submodule may include: a third translation model acquisition unit;
The third translation model acquisition unit is configured to obtain a page content translation model, where the page content translation model is trained on a preset page content corpus.
Optionally, if the preset display portion is a title portion, the translated search result acquisition module 3032 may include: an identification submodule, a division submodule, a first translation submodule, and a combination submodule;
The identification submodule is configured to identify the preset symbols contained in the title portion;
The division submodule is configured to divide the title portion into multiple semantic units according to the preset symbols;
The first translation submodule is configured to translate each semantic unit obtained by the division with the first target translation model corresponding to the title portion, to obtain the translation result corresponding to each semantic unit;
The combination submodule is configured to combine the translation results corresponding to the semantic units according to the preset symbols, to obtain the first translated search result corresponding to the title portion; the first translated search result may include the preset symbols.
Optionally, the first translation submodule may include: a translation unit;
The translation unit is configured to input each semantic unit together with its corresponding context into the first target translation model, to obtain the translation result that the first target translation model outputs for each semantic unit.
Optionally, if the preset display portion is an abstract portion, the translated search result acquisition module 3032 may include: an extraction submodule and a second translation submodule;
The extraction submodule is configured to extract the target content located at a preset position from the abstract portion;
The second translation submodule is configured to translate the target content with the second target translation model corresponding to the preset position, to obtain the corresponding second translated search result.
Optionally, the apparatus can also include: a category determining module;
The category determining module is configured to determine the target category to which the search result belongs;
The translation model acquisition submodule may include: a model acquisition unit;
The model acquisition unit is configured to obtain the target translation model corresponding to each preset display portion by combining the target category to which the search result belongs with the display type corresponding to each preset display portion.
Optionally, the category determining module may include: a matching submodule and a determining submodule;
The matching submodule is configured to match the content included in the search result against the dictionary of each preset category, to obtain the matching rate corresponding to each preset category;
The determining submodule is configured to take the preset category with the largest matching rate among all preset categories as the target category to which the search result belongs.
Optionally, the category determining module may include: a classification submodule;
The classification submodule is configured to input the content included in the search result into a classifier, and to take the classification result output by the classifier as the target category to which the search result belongs, where the classifier is trained on search result samples of each preset category.
Because the apparatus embodiments are basically similar to the method embodiments, they are described relatively briefly; for related details, refer to the corresponding description of the method embodiments.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments can be referred to one another.
Regarding the apparatus in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the related method and will not be elaborated here.
Fig. 4 is a block diagram of a device 900 for cross-language search, shown according to an exemplary embodiment, when the device is a terminal. For example, the device 900 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and so on.
Referring to Fig. 4, the device 900 may include one or more of the following components: a processing component 902, a memory 904, a power supply component 906, a multimedia component 908, an audio component 910, an input/output (I/O) interface 912, a sensor component 914, and a communication component 916.
The processing component 902 generally controls the overall operation of the device 900, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 902 may include one or more processors 920 to execute instructions, so as to perform all or part of the steps of the methods described above. In addition, the processing component 902 may include one or more modules that facilitate interaction between the processing component 902 and other components. For example, the processing component 902 may include a multimedia module to facilitate interaction between the multimedia component 908 and the processing component 902.
The memory 904 is configured to store various types of data to support operation of the device 900. Examples of such data include instructions for any application or method operating on the device 900, contact data, phone book data, messages, pictures, videos, and so on. The memory 904 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
The power supply component 906 provides power to the various components of the device 900. The power supply component 906 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 900.
The multimedia component 908 includes a screen that provides an output interface between the device 900 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 908 includes a front camera and/or a rear camera. When the device 900 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or may have focusing and optical zoom capability.
The audio component 910 is configured to output and/or input audio signals. For example, the audio component 910 includes a microphone (MIC) that is configured to receive external audio signals when the device 900 is in an operating mode, such as a call mode, a recording mode, or a speech recognition mode. The received audio signal may be further stored in the memory 904 or sent via the communication component 916. In some embodiments, the audio component 910 further includes a speaker for outputting audio signals.
The I/O interface 912 provides an interface between the processing component 902 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, volume buttons, a start button, and a lock button.
The sensor component 914 includes one or more sensors for providing status assessments of various aspects of the device 900. For example, the sensor component 914 can detect the open/closed state of the device 900 and the relative positioning of components, such as the display and keypad of the device 900. The sensor component 914 can also detect a change in position of the device 900 or of a component of the device 900, the presence or absence of user contact with the device 900, the orientation or acceleration/deceleration of the device 900, and a change in temperature of the device 900. The sensor component 914 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 914 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 914 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 916 is configured to facilitate wired or wireless communication between the device 900 and other devices. The device 900 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 916 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 916 further includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 900 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the method described above.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 904 including instructions, where the instructions can be executed by the processor 920 of the device 900 to perform the method described above. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
A non-transitory computer-readable storage medium is provided: when the instructions in the storage medium are executed by a processor of a terminal, the terminal is enabled to perform a cross-language search method, the method including: obtaining a search term in a first language; obtaining search results in a second language according to the search term; and, for each search result in the second language, performing the following steps: determining a target translation model corresponding to each preset display part of the search result; obtaining, using the target translation model, a translated search result corresponding to each preset display part of the search result; and displaying to the user the translated search result corresponding to each preset display part of the search result.
Optionally, determining the target translation model corresponding to each preset display part of the search result includes:
determining the display type corresponding to each preset display part included in the search result; and
obtaining, according to the display type, the target translation model corresponding to each preset display part.
Optionally, if the display type corresponding to the preset display part is the title type, obtaining the target translation model corresponding to each preset display part includes: obtaining a title translation model, the title translation model being trained on a title corpus;
and/or
if the display type corresponding to the preset display part is the abstract type, obtaining the target translation model corresponding to each preset display part includes: obtaining an abstract translation model, the abstract translation model being trained on an abstract corpus;
and/or
if the display type corresponding to the preset display part is the page content type, obtaining the target translation model corresponding to each preset display part includes: obtaining a page content translation model, the page content translation model being trained on a preset page content corpus.
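The selection of a translation model by display type can be sketched as a simple lookup. This is a minimal illustration only: the `DisplayType` enum, the `MODEL_BY_TYPE` registry, and the toy models (which merely tag their input) are hypothetical names introduced here, and a search result is assumed to be a dictionary whose keys name its preset display parts.

```python
from enum import Enum
from typing import Callable, Dict

# Assumption: a translation model is any callable mapping source text to target text.
TranslationModel = Callable[[str], str]


class DisplayType(Enum):
    TITLE = "title"
    ABSTRACT = "abstract"
    PAGE_CONTENT = "page_content"


def make_toy_model(tag: str) -> TranslationModel:
    """Stand-in for a model trained on one corpus; it only tags its input."""
    def model(text: str) -> str:
        return f"<{tag}>{text}"
    return model


# Hypothetical registry: one model per display type, each trained on its own corpus.
MODEL_BY_TYPE: Dict[DisplayType, TranslationModel] = {
    DisplayType.TITLE: make_toy_model("title-model"),
    DisplayType.ABSTRACT: make_toy_model("abstract-model"),
    DisplayType.PAGE_CONTENT: make_toy_model("page-content-model"),
}


def translate_result(result: Dict[str, str]) -> Dict[str, str]:
    """Translate every preset display part of one search result."""
    translated = {}
    for part_name, text in result.items():
        display_type = DisplayType(part_name)   # determine the display type
        model = MODEL_BY_TYPE[display_type]     # obtain the matching target model
        translated[part_name] = model(text)     # translate this part only
    return translated
```

Keeping the registry separate from the translation loop means a new display type only needs a new registry entry, not a change to the loop.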
Optionally, if the preset display part is the title part, obtaining, using the target translation model, the translated search result corresponding to each preset display part of the search result includes:
identifying the preset symbols contained in the title part;
dividing the title part into multiple semantic units according to the preset symbols;
translating each semantic unit obtained by the division using a first target translation model corresponding to the title part, to obtain a translation result for each semantic unit; and
combining the translation results of the semantic units according to the preset symbols, to obtain a first translated search result corresponding to the title part, the first translated search result including the preset symbols.
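The split, translate, and recombine steps for a title can be sketched as follows. The symbol set in `PRESET_SYMBOLS` and the `translate_unit` callable are assumptions made for illustration; the source does not list the preset symbols in this passage.

```python
import re
from typing import Callable, Sequence

# Hypothetical preset symbols that often separate semantic units in page titles.
PRESET_SYMBOLS = ("|", "-", "_")


def translate_title(title: str,
                    translate_unit: Callable[[str], str],
                    symbols: Sequence[str] = PRESET_SYMBOLS) -> str:
    """Split a title on preset symbols, translate each unit, then rejoin.

    The capturing group makes re.split keep the symbols in the output list,
    so they can be written back unchanged at their original positions.
    Note that this naive split also breaks hyphenated words.
    """
    pattern = "(" + "|".join(re.escape(s) for s in symbols) + ")"
    pieces = re.split(pattern, title)
    out = []
    for piece in pieces:
        if piece in symbols or not piece.strip():
            out.append(piece)  # preset symbol or pure whitespace: keep as is
            continue
        # Preserve the whitespace around the unit so spacing survives.
        leading = piece[:len(piece) - len(piece.lstrip())]
        trailing = piece[len(piece.rstrip()):]
        out.append(leading + translate_unit(piece.strip()) + trailing)
    return "".join(out)
```

For example, with `str.upper` standing in for the first target translation model, `translate_title("iPhone review - TechSite | news", str.upper)` translates the three units separately and leaves `-` and `|` in place.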
Optionally, translating each semantic unit obtained by the division using the first target translation model corresponding to the title part includes:
inputting each semantic unit together with its corresponding context into the first target translation model, to obtain the translation result of each semantic unit output by the first target translation model.
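A minimal sketch of passing context alongside each semantic unit is given below. It assumes, purely for illustration, that the context of a unit is the remaining units of the same title and that the model is a callable taking `(unit, context)`; `toy_model` is a hypothetical stand-in that uses the context to disambiguate one word.

```python
from typing import Callable, List

# Assumption: a context-aware model takes (unit, context) and returns a translation.
ContextModel = Callable[[str, str], str]


def translate_units_with_context(units: List[str], model: ContextModel) -> List[str]:
    """Translate each unit while showing the model the other units as context."""
    results = []
    for i, unit in enumerate(units):
        context = " ".join(units[:i] + units[i + 1:])  # every unit except this one
        results.append(model(unit, context))
    return results


def toy_model(unit: str, context: str) -> str:
    """Stand-in model: the context decides which sense of 'Apple' is meant."""
    if unit == "Apple":
        return "Apple (company)" if "iPhone" in context else "apple (fruit)"
    return unit
```

The point of the design is that a short unit such as a brand name is often ambiguous in isolation, so the model receives the rest of the title as a hint while still returning a translation for that unit only.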
Optionally, if the preset display part is the abstract part, obtaining, using the target translation model, the translated search result corresponding to each preset display part of the search result includes:
extracting target content located at a preset position from the abstract part; and
translating the target content using a second target translation model corresponding to the preset position, to obtain a corresponding second translated search result.
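The extraction step can be sketched as below. The passage does not say what the preset positions are, so this example assumes one hypothetical position, the first sentence of the abstract, and pairs it with a caller-supplied model.

```python
from typing import Callable


def extract_first_sentence(abstract: str) -> str:
    """Hypothetical preset position: everything up to the first sentence end."""
    end = abstract.find(". ")
    return abstract if end == -1 else abstract[:end + 1]


def translate_abstract_part(abstract: str,
                            extractor: Callable[[str], str],
                            model: Callable[[str], str]) -> str:
    """Extract the target content at a preset position, then translate only that."""
    target = extractor(abstract)
    return model(target)
```

With `str.upper` standing in for the second target translation model, `translate_abstract_part("Cats sleep a lot. They also purr.", extract_first_sentence, str.upper)` translates only the first sentence.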
Optionally, the terminal is further configured such that the one or more programs, executed by the one or more processors, include instructions for performing the following operation:
determining the target category to which the search result belongs;
where obtaining, according to the display type, the target translation model corresponding to each preset display part includes:
obtaining the target translation model corresponding to each preset display part by combining the target category to which the search result belongs with the display type corresponding to each preset display part.
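Combining the target category with the display type can be sketched as a lookup on a two-part key. The registry contents, the category names, and the fallback to a category-independent model are assumptions added for illustration; the source only states that the two are combined.

```python
from typing import Callable, Dict, Optional, Tuple

TranslationModel = Callable[[str], str]


def make_toy_model(tag: str) -> TranslationModel:
    """Stand-in for a trained model; it only tags its input."""
    return lambda text: f"<{tag}>{text}"


# Hypothetical registry keyed by (category, display type).
# A category of None marks the generic model for that display type.
MODELS: Dict[Tuple[Optional[str], str], TranslationModel] = {
    ("medical", "title"): make_toy_model("medical-title"),
    ("medical", "abstract"): make_toy_model("medical-abstract"),
    (None, "title"): make_toy_model("generic-title"),
    (None, "abstract"): make_toy_model("generic-abstract"),
}


def select_model(category: Optional[str], display_type: str) -> TranslationModel:
    """Prefer the category-specific model; fall back to the generic one (assumption)."""
    specific = MODELS.get((category, display_type))
    if specific is not None:
        return specific
    return MODELS[(None, display_type)]
```

Here `select_model("medical", "title")` returns the medical title model, while an unregistered category such as `"sports"` falls back to the generic title model.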
Optionally, determining the target category to which the search result belongs includes:
matching the content included in the search result against the dictionary of each preset category, to obtain a matching rate for each preset category; and
taking the preset category with the highest matching rate among all preset categories as the target category to which the search result belongs.
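The dictionary-matching approach can be sketched as follows. The matching rate is assumed here to be the fraction of whitespace-separated tokens of the content that appear in a category's dictionary; the source does not define the rate in this passage, and the dictionaries shown are toy data.

```python
from typing import Dict, List, Set, Tuple


def match_rate(tokens: List[str], dictionary: Set[str]) -> float:
    """Assumed definition: share of content tokens found in the dictionary."""
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in dictionary)
    return hits / len(tokens)


def pick_target_category(content: str,
                         dictionaries: Dict[str, Set[str]]) -> Tuple[str, Dict[str, float]]:
    """Return the preset category with the highest matching rate, plus all rates."""
    tokens = content.lower().split()
    rates = {category: match_rate(tokens, words)
             for category, words in dictionaries.items()}
    best = max(rates, key=rates.get)  # ties resolve to the first category listed
    return best, rates


# Toy dictionaries for two hypothetical preset categories.
DICTIONARIES = {
    "medical": {"aspirin", "dosage", "clinical"},
    "sports": {"football", "goal"},
}
```

For the content `"Aspirin dosage for football players"`, two of five tokens match the medical dictionary and one matches the sports dictionary, so the medical category is selected.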
Optionally, determining the target preset category to which the search result belongs includes:
inputting the content included in the search result into a classifier, and taking the classification result output by the classifier as the target category to which the search result belongs, where the classifier is trained on search result samples of each preset category.
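The source does not name a classifier type. As one possible instance, the sketch below trains a small multinomial naive Bayes classifier with add-one smoothing on labeled search result samples; the class name, the whitespace tokenization, and the sample data in the usage note are all hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Iterable, Optional, Tuple


class TinyNaiveBayes:
    """Multinomial naive Bayes over whitespace tokens (illustrative choice only)."""

    def __init__(self) -> None:
        self.word_counts = defaultdict(Counter)  # category -> token frequencies
        self.doc_counts = Counter()              # category -> number of samples
        self.vocab = set()

    def fit(self, samples: Iterable[Tuple[str, str]]) -> "TinyNaiveBayes":
        """Train on (content, preset category) pairs."""
        for text, category in samples:
            words = text.lower().split()
            self.word_counts[category].update(words)
            self.doc_counts[category] += 1
            self.vocab.update(words)
        return self

    def predict(self, text: str) -> Optional[str]:
        """Return the category with the highest log posterior, or None if untrained."""
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, -math.inf
        for category in self.doc_counts:
            counts = self.word_counts[category]
            denom = sum(counts.values()) + len(self.vocab)  # add-one smoothing
            score = math.log(self.doc_counts[category] / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_category, best_score = category, score
        return best_category
```

A classifier trained with `TinyNaiveBayes().fit([("aspirin dosage side effects", "medical"), ("football match score", "sports")])` can then be queried with `predict(content)` to obtain the target category for a new search result.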
Fig. 5 is a block diagram of a device for cross-language search, implemented as a server, according to an exemplary embodiment. The server 1900 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 1922 (for example, one or more processors), a memory 1932, and one or more storage media 1930 (for example, one or more mass storage devices) storing application programs 1942 or data 1944. The memory 1932 and the storage medium 1930 may provide transient or persistent storage. A program stored in the storage medium 1930 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the server. Further, the central processing unit 1922 may be configured to communicate with the storage medium 1930 and to execute, on the server 1900, the series of instruction operations in the storage medium 1930.
The server 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input/output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
Other embodiments of the invention will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. The present invention is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed herein. The specification and examples are to be regarded as exemplary only, with the true scope and spirit of the invention indicated by the following claims.
It should be understood that the invention is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the invention is limited only by the appended claims.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
The cross-language search method, the cross-language search apparatus, and the device for cross-language search provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the invention, and the description of the above embodiments is intended only to aid understanding of the method and its core idea. For those of ordinary skill in the art, the specific implementation and the scope of application may change in accordance with the idea of the invention. In summary, the content of this specification should not be construed as limiting the invention.
Claims (11)
1. A cross-language search method, characterized by including:
obtaining a search term in a first language;
obtaining search results in a second language according to the search term;
for each search result in the second language, performing the following steps:
determining a target translation model corresponding to each preset display part of the search result;
obtaining, using the target translation model, a translated search result corresponding to each preset display part of the search result;
displaying to the user the translated search result corresponding to each preset display part of the search result.
2. The method according to claim 1, characterized in that the step of determining the target translation model corresponding to each preset display part of the search result includes:
determining the display type corresponding to each preset display part included in the search result;
obtaining, according to the display type, the target translation model corresponding to each preset display part.
3. The method according to claim 2, characterized in that:
if the display type corresponding to the preset display part is the title type, obtaining the target translation model corresponding to each preset display part includes: obtaining a title translation model, the title translation model being trained on a title corpus;
and/or
if the display type corresponding to the preset display part is the abstract type, obtaining the target translation model corresponding to each preset display part includes: obtaining an abstract translation model, the abstract translation model being trained on an abstract corpus;
and/or
if the display type corresponding to the preset display part is the page content type, obtaining the target translation model corresponding to each preset display part includes: obtaining a page content translation model, the page content translation model being trained on a preset page content corpus.
4. The method according to any one of claims 1 to 3, characterized in that, if the preset display part is the title part, the step of obtaining, using the target translation model, the translated search result corresponding to each preset display part of the search result includes:
identifying the preset symbols contained in the title part;
dividing the title part into multiple semantic units according to the preset symbols;
translating each semantic unit obtained by the division using a first target translation model corresponding to the title part, to obtain a translation result for each semantic unit;
combining the translation results of the semantic units according to the preset symbols, to obtain a first translated search result corresponding to the title part, the first translated search result including the preset symbols.
5. The method according to claim 4, characterized in that the step of translating each semantic unit obtained by the division using the first target translation model corresponding to the title part includes:
inputting each semantic unit together with its corresponding context into the first target translation model, to obtain the translation result of each semantic unit output by the first target translation model.
6. The method according to any one of claims 1 to 3, characterized in that, if the preset display part is the abstract part, the step of obtaining, using the target translation model, the translated search result corresponding to each preset display part of the search result includes:
extracting target content located at a preset position from the abstract part;
translating the target content using a second target translation model corresponding to the preset position, to obtain a corresponding second translated search result.
7. The method according to claim 2 or 3, characterized in that the method further includes: determining the target category to which the search result belongs;
where obtaining, according to the display type, the target translation model corresponding to each preset display part includes:
obtaining the target translation model corresponding to each preset display part by combining the target category to which the search result belongs with the display type corresponding to each preset display part.
8. The method according to claim 7, characterized in that the step of determining the target category to which the search result belongs includes:
matching the content included in the search result against the dictionary of each preset category, to obtain a matching rate for each preset category;
taking the preset category with the highest matching rate among all preset categories as the target category to which the search result belongs.
9. The method according to claim 7, characterized in that the step of determining the target preset category to which the search result belongs includes:
inputting the content included in the search result into a classifier, and taking the classification result output by the classifier as the target category to which the search result belongs, where the classifier is trained on search result samples of each preset category.
10. A cross-language search apparatus, characterized by including:
a search term acquisition module, configured to obtain a search term in a first language;
a search result acquisition module, configured to obtain search results in a second language according to the search term;
a search result processing module, configured to process each search result in the second language;
where the search result processing module includes:
a translation model determination module, configured to determine a target translation model corresponding to each preset display part of the search result;
a translated search result acquisition module, configured to obtain, using the target translation model, a translated search result corresponding to each preset display part of the search result; and
a translated search result display module, configured to display to the user the translated search result corresponding to each preset display part of the search result.
11. A device for cross-language search, characterized by including a memory and one or more programs, where the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
obtaining a search term in a first language;
obtaining search results in a second language according to the search term;
for each search result in the second language, performing the following steps:
determining a target translation model corresponding to each preset display part of the search result;
obtaining, using the target translation model, a translated search result corresponding to each preset display part of the search result;
displaying to the user the translated search result corresponding to each preset display part of the search result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710025472.6A CN108304412B (en) | 2017-01-13 | 2017-01-13 | Cross-language search method and device for cross-language search |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108304412A true CN108304412A (en) | 2018-07-20 |
CN108304412B CN108304412B (en) | 2022-09-30 |
Family
ID=62872442
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710025472.6A Active CN108304412B (en) | 2017-01-13 | 2017-01-13 | Cross-language search method and device for cross-language search |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108304412B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090248422A1 (en) * | 2008-03-28 | 2009-10-01 | Microsoft Corporation | Intra-language statistical machine translation |
CN102651003A (en) * | 2011-02-28 | 2012-08-29 | 北京百度网讯科技有限公司 | Cross-language searching method and device |
CN102779135A (en) * | 2011-05-13 | 2012-11-14 | 北京百度网讯科技有限公司 | Method and device for obtaining cross-linguistic search resources and corresponding search method and device |
CN103838774A (en) * | 2012-11-26 | 2014-06-04 | 英业达科技有限公司 | Webpage inquiring system and inquiring method thereof |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108334526A (en) * | 2017-01-20 | 2018-07-27 | 北京搜狗科技发展有限公司 | The methods of exhibiting and device of search result items |
WO2019109663A1 (en) * | 2017-12-08 | 2019-06-13 | 北京搜狗科技发展有限公司 | Cross-language search method and apparatus, and apparatus for cross-language search |
CN110930208A (en) * | 2018-09-19 | 2020-03-27 | 阿里巴巴集团控股有限公司 | Object searching method and device |
CN110930208B (en) * | 2018-09-19 | 2023-05-05 | 阿里巴巴集团控股有限公司 | Object searching method and device |
CN111368117A (en) * | 2018-12-26 | 2020-07-03 | 财团法人工业技术研究院 | Cross-language information constructing and processing method and cross-language information system |
CN111368117B (en) * | 2018-12-26 | 2023-05-30 | 财团法人工业技术研究院 | Cross-language information construction and processing method and cross-language information system |
CN111737550A (en) * | 2019-03-25 | 2020-10-02 | 阿里巴巴集团控股有限公司 | Search result processing method and device, storage medium and processor |
CN111737550B (en) * | 2019-03-25 | 2024-01-23 | 阿里巴巴集团控股有限公司 | Search result processing method and device, storage medium and processor |
CN112287217A (en) * | 2020-10-23 | 2021-01-29 | 平安科技(深圳)有限公司 | Medical literature retrieval method, device, electronic equipment and storage medium |
CN112287217B (en) * | 2020-10-23 | 2023-08-04 | 平安科技(深圳)有限公司 | Medical document retrieval method, medical document retrieval device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN108304412B (en) | 2022-09-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||