CN103699524A - Word segmentation method and mobile terminal - Google Patents


Info

Publication number
CN103699524A
Authority
CN
China
Prior art keywords
participle
mobile terminal
dictionary
word segmentation
request
Prior art date
Legal status
Pending
Application number
CN201310699645.4A
Other languages
Chinese (zh)
Inventor
龚龙
任大韫
霍岩
姜涛
高庆
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201310699645.4A
Publication of CN103699524A

Abstract

The invention discloses a word segmentation method and a mobile terminal. In the method, a word segmentation server side local to a mobile terminal receives a word segmentation request, carrying a target file, sent by a word segmentation client local to the same mobile terminal. The server side then performs word segmentation on the target file, based on a dictionary corresponding to the mobile terminal and a word segmentation algorithm corresponding to that dictionary, to obtain a word segmentation result. Word segmentation can therefore be used on the mobile terminal, which widens the field of application of word segmentation technology.

Description

Word segmentation method and mobile terminal
Technical field
The present invention relates to computer technology, and in particular to a word segmentation method and a mobile terminal.
Background
With the development of computer technology, word segmentation has been widely used in fields such as search engines, machine translation, speech synthesis and automatic summarization. Word segmentation (Chinese Word Segmentation) refers to cutting a sentence or a passage of Chinese text into individual Chinese words. Meanwhile, with the rapid spread of mobile terminals, represented by smartphones and tablet computers, the demand for word segmentation on mobile terminals keeps growing, for example for word-selection search and voice interaction on a mobile terminal.
Open-source word segmentation libraries exist in the prior art. For example, IKAnalyzer is an open-source, lightweight word segmentation toolkit developed in Java. The dictionary files of such a library are large and it occupies a lot of memory at run time, so it is mainly used on PCs or servers.
However, because the dictionary files of these open-source libraries are too large and their run-time memory footprint is too high, they cannot be applied directly to mobile terminals. In addition, the operating systems of existing mobile terminals do not provide a development interface for word segmentation.
Summary of the invention
In view of this, the embodiments of the present invention propose a word segmentation method and a mobile terminal, so that word segmentation can be used on a mobile terminal.
In a first aspect, an embodiment of the present invention provides a word segmentation method, the method comprising:
a word segmentation server side local to a mobile terminal receives a word segmentation request, carrying a target file, sent by a word segmentation client local to the mobile terminal;
the word segmentation server side performs word segmentation on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, and obtains a word segmentation result.
In a second aspect, an embodiment of the present invention provides a mobile terminal, the mobile terminal comprising a word segmentation client and a word segmentation server side:
The word segmentation client comprises:
a collecting unit, configured to collect word segmentation requests and pass them to the sending unit;
a sending unit, configured to send the word segmentation requests collected by the collecting unit to the word segmentation server side;
The word segmentation server side comprises:
a receiving unit, configured to receive the word segmentation request, carrying a target file, sent by the word segmentation client, and to pass the request to the word segmentation unit;
a word segmentation unit, configured to receive the word segmentation request from the receiving unit, perform word segmentation on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, obtain a word segmentation result, and pass the result to the sending unit;
a sending unit, configured to receive the word segmentation result from the word segmentation unit and send it to the word segmentation client, so that the word segmentation client identifies the target file according to the word segmentation result.
In the embodiments of the present invention, the word segmentation server side local to the mobile terminal receives the word segmentation request, carrying a target file, sent by the word segmentation client local to the mobile terminal, performs word segmentation on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, and obtains a word segmentation result. Word segmentation is thereby performed on the mobile terminal, which extends the application of word segmentation technology.
Brief description of the drawings
Fig. 1 is a flowchart of the word segmentation method of the first embodiment of the present invention;
Fig. 2 is a flowchart of the word segmentation method of the second embodiment of the present invention;
Fig. 3 is a schematic diagram of a word segmentation request applicable to the second embodiment of the present invention;
Fig. 4 is a schematic diagram of a word segmentation request applicable to the second embodiment of the present invention;
Fig. 5 is a schematic diagram of a word segmentation request applicable to the second embodiment of the present invention;
Fig. 6 is a flowchart of the word segmentation method of the third embodiment of the present invention;
Fig. 7 is a schematic diagram of the mobile terminal of the fourth embodiment of the present invention;
Fig. 8 is a schematic diagram of the word segmentation client of the fourth embodiment of the present invention;
Fig. 9 is a schematic diagram of the word segmentation server side of the fourth embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, specific embodiments of the invention are described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here only explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the full content.
Fig. 1 is a flowchart of the word segmentation method of the first embodiment of the present invention. The method is applied to a mobile terminal, such as a smartphone or a tablet computer. A word segmentation server side and at least one word segmentation client can be configured in the mobile terminal. The server side is dedicated to word segmentation processing; that is, the word segmentation function is encapsulated in an independent service process. The main reason is to ensure that only one copy of the dictionary file is held in memory, which minimizes run-time memory usage. The word segmentation client mainly issues word segmentation requests and obtains word segmentation results, and can be any application client that needs word segmentation. The scheme of this embodiment is executed by the word segmentation server side. As shown in Fig. 1, the method comprises:
Step 110: The word segmentation server side local to the mobile terminal receives a word segmentation request, carrying a target file, sent by a word segmentation client local to the mobile terminal.
Specifically, the word segmentation server side on the mobile terminal can receive a word segmentation request, carrying a target file, sent by a word segmentation client on the same mobile terminal. The request comprises at least one of the following: a request made by direct call, a request made through a C or C++ interface, and a request made through a Java interface. C, C++ and Java are all programming languages.
Step 120: The word segmentation server side performs word segmentation on the received target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, and obtains a word segmentation result.
Specifically, after the word segmentation server side receives the word segmentation request, carrying a target file, sent by the word segmentation client, it can use the dictionary it has loaded and the algorithm corresponding to that dictionary to cut the target file, for example a passage of Chinese text, into individual Chinese words, thereby obtaining the word segmentation result.
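The text does not name the word segmentation algorithm that is paired with the dictionary. As an illustration only, the sketch below assumes forward maximum matching, a common dictionary-based technique: at each position it takes the longest dictionary word and falls back to a single character. The function name `segment`, the parameter `max_word_len` and the sample dictionary are hypothetical.

```python
def segment(text, dictionary, max_word_len=4):
    """Forward maximum matching (an assumed algorithm, not named in the source).

    At each position, take the longest substring found in `dictionary`;
    if nothing matches, emit a single character and move on.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for length in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in dictionary:
                words.append(candidate)
                i += length
                break
    return words


# Hypothetical sample dictionary and input text.
sample_dictionary = {"移动", "终端", "移动终端", "分词", "技术"}
sample_result = segment("移动终端分词技术", sample_dictionary)
```

With this sample dictionary the longest match "移动终端" is preferred over the shorter entries "移动" and "终端", which is the defining behaviour of maximum matching.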
In a preferred implementation of this embodiment, before the word segmentation server side receives the word segmentation request, carrying a target file, sent by the word segmentation client in step 110, the method further comprises: while the mobile terminal is running, the word segmentation server side starts a word segmentation service process to listen for and receive word segmentation requests, carrying target files, sent by word segmentation clients local to the mobile terminal.
Specifically, after the mobile terminal starts and before the word segmentation server side receives a word segmentation request from a word segmentation client, the server side starts the word segmentation service process but does not yet load the dictionary corresponding to the mobile terminal, which reduces the memory occupied on the mobile terminal. The word segmentation service process is an independent process started when the mobile terminal boots. While the service process is running, it can receive word segmentation requests sent by word segmentation clients and perform word segmentation. Because the service process itself occupies very little memory while the dictionary corresponding to the mobile terminal occupies a great deal, the dictionary is not loaded immediately after the service process is started; it is loaded only after a word segmentation request is received from a word segmentation client.
In another preferred implementation of this embodiment, the receiving in step 110, by the word segmentation server side in the mobile terminal, of the word segmentation request carrying a target file sent by the word segmentation client local to the mobile terminal specifically comprises:
the word segmentation server side in the mobile terminal receives a word segmentation request in which the local word segmentation client calls the server side directly; or the server side receives a word segmentation request in which the local word segmentation client calls the server side through a C or C++ interface; or the server side receives a word segmentation request in which the local word segmentation client calls the server side through a Java interface.
In another preferred implementation of this embodiment, before the word segmentation server side performs word segmentation on the target file in step 120 according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, the method further comprises: the word segmentation server side determines whether the dictionary corresponding to the mobile terminal has been loaded and, if not, loads that dictionary into local memory.
Specifically, the word segmentation server side checks whether the dictionary corresponding to the mobile terminal is loaded. If the dictionary is not loaded, the server side loads it and then performs word segmentation according to the dictionary and its corresponding word segmentation algorithm to obtain the word segmentation result. If the dictionary is already loaded, the server side performs word segmentation directly according to the dictionary and its corresponding algorithm to obtain the word segmentation result.
This check is needed because the dictionary corresponding to the mobile terminal is not kept loaded permanently: when no word segmentation request is received from the word segmentation client within a preset time, the dictionary is unloaded. In other words, to minimize run-time memory usage, the mobile terminal periodically unloads the loaded dictionary and frees the memory it occupies.
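The load-on-demand and unload-when-idle behaviour described above can be sketched as follows. This is a minimal single-process illustration, not the patent's implementation: the class name `SegmentationService`, the `idle_timeout_s` parameter, the injected `clock` and the periodic `unload_if_idle` call are all assumptions made for the example.

```python
import time


class SegmentationService:
    """Hypothetical holder for a dictionary that is loaded lazily and unloaded when idle."""

    def __init__(self, load_dictionary, idle_timeout_s=60.0, clock=time.monotonic):
        self._load_dictionary = load_dictionary  # callable returning the dictionary
        self._idle_timeout_s = idle_timeout_s    # the "preset time" (assumed to be in seconds)
        self._clock = clock                      # injectable so the idle logic can be tested
        self._dictionary = None                  # nothing is loaded when the service starts
        self._last_request = None

    def is_loaded(self):
        return self._dictionary is not None

    def handle_request(self, text, segment):
        # Load the dictionary only when a request actually arrives.
        if self._dictionary is None:
            self._dictionary = self._load_dictionary()
        self._last_request = self._clock()
        return segment(text, self._dictionary)

    def unload_if_idle(self):
        # Meant to be called periodically; frees the dictionary after the idle period.
        if self._dictionary is not None and \
                self._clock() - self._last_request >= self._idle_timeout_s:
            self._dictionary = None
            return True
        return False


# Usage with a fake clock and a trivial stand-in segmenter (both hypothetical).
fake_now = [0.0]
service = SegmentationService(lambda: {"分词"}, idle_timeout_s=30.0,
                              clock=lambda: fake_now[0])
loaded_at_start = service.is_loaded()
words = service.handle_request("分词", lambda t, d: [t] if t in d else list(t))
loaded_after_request = service.is_loaded()
fake_now[0] = 31.0
unloaded = service.unload_if_idle()
```

Keeping the load check inside `handle_request` means callers never need to know whether the dictionary is currently resident, which matches the "check, then load if necessary" step in the text.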
In another preferred implementation of this embodiment, the word segmentation method further comprises: a server selects the dictionary corresponding to the mobile terminal according to the hardware configuration of the mobile terminal; the server pushes the dictionary to the corresponding mobile terminal for storage. This implementation additionally uses the external resources of a network-side server to provide the dictionary to the mobile terminal. There are many possible ways to choose a dictionary; only one of them is described in detail in the embodiments of the present invention, and other preferred methods are not detailed here.
The following illustrates how the server chooses a dictionary according to the hardware configuration of the mobile terminal.
(1) The server divides the total dictionary to obtain several dictionaries of different types.
The word segmentation dictionaries used in the prior art generally occupy a great deal of memory, for example 200 MB, so such a large dictionary file cannot be ported to a mobile terminal such as a mobile phone. The existing dictionary therefore needs to be optimized. The optimization can proceed as follows:
The Chinese dictionary is divided into two broad classes: a general dictionary and proper-noun dictionaries. The general dictionary contains the Chinese words in everyday use; it is relatively small and can be compressed to 1-2 MB. Proper nouns can be classified further, for example into science and technology, history, people and art, and art can in turn be subdivided into film, song, literature and so on. For the specific application scenario of Chinese word segmentation, the high-priority dictionaries of different types are selected to form the proper-noun dictionary; these are defined here as sub-dictionaries.
In addition, because mobile terminals differ in hardware configuration, they place different demands on the dictionary files. Some low-end mobile terminals have a small read-only memory (ROM) and impose strict limits on the dictionary files, so several dictionaries that meet user needs and whose total size does not exceed a certain limit have to be chosen from the proper-noun dictionaries. The selection method provided in the embodiments of the present invention is as follows. Suppose the total size of the dictionaries required by the mobile terminal must not exceed M, and there are k dictionaries w(1), w(2), ..., w(k), whose file sizes are m(1), m(2), ..., m(k) respectively.
(2) Calculate the initial value of each dictionary. The more often a dictionary is hit during word segmentation, the more valuable it is. Suppose dictionary w(i) is hit h times in conventional web search word segmentation, out of a total of H web search word segmentations. The initial value init_value(i) of dictionary w(i) is then given by formula (1).
init_value(i) = (h / H) × 100        formula (1)
Here, init_value(i) is the average number of times dictionary w(i) is hit per 100 search word segmentations.
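As a worked instance of formula (1), the short sketch below computes the initial value for one dictionary. The counts used (2,500 hits in 10,000 web search segmentations) are invented for illustration and do not come from the source.

```python
def init_value(h, H):
    """Formula (1): average number of hits per 100 web search segmentations."""
    if H <= 0:
        raise ValueError("H must be positive")
    return (h / H) * 100


# Hypothetical counts: the dictionary was hit 2,500 times in 10,000 segmentations.
example_init_value = init_value(2500, 10000)  # 25 hits per 100 segmentations
```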
(3) Calculate the value of each dictionary from its initial value and the word segmentation records. The word segmentation records are collected from all word segmentation clients on the mobile terminal and are used to calculate the value of the dictionary. Suppose all word segmentation clients have performed N word segmentations on the mobile terminal, of which dictionary w(i) was hit n(i) times. The value v(i) of dictionary w(i) is then given by formula (2).
v(i) = 100 × [n(i) + β × init_value(i)] / (N + β × 100)        formula (2)
Here, β is a tuning parameter; the larger it is, the greater the influence of the initial value. The more word segmentations the clients perform on the mobile terminal, the closer v(i) comes to the actual statistics of how often dictionary w(i) is hit in real use on that mobile terminal.
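The behaviour of formula (2) can be checked numerically. In the sketch below, the function name `dictionary_value` and all input numbers are illustrative assumptions. With no on-device records (N = 0) the value equals the initial value, and as N grows it moves toward the on-device hit rate 100 × n(i) / N.

```python
def dictionary_value(n_i, N, init_value_i, beta):
    """Formula (2): v(i) = 100 * [n(i) + beta * init_value(i)] / (N + beta * 100)."""
    return 100 * (n_i + beta * init_value_i) / (N + beta * 100)


# Hypothetical inputs: initial value 25, beta = 1.
v_no_records = dictionary_value(0, 0, 25.0, 1.0)          # equals the initial value
v_some_records = dictionary_value(500, 1000, 25.0, 1.0)   # on-device hit rate is 50 per 100
v_many_records = dictionary_value(500000, 1000000, 25.0, 1.0)
```

Here β acts like a count of prior observations: β × 100 segmentations' worth of web search statistics are mixed with the N segmentations observed on the device.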
(4) Choose the dictionaries corresponding to the mobile terminal according to the value of each dictionary, the file size of each dictionary and the memory size of the mobile terminal. The memory occupied by the selected dictionaries must be less than the total memory of the mobile terminal, and dictionaries with a high value and a small file size are preferred. In addition, a greedy algorithm can be used to choose the dictionaries: each time, choose the dictionary with the largest v(i)/m(i); if the v(i)/m(i) values are close, prefer the dictionary with the smaller m(i); continue until the total size M of the dictionaries required by the mobile terminal is reached.
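The greedy selection in step (4) can be sketched as below. This is one possible reading under stated assumptions: sizes are positive, ties are broken only when v(i)/m(i) is exactly equal (the text says "close" without giving a threshold), and a dictionary that does not fit is skipped so that smaller ones can still be considered. The function name, the tuple layout and the sample data are hypothetical.

```python
def choose_dictionaries(candidates, budget):
    """Greedy choice by value density v(i)/m(i) under a total size budget M.

    candidates: list of (name, value, size) tuples with size > 0 (assumed layout).
    budget: maximum total size M, in the same unit as the sizes.
    Returns (chosen_names, total_size_used).
    """
    # Highest v/m first; for equal density the smaller file comes first.
    ranked = sorted(candidates, key=lambda c: (-c[1] / c[2], c[2]))
    chosen, used = [], 0.0
    for name, value, size in ranked:
        if used + size <= budget:  # skip anything that would exceed M
            chosen.append(name)
            used += size
    return chosen, used


# Hypothetical sub-dictionaries: (name, value v, file size m in MB), budget M = 5 MB.
sample_candidates = [
    ("film", 30.0, 3.0),
    ("song", 20.0, 1.0),
    ("history", 12.0, 2.0),
    ("science", 10.0, 4.0),
]
sample_choice = choose_dictionaries(sample_candidates, 5.0)
```

In this example "song" (density 20) and "film" (density 10) are taken first, using 4 MB; "history" and "science" no longer fit in the remaining 1 MB and are left out. A greedy choice of this kind is simple and fast but is not guaranteed to maximize total value.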
In another preferred implementation of this embodiment, after the dictionary corresponding to the mobile terminal is loaded, the method further comprises: when the word segmentation server side determines that no word segmentation request has been received from one or more word segmentation clients within a preset time, it unloads the dictionary. In other words, to minimize run-time memory usage, the word segmentation server side periodically unloads the loaded dictionary and frees the memory it occupies.
The experimental results of the word segmentation method provided by the embodiments of the present invention on a mobile terminal running the Android platform are as follows: the total size of the executable, dynamic library and dictionary files is 7 MB; the physical memory occupied at run time is 8 MB; the response time to a Chinese word segmentation request is far below 1 ms; the accuracy of basic word segmentation is greater than 80%, and the accuracy of proper-noun word segmentation is greater than 87%.
Therefore, in the word segmentation method provided by the embodiments of the present invention, the word segmentation server side local to the mobile terminal receives the word segmentation request, carrying a target file, sent by the word segmentation client, performs word segmentation on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, and obtains a word segmentation result. Word segmentation is thereby performed on the mobile terminal, which extends the application of word segmentation technology.
Fig. 2 is a flowchart of the word segmentation method of the second embodiment of the present invention. The method is applied to a mobile terminal, such as a smartphone or a tablet computer. As shown in Fig. 2, the method comprises:
Step 210: The word segmentation server side local to the mobile terminal receives a word segmentation request, carrying a target file, sent by a word segmentation client local to the mobile terminal.
Specifically, the word segmentation client local to the mobile terminal can send the word segmentation request, carrying a target file, to the word segmentation server side local to the mobile terminal through different interfaces.
Step 220: The word segmentation server side determines whether the dictionary corresponding to the mobile terminal has been loaded. If so, go to step 240; if not, go to step 230.
Step 230: The word segmentation server side loads the dictionary corresponding to the mobile terminal into local memory.
Step 240: The word segmentation server side performs word segmentation on the received target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, and obtains a word segmentation result.
Specifically, the word segmentation server side uses the dictionary it has loaded and the algorithm corresponding to that dictionary to cut the target file, for example a passage of Chinese text, into individual Chinese words, thereby obtaining the word segmentation result.
Step 250: The word segmentation server side sends the computed word segmentation result to the word segmentation client, so that the client identifies, according to the word segmentation result, the target file it needs to identify.
Step 260: When the word segmentation server side determines that no word segmentation request has been received from the word segmentation client within a preset time, it unloads the dictionary.
Specifically, to minimize run-time memory usage, the word segmentation server side periodically unloads the loaded dictionary and frees the memory it occupies.
In a preferred implementation of this embodiment, the word segmentation client local to the mobile terminal can send word segmentation requests, carrying target files, to the local word segmentation server side through different interfaces, so the server side can receive one or more word segmentation requests sent by word segmentation clients through different interfaces. For example, a request can be sent to the word segmentation server side in at least one of the following ways: by direct call, by a call through a C or C++ interface, or by a call through a Java interface.
The following describes in detail how the word segmentation client uses the different interfaces to send a word segmentation request to the server side so that the server side performs the word segmentation service. The word segmentation client contains a word segmentation client library. This library is a dynamic library that the word segmentation client provides to users, and it encapsulates the function of calling the server side from the client. The library packs and encapsulates the word segmentation request data and sends a message across processes to the word segmentation service process on the server side. The word segmentation client needs to take thread safety into account.
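The client library's two duties named above, packing the request data and sending it to the service process in a thread-safe way, can be illustrated with the sketch below. Everything concrete in it is an assumption: the source does not describe a message format or an IPC mechanism, so the example uses a length-prefixed JSON frame and a plain callable `transport` that stands in for the cross-process channel.

```python
import json
import struct
import threading


def pack_request(request_id, text):
    """Pack a request as a 4-byte big-endian length followed by UTF-8 JSON (assumed format)."""
    payload = json.dumps({"id": request_id, "text": text},
                         ensure_ascii=False).encode("utf-8")
    return struct.pack(">I", len(payload)) + payload


def unpack_request(frame):
    """Inverse of pack_request, as the service side might apply it."""
    (length,) = struct.unpack(">I", frame[:4])
    return json.loads(frame[4:4 + length].decode("utf-8"))


class SegmentationClientLibrary:
    """Hypothetical client library; `transport` stands in for the cross-process channel."""

    def __init__(self, transport):
        self._transport = transport    # callable: request bytes -> reply bytes
        self._lock = threading.Lock()  # serializes callers that share one channel
        self._next_id = 0

    def segment(self, text):
        with self._lock:
            self._next_id += 1
            reply = self._transport(pack_request(self._next_id, text))
        return json.loads(reply.decode("utf-8"))["words"]


def fake_service(frame):
    # Stand-in for the service process: splits the text into single characters.
    request = unpack_request(frame)
    reply = {"id": request["id"], "words": list(request["text"])}
    return json.dumps(reply, ensure_ascii=False).encode("utf-8")


library = SegmentationClientLibrary(fake_service)
reply_words = library.segment("分词")
```

Holding the lock across the send and the reply keeps requests from several application threads from interleaving on a single channel, at the cost of handling one request at a time.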
(1) Sending the word segmentation request to the word segmentation server side by direct call. This is the most direct way: the application of the word segmentation client calls the word segmentation client library directly, which starts the word segmentation service of the server side. As shown in Fig. 3, the application 31 in the word segmentation client calls the word segmentation client library 32 directly; the library 32 packs and encapsulates the request data produced by the application 31 and sends a message across processes to the server side. When the word segmentation service process on the server side receives the request sent by the client, it determines whether the dictionary corresponding to the mobile terminal is loaded. If the dictionary is not loaded, it loads the dictionary; if the dictionary is already loaded, it does not load it again. It then performs word segmentation on the received target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, obtains the word segmentation result and returns it to the word segmentation client.
(2) Sending the word segmentation request to the word segmentation server side through a C or C++ interface. This way is used in web view (Webview) scenarios: the application of the word segmentation client calls the word segmentation client library through a C or C++ interface, which starts the word segmentation service of the mobile terminal. A Webview is a view used to display web pages; the Chinese text on the page is segmented so that the user can conveniently perform word-selection search. As shown in Fig. 4, the application 41 in the word segmentation client calls the word segmentation client library 43 through the C or C++ interface 42; the library 43 packs and encapsulates the request data produced by the application 41 and sends a message across processes to the server side. When the word segmentation service process on the server side receives the request sent by the client, it determines whether the dictionary corresponding to the mobile terminal is loaded. If the dictionary is not loaded, it loads the dictionary; if the dictionary is already loaded, it does not load it again. It then performs word segmentation on the received target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, obtains the word segmentation result and returns it to the word segmentation client.
(3) Sending the word segmentation request to the word segmentation server side through a Java interface. This way is used in text view (TextView) scenarios. Here the mobile terminal provides not only the word segmentation client library but also a word segmentation JNI library on top of it. The JNI library lets the Java layer call the C++ layer. In addition, to make calls from the Java layer easier, a Java helper class is provided so that users of the Java layer can use Java function interfaces directly. The call sequence is as follows: first the Java-layer interface of the Java helper class is called; the helper class then calls the C interface provided by the word segmentation JNI library; finally the C interface calls the word segmentation client library, which starts the word segmentation service of the mobile terminal. A TextView is a view used to display text and also has a need for word-selection search. As shown in Fig. 5, the application 51 in the word segmentation client calls the word segmentation client library through the Java helper class 52 and the JNI library 53; the client library packs and encapsulates the request data produced by the application 51 and sends a message across processes to the server side. When the word segmentation service process on the server side receives the request sent by the client, it determines whether the dictionary corresponding to the mobile terminal is loaded. If the dictionary is not loaded, it loads the dictionary; if the dictionary is already loaded, it does not load it again. It then performs word segmentation on the received target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, obtains the word segmentation result and returns it to the word segmentation client.
Of course, those skilled in the art will understand that the word segmentation client can also use other interfaces to send word segmentation requests to the word segmentation server side. Only the three ways above are described in detail here; other ways are not described one by one.
Therefore, in the word segmentation method provided by the embodiments of the present invention, the word segmentation server side receives the word segmentation request, carrying a target file, sent by the word segmentation client, performs word segmentation on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, obtains a word segmentation result, and sends the result to the word segmentation client. When no word segmentation request is received from the word segmentation client within a preset time, the dictionary is unloaded. Word segmentation is thereby performed on the mobile terminal, which extends the application of word segmentation technology and also improves the memory utilization of the mobile terminal.
Fig. 6 is a flowchart of the word segmentation method of the third embodiment of the present invention. The method is applied to a mobile terminal, such as a smartphone or a tablet computer. As shown in Fig. 6, the method comprises:
Step 601: The mobile terminal starts up.
Step 602: The word segmentation server end local to the mobile terminal starts the word segmentation service process. The dictionary corresponding to the mobile terminal is not loaded for the time being, which reduces the memory occupied on the mobile terminal.
Step 603: The word segmentation client local to the mobile terminal waits for a word segmentation call.
Step 604: An application program of the word segmentation client invokes word segmentation, produces a word segmentation request carrying a target file, and calls the word segmentation client library of the word segmentation client through one of several interfaces. The word segmentation client may send the request to the word segmentation server end by a direct call, through a C or C++ interface call, or through a Java interface call.
Step 605: The word segmentation client library packs and encapsulates the data carried by the word segmentation request, namely the target file, and sends the encapsulated request across processes to the word segmentation server end local to the mobile terminal.
Step 606: After receiving the word segmentation request, the word segmentation server end determines whether it has loaded the dictionary corresponding to the mobile terminal. If the dictionary has not been loaded, step 607 is performed; if it has been loaded, step 608 is performed. This check is needed because the dictionary corresponding to the mobile terminal does not remain loaded permanently: when no word segmentation request is received from the word segmentation client within a preset time, the dictionary is unloaded. In other words, to minimize memory usage at runtime, the mobile terminal periodically unloads the loaded dictionary and frees the memory it occupies.
Step 607: The word segmentation server end loads the dictionary corresponding to the mobile terminal.
Step 608: The word segmentation server end cancels the existing timer and starts a new one. The new timer is what allows the loaded dictionary to be unloaded on schedule.
Step 609: The word segmentation server end performs word segmentation calculation on the target file according to the dictionary it has loaded and the algorithm corresponding to that dictionary, and obtains a word segmentation result. The loaded dictionary is one selected according to the hardware configuration of the mobile terminal.
Step 610: The word segmentation server end returns the word segmentation result to the word segmentation client local to the mobile terminal.
Step 611: The word segmentation server end periodically checks whether the timer has reached the preset time value. If it has, step 612 is performed; if it has not, step 611 is repeated. The purpose is to unload the loaded dictionary on schedule and thereby reduce the memory occupied.
Step 612: The word segmentation server end unloads the dictionary it has loaded.
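Steps 606 to 612 amount to lazy loading with an idle timeout, which can be sketched in Java as follows. This is a minimal sketch under stated assumptions rather than the implementation of the embodiment: the class name `LazySegmentationService`, the `Supplier`-based loader and the explicit `nowMillis` arguments are hypothetical, and forward maximum matching merely stands in for the word segmentation algorithm, which the text does not specify. Time is passed in explicitly so that the periodic check of step 611 can be exercised without a real timer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

/** Sketch of steps 606-612: lazy dictionary loading with idle-timeout unloading. */
class LazySegmentationService {
    private final Supplier<Set<String>> dictionaryLoader; // hypothetical loader
    private final long idleTimeoutMillis;                 // the preset time value
    private Set<String> dictionary;                       // null means "not loaded"
    private long timerStartMillis;

    LazySegmentationService(Supplier<Set<String>> dictionaryLoader, long idleTimeoutMillis) {
        this.dictionaryLoader = dictionaryLoader;
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Steps 606-610: load on demand, restart the timer, segment, return the result. */
    synchronized List<String> handleRequest(String text, long nowMillis) {
        if (dictionary == null) {                // step 606: is the dictionary loaded?
            dictionary = dictionaryLoader.get(); // step 607: load it
        }
        timerStartMillis = nowMillis;            // step 608: replace the old timer
        return segment(text);                    // steps 609-610
    }

    /** Steps 611-612: called periodically; unloads once the timeout has elapsed. */
    synchronized void checkIdle(long nowMillis) {
        if (dictionary != null && nowMillis - timerStartMillis >= idleTimeoutMillis) {
            dictionary = null;                   // free the memory held by the dictionary
        }
    }

    synchronized boolean isDictionaryLoaded() {
        return dictionary != null;
    }

    /** Forward maximum matching: an assumed algorithm, not one the source names. */
    private List<String> segment(String text) {
        int maxLen = 1;
        for (String word : dictionary) {
            maxLen = Math.max(maxLen, word.length());
        }
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = Math.min(text.length(), i + maxLen);
            // Shrink the window until it matches a dictionary word or is one character long.
            while (end > i + 1 && !dictionary.contains(text.substring(i, end))) {
                end--;
            }
            out.add(text.substring(i, end));
            i = end;
        }
        return out;
    }
}
```

In a running service, `checkIdle` would be driven by a periodic task, and every incoming request would pass through `handleRequest`, so the dictionary occupies memory only while requests keep arriving within the timeout.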
Thus, the word segmentation method provided by this embodiment of the present invention performs word segmentation on the mobile terminal and expands the range of applications of word segmentation technology.
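Step 609 above notes that the loaded dictionary is one selected according to the hardware configuration of the mobile terminal, and the document names the value of each dictionary, its file size and the memory size of the terminal as the inputs to that selection. The exact selection rule is not given in this section, so the following Java sketch assumes a simple one: pick the highest-value dictionary whose file size fits a memory budget. The names `DictionaryCandidate` and `DictionarySelector`, and the notion of a byte budget derived from the terminal's memory, are hypothetical.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical description of one candidate dictionary. */
class DictionaryCandidate {
    final String name;
    final double value;       // score of the dictionary (how it is computed is out of scope here)
    final long fileSizeBytes; // file size, used here as a proxy for memory cost

    DictionaryCandidate(String name, double value, long fileSizeBytes) {
        this.name = name;
        this.value = value;
        this.fileSizeBytes = fileSizeBytes;
    }
}

class DictionarySelector {
    /**
     * Assumed rule: return the highest-value candidate whose file size does not
     * exceed the memory budget, or an empty Optional if none fits.
     */
    static Optional<DictionaryCandidate> choose(List<DictionaryCandidate> candidates,
                                                long memoryBudgetBytes) {
        DictionaryCandidate best = null;
        for (DictionaryCandidate candidate : candidates) {
            if (candidate.fileSizeBytes > memoryBudgetBytes) {
                continue; // too large for this terminal
            }
            if (best == null || candidate.value > best.value) {
                best = candidate;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

With such a rule, a terminal with little memory would receive a smaller, lower-value dictionary, while a terminal with more memory would receive a larger one.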
Fig. 7 is a schematic diagram of the mobile terminal of the fourth embodiment of the present invention. This mobile terminal is used to carry out the word segmentation methods of the first and third embodiments of the present invention. As shown in Fig. 7, the mobile terminal 70 comprises a word segmentation client 71 and a word segmentation server end 72.
The word segmentation client 71 comprises a collecting unit 81 and a sending unit 82, as shown in Fig. 8.
The collecting unit 81 is configured to collect word segmentation requests and pass them to the sending unit 82.
The sending unit 82 is configured to send the word segmentation requests collected by the collecting unit to the word segmentation server end 72.
The word segmentation server end 72 comprises a receiving unit 91, a word segmentation unit 92 and a sending unit 93, as shown in Fig. 9.
The receiving unit 91 is configured to receive the word segmentation request carrying the target file sent by the word segmentation client 71, and to pass the request to the word segmentation unit 92.
Specifically, the receiving unit 91 can receive word segmentation requests carrying target files sent by a plurality of word segmentation clients local to the mobile terminal. In addition, before the receiving unit 91 receives a word segmentation request carrying a target file from the word segmentation client 71, the dictionary corresponding to the mobile terminal 70 is not loaded, which reduces the memory occupied on the mobile terminal. The word segmentation service process is an independent process started when the mobile terminal boots; while the process is running, it can receive word segmentation requests sent by the word segmentation client and perform word segmentation calculation. Because the service process itself occupies very little memory, whereas the dictionary corresponding to the mobile terminal occupies a large amount, the dictionary is not loaded immediately after the service process is started. It is loaded only after a word segmentation request is received from the word segmentation client.
The word segmentation unit 92 is configured to receive the word segmentation request from the receiving unit 91, to perform word segmentation calculation on the target file in response to the request, using the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to that dictionary, to obtain a word segmentation result, and to pass the result to the sending unit 93.
The sending unit 93 is configured to receive the word segmentation result from the word segmentation unit 92 and send it to the word segmentation client, so that the word segmentation client identifies the target file according to the word segmentation result. The word segmentation request comprises at least one of the following: a word segmentation request made by a direct call, a word segmentation request made through a C or C++ interface call, and a word segmentation request made through a Java interface call.
In one embodiment, the receiving unit 91 is specifically configured to receive a word segmentation request in which the word segmentation client local to the mobile terminal calls the word segmentation server end directly; or to receive a word segmentation request in which the local word segmentation client calls the word segmentation server end through a C or C++ interface; or to receive a word segmentation request in which the local word segmentation client calls the word segmentation server end through a Java interface.
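The request received by the receiving unit 91 carries a target file and arrives through one of the three call modes. A minimal Java sketch of how such a request might be packed for cross-process delivery and unpacked on the server side is given below. The wire format (one byte for the call mode followed by a length-prefixed string), the enum `CallMode` and the class `SegmentRequest` are assumptions made for illustration; the source does not describe the message layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** The three call modes named in the text (enum name is hypothetical). */
enum CallMode { DIRECT, C_OR_CPP_INTERFACE, JAVA_INTERFACE }

/** Hypothetical request envelope: call mode plus the text of the target file. */
class SegmentRequest {
    final CallMode mode;
    final String targetText;

    SegmentRequest(CallMode mode, String targetText) {
        this.mode = mode;
        this.targetText = targetText;
    }

    /** Packs the request into bytes; writeUTF limits the encoded text to 65535 bytes. */
    byte[] pack() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeByte(mode.ordinal());
            out.writeUTF(targetText);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reverses pack(): reads the call mode byte, then the length-prefixed text. */
    static SegmentRequest unpack(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            CallMode mode = CallMode.values()[in.readByte()];
            String text = in.readUTF();
            return new SegmentRequest(mode, text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The client library would call `pack()` before handing the bytes to whatever inter-process channel the platform provides, and the receiving unit would call `unpack(...)` before passing the request on to the word segmentation unit.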
In another embodiment, the word segmentation server end 72 provided by the present invention further comprises a loading unit 94.
The loading unit 94 is configured to determine, before word segmentation calculation is performed on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to that dictionary, whether the dictionary corresponding to the mobile terminal has been loaded and, if not, to load the dictionary corresponding to the mobile terminal into local memory.
Specifically, when the dictionary corresponding to the mobile terminal has not been loaded, the dictionary is loaded and word segmentation calculation is then performed according to the dictionary and its corresponding word segmentation algorithm to obtain the word segmentation result. When the dictionary has already been loaded, word segmentation calculation is performed directly according to the dictionary and its corresponding word segmentation algorithm to obtain the word segmentation result. This check is needed because the dictionary corresponding to the mobile terminal does not remain loaded permanently: when no word segmentation request is received from the word segmentation client within a preset time, the dictionary is unloaded. In other words, to minimize memory usage at runtime, the mobile terminal periodically unloads the loaded dictionary and frees the memory it occupies.
In another embodiment, the word segmentation server end 72 further comprises an unloading unit 95.
The unloading unit 95 is configured to unload the dictionary when it determines that no word segmentation request has been received from the word segmentation client within a preset time. In other words, to minimize memory usage at runtime, the word segmentation server end periodically unloads the loaded dictionary and frees the memory it occupies.
In another embodiment, the receiving unit 91 is specifically configured to start the word segmentation service process at runtime, so as to listen for and receive the word segmentation request carrying the target file sent by the word segmentation client local to the mobile terminal.
Thus, in the mobile terminal provided by this embodiment of the present invention, the word segmentation client collects word segmentation requests and sends them to the word segmentation server end. After receiving a word segmentation request, the word segmentation server end performs word segmentation calculation on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to that dictionary, obtains a word segmentation result, and sends the result to the word segmentation client, so that the word segmentation client identifies the target file according to the word segmentation result. Word segmentation is thereby performed on the mobile terminal, the range of applications of word segmentation technology is expanded, and the memory of the mobile terminal is used more efficiently.
Obviously, those skilled in the art will understand that each of the above modules or steps of the present invention can be implemented by the communication terminal described above. Alternatively, embodiments of the present invention can be implemented as programs executable by a computing device, so that they can be stored in a storage device and executed by a processor; the programs may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk or an optical disc. Alternatively, the modules or steps can each be made into a separate integrated circuit module, or several of them can be made into a single integrated circuit module. The present invention is therefore not limited to any specific combination of hardware and software.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A word segmentation method, characterized in that the method comprises:
a word segmentation server end local to a mobile terminal receiving a word segmentation request carrying a target file sent by a word segmentation client local to the mobile terminal;
the word segmentation server end performing word segmentation calculation on the target file according to a dictionary corresponding to the mobile terminal and a word segmentation algorithm corresponding to the dictionary, and obtaining a word segmentation result.
2. The word segmentation method according to claim 1, characterized in that, before the word segmentation server end performs word segmentation calculation on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, the method further comprises:
the word segmentation server end determining whether the dictionary corresponding to the mobile terminal has been loaded and, if not, loading the dictionary corresponding to the mobile terminal into local memory.
3. The word segmentation method according to claim 1 or 2, characterized by further comprising:
a server selecting the dictionary corresponding to the mobile terminal according to the hardware configuration of the mobile terminal;
the server pushing the dictionary to the corresponding mobile terminal for storage.
4. The word segmentation method according to claim 3, characterized in that the server selecting the dictionary corresponding to the mobile terminal according to the hardware configuration of the mobile terminal comprises:
the server dividing a total dictionary to obtain a plurality of dictionaries of different types;
the server calculating an initial value of each dictionary, and calculating the value of each dictionary according to the initial value of the dictionary and word segmentation records;
the server selecting the dictionary corresponding to the mobile terminal according to the value of each dictionary, the file size of the dictionary and the memory size of the mobile terminal.
5. The word segmentation method according to claim 2, characterized in that, after the word segmentation server end loads the dictionary, the method further comprises:
the word segmentation server end unloading the dictionary when it determines that no word segmentation request has been received from the word segmentation client within a preset time.
6. The word segmentation method according to any one of claims 1 to 5, characterized in that the word segmentation server end local to the mobile terminal receiving the word segmentation request carrying the target file sent by the word segmentation client local to the mobile terminal comprises:
the word segmentation server end local to the mobile terminal starting a word segmentation service process at runtime, so as to listen for and receive the word segmentation request carrying the target file sent by the word segmentation client local to the mobile terminal.
7. A mobile terminal, characterized by comprising a word segmentation client and a word segmentation server end,
the word segmentation client comprising:
a collecting unit, configured to collect a word segmentation request and pass the word segmentation request to a sending unit;
the sending unit, configured to send the word segmentation request collected by the collecting unit to the word segmentation server end;
the word segmentation server end comprising:
a receiving unit, configured to receive the word segmentation request carrying a target file sent by the word segmentation client, and to pass the word segmentation request to a word segmentation unit;
the word segmentation unit, configured to receive the word segmentation request from the receiving unit, to perform word segmentation calculation on the target file according to a dictionary corresponding to the mobile terminal and a word segmentation algorithm corresponding to the dictionary, to obtain a word segmentation result, and to pass the word segmentation result to a sending unit;
the sending unit, configured to receive the word segmentation result from the word segmentation unit and send the word segmentation result to the word segmentation client, so that the word segmentation client identifies the target file according to the word segmentation result.
8. The mobile terminal according to claim 7, characterized by further comprising:
a loading unit, configured to determine, before word segmentation calculation is performed on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, whether the dictionary corresponding to the mobile terminal has been loaded and, if not, to load the dictionary corresponding to the mobile terminal into local memory.
9. The mobile terminal according to claim 7 or 8, characterized by further comprising:
an unloading unit, configured to unload the dictionary when it determines that no word segmentation request has been received from the word segmentation client within a preset time.
10. The mobile terminal according to any one of claims 7 to 9, characterized in that the receiving unit is specifically configured to start a word segmentation service process at runtime, so as to listen for and receive the word segmentation request carrying the target file sent by the word segmentation client local to the mobile terminal.
CN201310699645.4A 2013-12-18 2013-12-18 Word segmentation method and mobile terminal Pending CN103699524A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310699645.4A CN103699524A (en) 2013-12-18 2013-12-18 Word segmentation method and mobile terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310699645.4A CN103699524A (en) 2013-12-18 2013-12-18 Word segmentation method and mobile terminal

Publications (1)

Publication Number Publication Date
CN103699524A true CN103699524A (en) 2014-04-02

Family

ID=50361055

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310699645.4A Pending CN103699524A (en) 2013-12-18 2013-12-18 Word segmentation method and mobile terminal

Country Status (1)

Country Link
CN (1) CN103699524A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080221863A1 (en) * 2007-03-07 2008-09-11 International Business Machines Corporation Search-based word segmentation method and device for language without word boundary tag
CN101373468A (en) * 2007-08-20 2009-02-25 北京搜狗科技发展有限公司 Method for loading word stock, method for inputting character and input method system
CN102624647A (en) * 2012-01-12 2012-08-01 百度在线网络技术(北京)有限公司 Method for processing messages of mobile terminal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
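Zeng Qingxiang (曾庆祥): "Research and Design of a Local Resource Search Engine for Mobile Terminals" (移动终端本地资源搜索引擎的研究与设计), China Master's Theses Full-text Database, Information Science and Technology (中国优秀硕士学位论文全文数据库 信息科技辑) *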

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107622044A (en) * 2016-07-13 2018-01-23 阿里巴巴集团控股有限公司 Character string segmentation method, device and equipment
CN109299276A (en) * 2018-11-15 2019-02-01 阿里巴巴集团控股有限公司 Method and device for converting text into word embedding and text classification
CN109299276B (en) * 2018-11-15 2021-11-19 创新先进技术有限公司 Method and device for converting text into word embedding and text classification
CN111444716A (en) * 2020-03-30 2020-07-24 深圳市微购科技有限公司 Title word segmentation method, terminal and computer readable storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20140402