CN103699524A - Word segmentation method and mobile terminal - Google Patents
- Publication number
- CN103699524A CN201310699645.4A
- Authority
- CN
- China
- Prior art keywords
- participle
- mobile terminal
- dictionary
- word segmentation
- request
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a word segmentation method and a mobile terminal. In the method, a word segmentation server end local to the mobile terminal receives a word segmentation request, carrying a target file, sent by a word segmentation client local to the same mobile terminal. The server end then performs word segmentation on the target file, using a dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to that dictionary, and obtains a word segmentation result. Word segmentation technology can thus be used on mobile terminals, which broadens its field of application.
Description
Technical field
The present invention relates to computer technology, and in particular to a word segmentation method and a mobile terminal.
Background
With the development of computer technology, word segmentation has been widely applied in fields such as search engines, machine translation, speech synthesis, and automatic summarization. Chinese word segmentation refers to cutting a sentence or passage of Chinese text into individual Chinese words. Meanwhile, with the rapid spread of mobile terminals such as smartphones and tablet computers, the demand for word segmentation on mobile terminals keeps growing, for example for select-to-search (searching for a word the user selects on screen) and voice interaction.
Open-source word segmentation libraries exist in the prior art. For example, IKAnalyzer is an open-source, lightweight word segmentation toolkit developed in Java. The dictionary files of such libraries are large and occupy a lot of memory at run time, so they are mainly used on PCs or servers.
Because the dictionary files of these open-source libraries are too large and their run-time memory usage is too high, they cannot be applied directly on mobile terminals. In addition, the operating systems of existing mobile terminals do not provide a development interface for word segmentation.
Summary of the invention
In view of this, the purpose of the embodiments of the present invention is to provide a word segmentation method and a mobile terminal that make it possible to use word segmentation technology on a mobile terminal.
In a first aspect, an embodiment of the present invention provides a word segmentation method, the method comprising:
A word segmentation server end local to a mobile terminal receives a word segmentation request, carrying a target file, sent by a word segmentation client local to the mobile terminal;
The word segmentation server end performs word segmentation on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, and obtains a word segmentation result.
In a second aspect, an embodiment of the present invention provides a mobile terminal, the mobile terminal comprising a word segmentation client and a word segmentation server end:
The word segmentation client comprises:
A collecting unit, configured to collect word segmentation requests and pass them to the sending unit;
A sending unit, configured to send the word segmentation requests collected by the collecting unit to the word segmentation server end;
The word segmentation server end comprises:
A receiving unit, configured to receive the word segmentation request, carrying a target file, sent by the word segmentation client, and to pass the request to the word segmentation unit;
A word segmentation unit, configured to receive the request from the receiving unit, perform word segmentation on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, obtain a word segmentation result, and pass the result to the sending unit;
A sending unit, configured to receive the word segmentation result from the word segmentation unit and send it to the word segmentation client, so that the client identifies the target file according to the result.
In the embodiments of the present invention, the word segmentation server end local to the mobile terminal receives a word segmentation request, carrying a target file, sent by a local word segmentation client, performs word segmentation on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, and obtains a word segmentation result. Word segmentation is thereby carried out on the mobile terminal, which broadens the application of word segmentation technology.
Brief description of the drawings
Fig. 1 is a flowchart of the word segmentation method of the first embodiment of the invention;
Fig. 2 is a flowchart of the word segmentation method of the second embodiment of the invention;
Fig. 3 is a schematic diagram of a word segmentation request applicable in the second embodiment of the invention;
Fig. 4 is a schematic diagram of a word segmentation request applicable in the second embodiment of the invention;
Fig. 5 is a schematic diagram of a word segmentation request applicable in the second embodiment of the invention;
Fig. 6 is a flowchart of the word segmentation method of the third embodiment of the invention;
Fig. 7 is a schematic diagram of the mobile terminal of the fourth embodiment of the invention;
Fig. 8 is a schematic diagram of the word segmentation client of the fourth embodiment of the invention;
Fig. 9 is a schematic diagram of the word segmentation server end of the fourth embodiment of the invention.
Detailed description
To make the purpose, technical solutions, and advantages of the present invention clearer, specific embodiments of the invention are described in further detail below with reference to the accompanying drawings. The specific embodiments described here only explain the invention and do not limit it. For ease of description, the drawings show only the parts related to the invention rather than the full content.
Fig. 1 is a flowchart of the word segmentation method of the first embodiment of the invention. The method is applied to a mobile terminal, such as a smartphone or tablet computer. A word segmentation server end and at least one word segmentation client can be configured in the mobile terminal. The server end is dedicated to word segmentation processing; that is, the word segmentation function is encapsulated in an independent service process. The main reason is to ensure that only one copy of the dictionary file exists in memory, which keeps run-time memory usage to a minimum. The word segmentation client mainly issues word segmentation requests and obtains word segmentation results, and can be any application client that needs word segmentation. The scheme of this embodiment is executed by the word segmentation server end. As shown in Fig. 1, the method comprises:
Step 110: the word segmentation server end local to the mobile terminal receives a word segmentation request, carrying a target file, sent by a word segmentation client local to the mobile terminal.
Specifically, the word segmentation server end on the mobile terminal can receive a request carrying a target file sent by a word segmentation client on the same terminal. The request comprises at least one of the following: a request made by direct call, a request made through a C or C++ interface, and a request made through a Java interface. C, C++, and Java are all programming languages.
Specifically, after receiving the request carrying the target file, the server end can cut the target file, for example a passage of Chinese text, into individual Chinese words according to the dictionary it has loaded and the algorithm corresponding to that dictionary, thereby obtaining the word segmentation result.
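As a non-limiting illustration of dictionary-based segmentation, the sketch below uses forward maximum matching. The source does not name the algorithm that corresponds to the dictionary, so forward maximum matching is an assumption chosen for brevity, and the function name, the `max_word_len` parameter, and the toy dictionary are all hypothetical.

```python
def forward_max_match(text, dictionary, max_word_len=4):
    """Split text into words by greedy longest-prefix dictionary lookup.

    Assumed algorithm (forward maximum matching); not stated in the source.
    """
    words = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for length in range(min(max_word_len, len(text) - pos), 0, -1):
            candidate = text[pos:pos + length]
            # A single character is always accepted as a fallback.
            if length == 1 or candidate in dictionary:
                words.append(candidate)
                pos += length
                break
    return words


# Hypothetical toy dictionary, for illustration only.
toy_dictionary = {"移动", "终端", "移动终端", "分词", "技术"}
print(forward_max_match("移动终端分词技术", toy_dictionary))
```

With this toy dictionary the longest entry is preferred, so "移动终端" is kept as one word rather than split into "移动" and "终端".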
In a preferred implementation of this embodiment, before the server end receives the request carrying the target file in step 110, the method further comprises: the word segmentation server end in the mobile terminal, when it runs, starts a word segmentation service process to listen for and receive requests carrying target files sent by word segmentation clients local to the mobile terminal.
Specifically, after the mobile terminal starts and before the server end receives a request carrying a target file from a client, the server end starts the word segmentation service process but does not yet load the dictionary corresponding to the mobile terminal, which reduces the memory occupied on the mobile terminal. The service process is an independent process started when the mobile terminal boots; while it is running, it can receive requests sent by clients and perform word segmentation. Because the service process itself occupies very little memory while the dictionary corresponding to the mobile terminal occupies a great deal, the dictionary is not loaded immediately after the service process starts. It is loaded only after a request is received from a client.
In another preferred implementation of this embodiment, the receiving in step 110 of the request carrying the target file specifically comprises:
The word segmentation server end in the mobile terminal receives a request made by a local word segmentation client calling the server end directly; or the server end receives a request made by a local client calling the server end through a C or C++ interface; or the server end receives a request made by a local client calling the server end through a Java interface.
In another preferred implementation of this embodiment, before the server end performs word segmentation on the target file in step 120 according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, the method further comprises: the server end determines whether the dictionary corresponding to the mobile terminal has been loaded and, if not, loads that dictionary into local memory.
Specifically, the server end checks whether the dictionary corresponding to the mobile terminal has been loaded. If it has not, the server end loads the dictionary and then performs word segmentation according to the dictionary and its corresponding algorithm to obtain the result. If it has, the server end performs word segmentation directly according to the dictionary and its corresponding algorithm.
This check is needed because the dictionary corresponding to the mobile terminal is not permanently loaded: when no request from a client is received within a preset time, the dictionary is unloaded. In other words, to keep run-time memory usage to a minimum, the mobile terminal periodically unloads the loaded dictionary and frees the memory it occupies.
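The load-on-demand check described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the class name, the `load_dictionary` and `segment` callbacks, and the `load_count` counter are hypothetical and exist only to show that the dictionary is absent at start-up and loaded once on the first request.

```python
class SegmentationService:
    """Keeps at most one in-memory copy of the dictionary, loaded on demand."""

    def __init__(self, load_dictionary, segment):
        self._load_dictionary = load_dictionary  # hypothetical loader callback
        self._segment = segment                  # hypothetical algorithm callback
        self._dictionary = None                  # nothing is loaded at start-up
        self.load_count = 0                      # illustration only

    def is_loaded(self):
        return self._dictionary is not None

    def handle_request(self, text):
        # Load the dictionary only if it is not already in memory.
        if not self.is_loaded():
            self._dictionary = self._load_dictionary()
            self.load_count += 1
        return self._segment(text, self._dictionary)


# Toy callbacks: a tiny dictionary and a whitespace-based stand-in segmenter.
service = SegmentationService(
    load_dictionary=lambda: {"word", "segmentation"},
    segment=lambda text, dictionary: [t for t in text.split() if t in dictionary],
)
```

Calling `service.handle_request(...)` twice loads the dictionary only on the first call; the second call reuses the copy already in memory.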
In another preferred implementation of this embodiment, the method further comprises: a server selects the dictionary corresponding to the mobile terminal according to the hardware configuration of the mobile terminal, and pushes that dictionary to the mobile terminal for storage. This implementation additionally uses the external resources of a network-side server to provide the dictionary to the mobile terminal. There are many possible ways to select dictionaries; this embodiment describes only one of them in detail and does not elaborate on the others.
The process by which the server selects dictionaries according to the hardware configuration of the mobile terminal is illustrated below.
(1) The server divides the total dictionary into multiple dictionaries of different types.
The word segmentation dictionaries used in the prior art generally occupy a lot of memory, for example 200 MB, so such large dictionary files cannot be ported to a mobile terminal such as a mobile phone. The existing dictionary therefore needs to be optimized. The optimization can proceed as follows:
The Chinese vocabulary is divided into two broad classes: a general dictionary and proper-noun dictionaries. The general dictionary contains the Chinese words used in everyday life; it is relatively small and can be compressed to 1-2 MB. Proper nouns can be further classified, for example into science and technology, history, people, and art; art can in turn be subdivided into film, song, literature, and so on. For the concrete application scenario of Chinese word segmentation, the dictionary types with high priority are selected to form the proper-noun dictionaries, which are defined here as sub-dictionaries.
In addition, because hardware configurations differ, different mobile terminals have different requirements for dictionary files. Some low-end mobile terminals have a small read-only memory (ROM) and place strict limits on dictionary files, so several dictionaries must be chosen from the proper-noun dictionaries that meet user needs while their total size stays within a limit. The selection method provided in this embodiment is as follows. Suppose the total size of the dictionaries required by the mobile terminal must not exceed M, and there are k dictionaries w(1), w(2), ..., w(k) whose file sizes are m(1), m(2), ..., m(k).
(2) Calculate the initial value of each dictionary. The more often a dictionary is hit during word segmentation, the more valuable it is. Suppose dictionary w(i) was hit h times in routine web-search segmentation and H web-search segmentations were performed in total; the initial value init_value(i) of dictionary w(i) is then given by formula (1).
init_value(i) = (h / H) × 100    (1)
Here, init_value(i) represents the average number of hits of dictionary w(i) per 100 search segmentations.
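Formula (1) can be checked with a short calculation. The sketch below is illustrative only; the function name and the sample counts (250 hits in 1000 searches) are hypothetical and do not come from the source.

```python
def init_value(hits, total_searches):
    """Formula (1): average hits of a dictionary per 100 search segmentations."""
    if total_searches <= 0:
        raise ValueError("total_searches must be positive")
    return hits / total_searches * 100


# Hypothetical counts: a dictionary hit 250 times in 1000 web-search segmentations.
print(init_value(250, 1000))  # 25.0
```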
(3) Calculate the value of each dictionary from its initial value and the segmentation records. The segmentation records are collected from all word segmentation clients on the mobile terminal. Suppose all clients on the mobile terminal have performed N segmentations in total, of which dictionary w(i) was hit n(i) times; the value v(i) of dictionary w(i) is then given by formula (2).
v(i) = 100 × [n(i) + β × init_value(i)] / (N + β × 100)    (2)
Here, β is a tuning parameter; the larger it is, the greater the influence of the initial value. The more segmentations the clients perform on the mobile terminal, the closer v(i) comes to the actual hit statistics of dictionary w(i) in the real usage scenario of that mobile terminal.
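The behaviour of formula (2) can be seen from a few sample values. The sketch below is illustrative; the function and parameter names and all numbers are hypothetical. With no on-device records (N = 0) the value equals the initial value, and as N grows the value moves towards the on-device hit rate per 100 segmentations.

```python
def dictionary_value(n_i, n_total, init_value_i, beta):
    """Formula (2): v(i) = 100 * (n(i) + beta * init_value(i)) / (N + beta * 100)."""
    denominator = n_total + beta * 100
    if denominator <= 0:
        raise ValueError("N + beta * 100 must be positive")
    return 100 * (n_i + beta * init_value_i) / denominator


# Hypothetical numbers: initial value 25, then 500 hits in 1000 on-device segmentations.
print(dictionary_value(0, 0, 25, beta=1))        # no records yet: equals the initial value
print(dictionary_value(500, 1000, 25, beta=1))   # small beta: close to the on-device rate of 50
print(dictionary_value(500, 1000, 25, beta=10))  # large beta: pulled towards the initial value
```

In effect, β acts like β × 100 prior segmentations carrying β × init_value(i) prior hits, which is why a larger β gives the initial value more weight.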
(4) Select the dictionaries corresponding to the mobile terminal according to the value and file size of each dictionary and the memory size of the mobile terminal. The memory occupied by the selected dictionaries must be less than the total memory of the mobile terminal, and dictionaries with a high value and a small file size are preferred. A greedy algorithm can also be used: at each step, choose the dictionary with the largest v(i)/m(i); if the v(i)/m(i) values are close, prefer the dictionary with the smaller m(i); continue until the total size M of the dictionaries required by the mobile terminal is reached.
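The greedy selection can be sketched as below. Several details are assumptions rather than part of the source: the tuple layout of the candidates, the sample values and sizes, breaking only exact ties in v(i)/m(i) by smaller size (the source says "close" without defining a threshold), and skipping a dictionary that does not fit while continuing with smaller ones.

```python
def choose_dictionaries(candidates, size_limit):
    """Greedily pick dictionaries by value per unit size within a total size limit.

    candidates: list of (name, value, size) tuples, with size > 0.
    """
    # Highest v/m first; exact ties go to the smaller file (assumed tie rule).
    ranked = sorted(candidates, key=lambda c: (-c[1] / c[2], c[2]))
    chosen, used = [], 0.0
    for name, value, size in ranked:
        # Assumption: skip a dictionary that does not fit and keep trying others.
        if used + size <= size_limit:
            chosen.append(name)
            used += size
    return chosen


# Hypothetical sub-dictionaries: (name, value v(i), file size m(i) in MB).
candidates = [
    ("film", 30, 2.0),
    ("song", 20, 1.0),
    ("history", 10, 2.0),
    ("science", 5, 0.5),
]
print(choose_dictionaries(candidates, size_limit=3.5))
```

Greedy selection by value density is simple and fast but is not guaranteed to maximize the total value under the size limit.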
In another preferred implementation of this embodiment, after the dictionary corresponding to the mobile terminal is loaded, the method further comprises: when the word segmentation server end determines that no request has been received from any word segmentation client within a preset time, it unloads the dictionary. In other words, to keep run-time memory usage to a minimum, the server end periodically unloads the loaded dictionary and frees the memory it occupies.
The experimental results of the word segmentation method provided by the embodiments of the present invention on a mobile terminal running the Android platform are as follows: the total size of the executable dynamic libraries and dictionary files is 7 MB; the physical memory occupied at run time is 8 MB; the response time to a Chinese word segmentation request is far below 1 ms; the accuracy of basic word segmentation is greater than 80%, and the segmentation accuracy for proper nouns is greater than 87%.
Therefore, in the word segmentation method provided by the embodiments of the present invention, the word segmentation server end local to the mobile terminal receives a request carrying a target file from a word segmentation client, performs word segmentation on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, and obtains a word segmentation result. Word segmentation is thereby carried out on the mobile terminal, which broadens the application of word segmentation technology.
Fig. 2 is a flowchart of the word segmentation method of the second embodiment of the invention. The method is applied to a mobile terminal, such as a smartphone or tablet computer. As shown in Fig. 2, the method comprises:
Step 210: the word segmentation server end local to the mobile terminal receives a word segmentation request, carrying a target file, sent by a word segmentation client local to the mobile terminal.
Specifically, the local word segmentation client can send the request carrying the target file to the local word segmentation server end through different interfaces.
Specifically, the word segmentation result is what the server end obtains by cutting the target file, for example a passage of Chinese text, into individual Chinese words according to the dictionary it has loaded and the algorithm corresponding to that dictionary.
Step 250: the word segmentation server end sends the computed word segmentation result to the word segmentation client, so that the client identifies, according to the result, the target file it needs to identify.
Step 260: when the word segmentation server end determines that no request has been received from a word segmentation client within a preset time, it unloads the dictionary.
Specifically, to keep run-time memory usage to a minimum, the server end periodically unloads the loaded dictionary and frees the memory it occupies.
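The idle-timeout unloading of step 260 can be sketched as below. This is a minimal sketch under stated assumptions: the class name, the `loader` callback, the `idle_timeout_s` parameter, and the injectable `clock` are hypothetical, and the sketch assumes that some periodic timer in the service process calls `unload_if_idle()`.

```python
import time


class IdleUnloadingDictionary:
    """Loads a dictionary on demand and drops it after a period without requests."""

    def __init__(self, loader, idle_timeout_s, clock=time.monotonic):
        self._loader = loader                # hypothetical loader callback
        self._idle_timeout_s = idle_timeout_s
        self._clock = clock                  # injectable so the timeout can be tested
        self._dictionary = None
        self._last_request = None

    @property
    def loaded(self):
        return self._dictionary is not None

    def get(self):
        """Return the dictionary, loading it if needed, and record the request time."""
        if self._dictionary is None:
            self._dictionary = self._loader()
        self._last_request = self._clock()
        return self._dictionary

    def unload_if_idle(self):
        """Unload if no request arrived within the timeout; return True if unloaded."""
        if self._dictionary is None:
            return False
        if self._clock() - self._last_request >= self._idle_timeout_s:
            self._dictionary = None          # release the reference so memory can be freed
            return True
        return False
```

A later call to `get()` simply reloads the dictionary, which matches the check-then-load behaviour described for incoming requests.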
In a preferred implementation of this embodiment, the local word segmentation client can send the request carrying the target file to the local word segmentation server end through different interfaces, so the server end can receive requests sent by one or more clients through different interfaces. For example, a request can be sent to the server end in at least one of the following ways: by direct call, by a call through a C or C++ interface, or by a call through a Java interface.
The following describes in detail how the word segmentation client sends requests to the server end through the different interfaces and how the server end then provides the word segmentation service. The client contains a word segmentation client library. This library is a dynamic library that the word segmentation client offers to its users, and it encapsulates the function by which the client calls the server end. The client library packs and encapsulates the request data and sends a message across processes to the word segmentation service process of the server end. The client also needs to take thread safety into account.
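The two duties of the client library, packing the request and staying thread-safe, can be sketched as below. The source does not specify a wire format or a transport, so everything here is an assumption: the class and function names, the length-prefixed JSON frame, the request `id` field, and the `send_bytes` callback that stands in for the cross-process channel.

```python
import json
import struct
import threading


class SegmentationClientLibrary:
    """Packs segmentation requests and hands them to a cross-process transport."""

    def __init__(self, send_bytes):
        self._send_bytes = send_bytes  # hypothetical transport, e.g. a socket send
        self._lock = threading.Lock()  # serializes callers from different threads
        self._next_id = 0

    def request(self, text):
        with self._lock:
            self._next_id += 1
            payload = json.dumps({"id": self._next_id, "text": text}).encode("utf-8")
            # Assumed frame layout: 4-byte big-endian length, then the payload.
            frame = struct.pack("!I", len(payload)) + payload
            return self._send_bytes(frame)


def unpack_request(frame):
    """Server-side counterpart: read the length prefix and decode the payload."""
    (length,) = struct.unpack("!I", frame[:4])
    return json.loads(frame[4:4 + length].decode("utf-8"))
```

Holding the lock while assigning the request number and sending keeps frames from concurrent threads from interleaving on a shared channel.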
(1) Sending a request to the word segmentation server end by direct call. This is the most direct way: the application of the word segmentation client calls the word segmentation client library directly, thereby starting the word segmentation service of the server end. As shown in Fig. 3, the application 31 in the client directly calls the client library 32; the client library 32 packs and encapsulates the request data produced by the application 31 and sends a message across processes to the server end. When the word segmentation service process of the server end receives the request sent by the client, it checks whether the dictionary corresponding to the mobile terminal has been loaded. If the dictionary has not been loaded, it is loaded; if it has, it is not loaded again. The server end then performs word segmentation on the received target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, obtains the result, and returns it to the client.
(2) Sending a request to the word segmentation server end through a C or C++ interface. This way applies to the web view (WebView) scenario: the application of the client calls the client library through a C or C++ interface, thereby starting the word segmentation service of the mobile terminal. WebView is a view used to display web pages; the Chinese text on a web page is segmented so that the user can conveniently select a word to search. As shown in Fig. 4, the application 41 in the client calls the client library 43 through the C or C++ interface 42; the client library 43 packs and encapsulates the request data produced by the application 41 and sends a message across processes to the server end. When the word segmentation service process of the server end receives the request sent by the client, it checks whether the dictionary corresponding to the mobile terminal has been loaded. If the dictionary has not been loaded, it is loaded; if it has, it is not loaded again. The server end then performs word segmentation on the received target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, obtains the result, and returns it to the client.
(3) Sending a request to the word segmentation server end through a Java interface. This way applies to the text view (TextView) scenario. Here the mobile terminal provides not only the client library but also a word segmentation JNI library layered on top of it. The JNI library lets the Java layer call into the C++ layer. In addition, to make calls from the Java layer convenient, a Java helper class is provided so that users of the Java layer can use Java function interfaces directly. The call sequence is as follows: the Java-layer interface of the Java helper class is called first; the helper class then calls the C interface provided by the JNI library; finally the client library is called through the C interface, thereby starting the word segmentation service of the mobile terminal. TextView is a view used to display text, and it also needs select-to-search. As shown in Fig. 5, the application 51 in the client calls the client library through the Java helper class 52 and the JNI library 53; the client library packs and encapsulates the request data produced by the application 51 and sends a message across processes to the server end. When the word segmentation service process of the server end receives the request sent by the client, it checks whether the dictionary corresponding to the mobile terminal has been loaded. If the dictionary has not been loaded, it is loaded; if it has, it is not loaded again. The server end then performs word segmentation on the received target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, obtains the result, and returns it to the client.
Of course, those skilled in the art will understand that the word segmentation client can also use other interfaces to send requests to the server end. Only the three ways above are described in detail here; the others are not described one by one.
Therefore, in the word segmentation method provided by this embodiment of the present invention, the word segmentation server end receives a request carrying a target file from a word segmentation client, performs word segmentation on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, obtains a word segmentation result, and sends the result to the client. When no request is received from a client within a preset time, the server end unloads the dictionary. Word segmentation is thereby carried out on the mobile terminal, which broadens the application of word segmentation technology and also improves the memory utilization of the mobile terminal.
Fig. 6 is a flowchart of the word segmentation method of the third embodiment of the invention. The method is applied to a mobile terminal, such as a smartphone or tablet computer. As shown in Fig. 6, the method comprises:
Step 602: the word segmentation server end local to the mobile terminal starts the word segmentation service process but does not yet load the dictionary corresponding to the mobile terminal, which reduces the memory occupied on the mobile terminal.
Step 603: the word segmentation client local to the mobile terminal waits for a word segmentation call.
Step 604: the application of the word segmentation client invokes word segmentation, produces a request carrying a target file, and calls the client library of the word segmentation client through one of the different interfaces. The client can send the request to the server end by direct call, by a call through a C or C++ interface, or by a call through a Java interface.
Therefore, the word segmentation method provided by this embodiment of the present invention carries out word segmentation on the mobile terminal and broadens the application of word segmentation technology.
Fig. 7 is the schematic diagram of the mobile terminal of fourth embodiment of the invention.This mobile terminal is for carrying out the segmenting method of first embodiment of the invention and the 3rd embodiment.As shown in Figure 7, described mobile terminal 70 comprises: participle client 71 and Chinese Word Segmentation Service device end 72.
Collector unit 81 is for collecting participle request, and described participle request is transferred to transmitting element 82.
Transmitting element 82 is sent to Chinese Word Segmentation Service device end 72 for the participle request that collector unit is collected.
Chinese Word Segmentation Service end 72 comprises: receiving element 91, participle unit 92 and transmitting element 93, as shown in Figure 9.
The described participle request that carries file destination that receiving element 91 sends for receiving participle client 71, and described participle request is transferred to participle unit 92.
Specifically, receiving unit 91 can receive word segmentation requests carrying target files from multiple word segmentation clients local to the mobile terminal. Before receiving unit 91 receives a word segmentation request carrying a target file from the word segmentation client 71, the dictionary corresponding to the mobile terminal 70 is not loaded, which reduces the memory occupied on the mobile terminal. The word segmentation service process is an independent process that starts when the mobile terminal powers on; while it is running, it can receive word segmentation requests from the word segmentation client and perform word segmentation. Because the service process itself occupies very little memory, whereas the dictionary corresponding to the mobile terminal occupies a great deal, the dictionary is not loaded as soon as the service process starts. It is loaded only after a word segmentation request has been received from a word segmentation client.
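The deferred loading described above can be sketched as a small service loop. This is only an illustration under stated assumptions: the patent does not disclose an implementation, so the queue-based transport, the `serve` function, the `None` stop sentinel, and the `load_dictionary` and `segment` callables are all hypothetical names, and a thread stands in for the independent service process.

```python
import queue
import threading


def serve(requests, load_dictionary, segment):
    """Hypothetical service loop: the dictionary is loaded on the first request only."""
    dictionary = None  # nothing is loaded when the service starts
    while True:
        item = requests.get()
        if item is None:  # sentinel used in this sketch to stop the loop
            break
        text, reply = item
        if dictionary is None:
            dictionary = load_dictionary()  # deferred, memory-heavy step
        reply.put(segment(text, dictionary))


def run_demo():
    """Send requests from two simulated clients and count dictionary loads."""
    loads = []

    def load_dictionary():
        loads.append(1)
        return {"移动", "终端"}

    def segment(text, dictionary):
        # Placeholder algorithm (an assumption): split into two-character chunks.
        return [text[i:i + 2] for i in range(0, len(text), 2)]

    requests = queue.Queue()
    worker = threading.Thread(target=serve, args=(requests, load_dictionary, segment))
    worker.start()
    replies = [queue.Queue(), queue.Queue()]
    requests.put(("移动终端", replies[0]))
    requests.put(("终端移动", replies[1]))
    results = [reply.get() for reply in replies]
    requests.put(None)
    worker.join()
    return results, len(loads)
```

In this sketch both clients share one service loop, and the dictionary loader runs once regardless of how many requests arrive.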
Sending unit 93 is configured to receive the word segmentation result from segmentation unit 92 and send it to the word segmentation client, so that the word segmentation client can identify the target file according to the result. The word segmentation request comprises at least one of the following: a request made by direct call, a request made through a C or C++ interface, and a request made through a Java interface.
In one embodiment, receiving unit 91 is specifically configured to receive a word segmentation request in which the word segmentation client local to the mobile terminal calls the word segmentation server directly; or to receive a request in which the client calls the server through a C or C++ interface; or to receive a request in which the client calls the server through a Java interface.
In another embodiment, the word segmentation server 72 provided by the present invention further comprises a loading unit 94.
Specifically, when the dictionary corresponding to the mobile terminal has not been loaded, the dictionary is loaded, and word segmentation is then performed according to the dictionary and its corresponding word segmentation algorithm to obtain the word segmentation result. When the dictionary has already been loaded, word segmentation is performed directly according to the dictionary and its corresponding algorithm to obtain the result. This process therefore determines whether the dictionary corresponding to the mobile terminal has been loaded. The check is needed because the dictionary does not remain loaded permanently: when no word segmentation request is received from the word segmentation client within a preset time, the dictionary is unloaded. In other words, to keep runtime memory usage to a minimum, the mobile terminal periodically unloads the loaded dictionary and frees the memory it occupied.
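The two branches described above can be sketched as follows. The patent does not name the word segmentation algorithm, so greedy forward maximum matching is used here purely as an assumed stand-in; `SegmentationUnit`, `forward_max_match`, and the `load_dictionary` callable are hypothetical names introduced for illustration.

```python
class SegmentationUnit:
    """Hypothetical segmentation unit that loads its dictionary on demand."""

    def __init__(self, load_dictionary):
        self._load_dictionary = load_dictionary
        self._dictionary = None  # not loaded yet

    def is_loaded(self):
        return self._dictionary is not None

    def segment(self, text):
        # Branch 1: the dictionary is not loaded, so load it first.
        if not self.is_loaded():
            self._dictionary = self._load_dictionary()
        # Branch 2 (and the continuation of branch 1): segment with the loaded dictionary.
        return forward_max_match(text, self._dictionary)


def forward_max_match(text, dictionary):
    """Greedy forward maximum matching; an assumed algorithm, not the patent's."""
    max_len = max((len(word) for word in dictionary), default=1)
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words
```

Characters that match no dictionary entry are emitted one at a time, which keeps the sketch total for any input string.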
In another embodiment, the word segmentation server 72 further comprises an unloading unit 95.
Unloading unit 95 is configured to unload the dictionary when it determines that no word segmentation request has been received from the word segmentation client within a preset time. In other words, to keep runtime memory usage to a minimum, the word segmentation server periodically unloads the loaded dictionary and frees the memory it occupied.
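A minimal sketch of such an idle-timeout unload is shown below. The class name `DictionaryHolder`, its methods, and the injectable `clock` parameter are assumptions made for illustration; the patent specifies only that the dictionary is unloaded when no request arrives within a preset time.

```python
import time


class DictionaryHolder:
    """Hypothetical holder that drops the dictionary after an idle period."""

    def __init__(self, load_dictionary, idle_seconds, clock=time.monotonic):
        self._load_dictionary = load_dictionary
        self._idle_seconds = idle_seconds
        self._clock = clock  # injectable so the timeout can be tested without waiting
        self._dictionary = None
        self._last_request = None

    def get(self):
        """Called for every segmentation request; reloads the dictionary if needed."""
        if self._dictionary is None:
            self._dictionary = self._load_dictionary()
        self._last_request = self._clock()
        return self._dictionary

    def unload_if_idle(self):
        """Called periodically; returns True if the dictionary was unloaded."""
        if self._dictionary is None:
            return False
        if self._clock() - self._last_request >= self._idle_seconds:
            self._dictionary = None  # release the memory held by the dictionary
            return True
        return False

    def is_loaded(self):
        return self._dictionary is not None
```

A periodic timer in the service process would call `unload_if_idle`, and the next request after an unload would simply trigger a reload through `get`.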
In another embodiment, receiving unit 91 is specifically configured to start the word segmentation service process at run time, in order to listen for and receive word segmentation requests carrying target files sent by the word segmentation client local to the mobile terminal.
Therefore, in the mobile terminal provided by this embodiment of the present invention, the word segmentation client collects word segmentation requests and sends them to the word segmentation server. After receiving a request, the word segmentation server performs word segmentation on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to that dictionary, obtains a word segmentation result, and sends the result to the word segmentation client, so that the client can identify the target file according to the result. This makes word segmentation possible on the mobile terminal, broadens the application of word segmentation technology, and improves the memory utilization of the mobile terminal.
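The division of work between the client units and the server units can be sketched as below. The class and method names are hypothetical, the "target file" is represented by a plain string, and the segmentation function is injected as a callable because the patent does not fix a particular algorithm; in-process method calls stand in for whatever transport a real implementation would use.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SegmentationRequest:
    target_text: str  # stands in for the "target file" carried by the request


class Server:
    """Plays the roles of receiving unit 91, segmentation unit 92 and sending unit 93."""

    def __init__(self, segment: Callable[[str], List[str]]):
        self._segment = segment

    def receive(self, request, deliver):
        result = self._segment(request.target_text)  # segmentation unit 92
        deliver(result)  # sending unit 93 returns the result to the client


class Client:
    """Plays the roles of collecting unit 81 and sending unit 82."""

    def __init__(self, server):
        self._server = server
        self.results = []

    def collect(self, text):
        request = SegmentationRequest(text)  # collecting unit 81
        self._send(request)

    def _send(self, request):
        self._server.receive(request, self.results.append)  # sending unit 82
```

For example, `Client(Server(str.split)).collect("word segmentation")` would leave `[["word", "segmentation"]]` in the client's `results` list, with whitespace splitting used only as a placeholder segmenter.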
Obviously, those skilled in the art will understand that the modules and steps of the present invention described above can be implemented by the communication terminal described above. Alternatively, embodiments of the present invention can be implemented as programs executable by a computing device, so that they can be stored in a storage device and executed by a processor. The program can be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disc. Alternatively, the modules can each be made into separate integrated circuit modules, or multiple modules or steps among them can be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The foregoing describes only the preferred embodiments of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (10)
1. A word segmentation method, characterized in that the method comprises:
a word segmentation server local to a mobile terminal receiving a word segmentation request carrying a target file sent by a word segmentation client local to the mobile terminal;
the word segmentation server performing word segmentation on the target file according to a dictionary corresponding to the mobile terminal and a word segmentation algorithm corresponding to the dictionary, and obtaining a word segmentation result.
2. The word segmentation method according to claim 1, characterized in that, before the word segmentation server performs word segmentation on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, the method further comprises:
the word segmentation server determining whether the dictionary corresponding to the mobile terminal has been loaded and, if not, loading the dictionary corresponding to the mobile terminal into local memory.
3. The word segmentation method according to claim 1 or 2, characterized in that the method further comprises:
a server selecting the dictionary corresponding to the mobile terminal according to the hardware configuration of the mobile terminal;
the server pushing the dictionary to the corresponding mobile terminal for storage.
4. The word segmentation method according to claim 3, characterized in that the server selecting the dictionary corresponding to the mobile terminal according to the hardware configuration of the mobile terminal comprises:
the server dividing a total dictionary to obtain a plurality of dictionaries of different types;
the server calculating an initial value of each dictionary, and calculating a value of each dictionary according to the initial value of the dictionary and word segmentation records;
the server selecting the dictionary corresponding to the mobile terminal according to the value of each dictionary, the file size of the dictionary, and the memory size of the mobile terminal.
5. The word segmentation method according to claim 2, characterized in that, after the word segmentation server loads the dictionary, the method further comprises:
the word segmentation server unloading the dictionary when it determines that no word segmentation request has been received from the word segmentation client within a preset time.
6. The word segmentation method according to any one of claims 1 to 5, characterized in that the word segmentation server local to the mobile terminal receiving the word segmentation request carrying a target file sent by the word segmentation client local to the mobile terminal comprises:
the word segmentation server local to the mobile terminal starting a word segmentation service process at run time, in order to listen for and receive the word segmentation request carrying a target file sent by the word segmentation client local to the mobile terminal.
7. A mobile terminal, characterized in that it comprises a word segmentation client and a word segmentation server,
the word segmentation client comprising:
a collecting unit, configured to collect word segmentation requests and pass the word segmentation requests to a sending unit;
a sending unit, configured to send the word segmentation requests collected by the collecting unit to the word segmentation server;
the word segmentation server comprising:
a receiving unit, configured to receive the word segmentation request carrying a target file sent by the word segmentation client, and to pass the word segmentation request to a segmentation unit;
a segmentation unit, configured to receive the word segmentation request from the receiving unit, perform word segmentation on the target file according to a dictionary corresponding to the mobile terminal and a word segmentation algorithm corresponding to the dictionary, obtain a word segmentation result, and pass the word segmentation result to a sending unit;
a sending unit, configured to receive the word segmentation result from the segmentation unit and send the word segmentation result to the word segmentation client, so that the word segmentation client identifies the target file according to the word segmentation result.
8. The mobile terminal according to claim 7, characterized in that it further comprises:
a loading unit, configured to determine, before word segmentation is performed on the target file according to the dictionary corresponding to the mobile terminal and the word segmentation algorithm corresponding to the dictionary, whether the dictionary corresponding to the mobile terminal has been loaded and, if not, to load the dictionary corresponding to the mobile terminal into local memory.
9. The mobile terminal according to claim 7 or 8, characterized in that it further comprises:
an unloading unit, configured to unload the dictionary when it determines that no word segmentation request has been received from the word segmentation client within a preset time.
10. The mobile terminal according to any one of claims 7 to 9, characterized in that the receiving unit is specifically configured to start a word segmentation service process at run time, in order to listen for and receive the word segmentation request carrying a target file sent by the word segmentation client local to the mobile terminal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310699645.4A CN103699524A (en) | 2013-12-18 | 2013-12-18 | Word segmentation method and mobile terminal |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310699645.4A CN103699524A (en) | 2013-12-18 | 2013-12-18 | Word segmentation method and mobile terminal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103699524A (en) | 2014-04-02 |
Family
ID=50361055
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310699645.4A Pending CN103699524A (en) | 2013-12-18 | 2013-12-18 | Word segmentation method and mobile terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103699524A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107622044A (en) * | 2016-07-13 | 2018-01-23 | 阿里巴巴集团控股有限公司 | Word segmentation method, apparatus, and device for character strings |
CN109299276A (en) * | 2018-11-15 | 2019-02-01 | 阿里巴巴集团控股有限公司 | Method and device for converting text into word embeddings and for text classification |
CN111444716A (en) * | 2020-03-30 | 2020-07-24 | 深圳市微购科技有限公司 | Title word segmentation method, terminal and computer readable storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080221863A1 (en) * | 2007-03-07 | 2008-09-11 | International Business Machines Corporation | Search-based word segmentation method and device for language without word boundary tag |
CN101373468A (en) * | 2007-08-20 | 2009-02-25 | 北京搜狗科技发展有限公司 | Method for loading word stock, method for inputting character and input method system |
CN102624647A (en) * | 2012-01-12 | 2012-08-01 | 百度在线网络技术(北京)有限公司 | Method for processing messages of mobile terminal |
-
2013
- 2013-12-18 CN CN201310699645.4A patent/CN103699524A/en active Pending
Non-Patent Citations (1)
Title |
---|
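Zeng Qingxiang (曾庆祥): "Research and Design of a Local Resource Search Engine for Mobile Terminals" (移动终端本地资源搜索引擎的研究与设计), China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》) * |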
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107622044A (en) * | 2016-07-13 | 2018-01-23 | 阿里巴巴集团控股有限公司 | Word segmentation method, apparatus, and device for character strings |
CN109299276A (en) * | 2018-11-15 | 2019-02-01 | 阿里巴巴集团控股有限公司 | Method and device for converting text into word embeddings and for text classification |
CN109299276B (en) * | 2018-11-15 | 2021-11-19 | 创新先进技术有限公司 | Method and device for converting text into word embedding and text classification |
CN111444716A (en) * | 2020-03-30 | 2020-07-24 | 深圳市微购科技有限公司 | Title word segmentation method, terminal and computer readable storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6601470B2 (en) | NATURAL LANGUAGE GENERATION METHOD, NATURAL LANGUAGE GENERATION DEVICE, AND ELECTRONIC DEVICE | |
US20060122836A1 (en) | Dynamic switching between local and remote speech rendering | |
CN105988996B (en) | Index file generation method and device | |
CN108287918B (en) | Music playing method and device based on application page, storage medium and electronic equipment | |
US20150262571A1 (en) | Single interface for local and remote speech synthesis | |
US11240290B2 (en) | Application download method and apparatus, application sending method and apparatus, and system | |
CN104765750A (en) | Input language switch method and device for input method application | |
WO2021109981A1 (en) | Information display method and apparatus | |
CN115994536B (en) | Text information processing method, system, equipment and computer storage medium | |
CN112149404A (en) | Method, device and system for identifying risk content of user privacy data | |
CN103699524A (en) | Word segmentation method and mobile terminal | |
RU2616164C2 (en) | Methods and device for browser engine work | |
CN111046634A (en) | Document processing method, document processing device, computer equipment and storage medium | |
CN111160029A (en) | Information processing method and device, electronic equipment and computer readable storage medium | |
US20210142803A1 (en) | Information processing system, method, device and equipment | |
CN113886033A (en) | Task processing method and device | |
CN103577604B (en) | A kind of image index structure for Hadoop distributed environments | |
CN115495020A (en) | File processing method and device, electronic equipment and readable storage medium | |
US10362082B2 (en) | Method for streaming-based distributed media data processing | |
US20210263961A1 (en) | Coarse-to-fine multimodal gallery search system with attention-based neural network models | |
CN113885969A (en) | Embedded device, embedded software loading method and storage medium | |
CN117455015B (en) | Model optimization method and device, storage medium and electronic equipment | |
CN117421123B (en) | GPU resource adjustment method and system, electronic equipment and storage medium | |
CN117348999B (en) | Service execution system and service execution method | |
CN117592102A (en) | Service execution method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20140402 |