CN104462145A - Statement generation method and device - Google Patents

Info

Publication number
CN104462145A
Authority
CN
China
Prior art keywords
data message
statement
format
words
terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310440040.3A
Other languages
Chinese (zh)
Other versions
CN104462145B (en)
Inventor
董振华
欧阳靖民
张弓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201310440040.3A
Publication of CN104462145A
Application granted
Publication of CN104462145B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis

Abstract

The invention discloses a statement generation method and device. The method includes the steps of collecting at least one piece of data information of a terminal, wherein the data information includes at least one of the running information of the terminal, the operation information of the terminal and the information received by the terminal from an external interface; determining the sentence constituent of each piece of data information in a to-be-constituted statement; constituting the statement through the data information according to the determined sentence constituents in the to-be-constituted statement. The invention further discloses a corresponding device. Due to the adoption of the technical scheme of the statement generation method and device, statements can be automatically generated according to various types of data information of the terminal, activities or events happening on the terminal can be integrally described through the statements, and users can conveniently and automatically record the activities or the events through the terminal.

Description

Statement generation method and device
Technical field
The present invention relates to the field of language technology, and in particular to a statement generation method and device.
Background
An automatic diary on an intelligent terminal saves people the cost of recording events and can record the context in which an event occurs from multiple dimensions and perspectives, so that the event can be reproduced objectively. At the same time, the popularization of intelligent terminals provides usable information sources and a multi-dimensional data basis for generating automatic diaries. However, one prior-art method for generating an automatic diary relies mainly on text source data, such as blog posts, social network information, short messages, and contact information. Features are extracted from this text and a diary is generated, but when the source data lacks a text description, no diary can be generated. Another prior-art method analyzes mobile phone usage and sensor data, identifies user activities or events from the correspondence between phone operation events (such as power on/off or sending and receiving mail) and user activities, and finally arranges the events of a day in chronological order to generate a diary. The diary generated by this method is very simple: it takes the form of a sequence of "time: event" entries, carries little information, and does not describe user activities or events in complete statements, so it is hard to read.
In summary, how to automatically generate statements from the various data information of a terminal, so that activities or events occurring on the terminal are described completely in statements, has become a problem the industry urgently needs to solve.
Summary of the invention
In view of this, the invention provides a statement generation method and device for automatically generating statements from the various data information of a terminal, so that activities or events occurring on the terminal are described completely in statements.
In a first aspect, a statement generation method is provided, comprising:
collecting at least one piece of data information of a terminal, wherein the data information comprises at least one of running information of the terminal, operation information of the terminal, and information received by the terminal from an external interface;
determining the sentence element of each piece of the at least one piece of data information in a statement to be composed;
composing the at least one piece of data information into a statement according to the determined sentence element of the at least one piece of data information in the statement to be composed.
In a first possible implementation, collecting at least one piece of data information of the terminal comprises:
acquiring at least one piece of data information of the terminal;
detecting the source of the at least one piece of data information;
formatting the at least one piece of data information in the format corresponding to its source, to obtain at least one piece of formatted data information;
and determining the sentence element of each piece of the at least one piece of data information in the statement to be composed comprises:
for each piece of formatted data information, searching a database for at least one descriptive word matching the formatted data information;
determining, according to the at least one descriptive word matching the formatted data information, the sentence element of each piece of formatted data information in the statement to be composed.
With reference to the first possible implementation of the first aspect, in a second possible implementation, after determining, according to the at least one descriptive word matching the formatted data information, the sentence element of each piece of formatted data information in the statement to be composed, and before composing the at least one piece of data information into a statement according to the determined sentence element, the method further comprises:
for each piece of formatted data information, selecting one descriptive word from the at least one descriptive word matching the formatted data information, according to the probability with which each of those descriptive words is used in the database.
With reference to the second possible implementation of the first aspect, in a third possible implementation, composing the at least one piece of data information into a statement according to the determined sentence element of the at least one piece of data information in the statement to be composed comprises:
selecting, from a syntactic structure library and according to the determined type of the sentence element of the at least one piece of data information in the statement to be composed, a sentence structure that contains the type of the sentence element of the at least one piece of data information;
composing the selected descriptive words matching the at least one piece of formatted data information into a statement, according to the position of the sentence element of the at least one piece of data information in the sentence structure.
With reference to the second possible implementation of the first aspect, in a fourth possible implementation, composing the at least one piece of data information into a statement according to the determined sentence element of the at least one piece of data information in the statement to be composed comprises:
matching the selected descriptive words matching the at least one piece of formatted data information with the statements in a statement model library, according to the determined sentence element of the at least one piece of data information in the statement to be composed;
obtaining the statement after matching.
In a second aspect, a statement generating device is provided, comprising:
a collection unit, configured to collect at least one piece of data information of a terminal, wherein the data information comprises at least one of running information of the terminal, operation information of the terminal, and information received by the terminal from an external interface;
a determining unit, configured to determine the sentence element of each piece of the at least one piece of data information in a statement to be composed;
a composing unit, configured to compose the at least one piece of data information into a statement according to the determined sentence element of the at least one piece of data information in the statement to be composed.
In a first possible implementation, the collection unit comprises:
an acquisition subunit, configured to acquire at least one piece of data information of the terminal;
a detection subunit, configured to detect the source of the at least one piece of data information;
a formatting subunit, configured to format the at least one piece of data information in the format corresponding to its source, to obtain at least one piece of formatted data information;
and the determining unit comprises:
a search subunit, configured to search, for each piece of formatted data information, a database for at least one descriptive word matching the formatted data information;
a determining subunit, configured to determine, according to the at least one descriptive word matching the formatted data information, the sentence element of each piece of formatted data information in the statement to be composed.
With reference to the first possible implementation of the second aspect, in a second possible implementation, the device further comprises:
a selection unit, configured to select, for each piece of formatted data information, one descriptive word from the at least one descriptive word matching the formatted data information, according to the probability with which each of those descriptive words is used in the database.
With reference to the second possible implementation of the second aspect, in a third possible implementation, the composing unit comprises:
a selection subunit, configured to select, from a syntactic structure library and according to the type of the sentence element of the at least one piece of data information, a sentence structure that contains the type of the sentence element of the at least one piece of data information;
a composing subunit, configured to compose the selected descriptive words matching the at least one piece of formatted data information into a statement, according to the position of the sentence element of the at least one piece of data information in the sentence structure.
With reference to the second possible implementation of the second aspect, in a fourth possible implementation, the composing unit comprises:
a matching subunit, configured to match the selected descriptive words matching the at least one piece of formatted data information with the statements in a statement model library, according to the determined sentence element of the at least one piece of data information in the statement to be composed;
an obtaining subunit, configured to obtain the statement after matching.
With the technical solution of the statement generation method and device of the present invention, statements can be generated automatically from the various data information of a terminal, and activities or events occurring on the terminal are described completely in statements, which makes it convenient for the user to record these activities or events automatically through the terminal.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of one embodiment of the statement generation method of the present invention;
Fig. 2 is a flowchart of another embodiment that further refines the statement generation method shown in Fig. 1;
Fig. 3 is a flowchart of yet another embodiment that further refines the statement generation method shown in Fig. 1;
Fig. 4 is a structural diagram of one embodiment of the statement generating device of the present invention;
Fig. 5 is a structural diagram of another embodiment that further refines the statement generating device shown in Fig. 4;
Fig. 6 is a structural diagram of yet another embodiment that further refines the statement generating device shown in Fig. 4.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Fig. 1 is a flowchart of one embodiment of the statement generation method of the present invention. As shown in Fig. 1, the method comprises the following steps:
Step S101: collect at least one piece of data information of a terminal, wherein the data information comprises at least one of running information of the terminal, operation information of the terminal, and information received by the terminal from an external interface.
A terminal in the present invention refers to any device in a network that interfaces with an end user to realize network applications, such as a notebook computer, a tablet computer, or a mobile phone. Various kinds of data information can be collected on a terminal, including: running information of the terminal itself, such as network connection information and system process information; user operation information of the terminal, such as sensor information and microblog posts; and information received by the terminal from an external interface, such as call information, short messages, and GPS information. This data information includes text data, such as microblog posts and short messages, from which textual information can be extracted directly, and also non-text data, such as network connection information, system process information, and sensor information, which is gathered through interfaces and the like. The present invention can collect and organize this data information of the terminal in a unified manner.
Step S102: determine the sentence element of each piece of the at least one piece of data information in a statement to be composed.
Each collected piece of data information is assigned a corresponding sentence element. The types of sentence element include subject, predicate, object, attributive, complement, adverbial, predicative, and so on. For example, time information collected from the terminal can be determined to be a time adverbial, and information collected from GPS can be determined to be a place adverbial.
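Step S102 can be sketched in Python as a lookup from the kind of data information to a sentence element type. This is a minimal sketch, not the patented implementation: the names `SentenceElement`, `ELEMENT_BY_KIND`, and `determine_element` are hypothetical, and only the two adverbial entries come from the examples in the text; the call-related entries are illustrative assumptions.

```python
from enum import Enum


class SentenceElement(Enum):
    """A subset of the sentence element types named in the text."""
    SUBJECT = "subject"
    PREDICATE = "predicate"
    OBJECT = "object"
    TIME_ADVERBIAL = "time adverbial"
    PLACE_ADVERBIAL = "place adverbial"


# Hypothetical mapping from the kind of data information to its element.
# "time" and "gps" follow the examples in the text; the rest are assumed.
ELEMENT_BY_KIND = {
    "time": SentenceElement.TIME_ADVERBIAL,
    "gps": SentenceElement.PLACE_ADVERBIAL,
    "call_action": SentenceElement.PREDICATE,
    "call_peer": SentenceElement.OBJECT,
}


def determine_element(kind: str) -> SentenceElement:
    """Return the sentence element assigned to data information of this kind."""
    try:
        return ELEMENT_BY_KIND[kind]
    except KeyError:
        raise ValueError(f"no sentence element configured for {kind!r}") from None
```

A fixed table is the simplest possible choice; the refined embodiments instead derive the element from the matched descriptive words.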
Step S103: compose the at least one piece of data information into a statement according to the determined sentence element of the at least one piece of data information in the statement to be composed.
After the sentence element of each collected piece of data information has been identified, a statement composed of the data information can be obtained from the corresponding sentence elements, either by following a particular sentence structure or by matching against a language model, so that one or more statements completely describe the content of the data information. The accumulated statements form the automatic diary text.
With the statement generation method provided by this embodiment of the invention, statements can be generated automatically from the various data information of a terminal, and activities or events occurring on the terminal are described completely in statements, which makes it convenient for the user to record these activities or events automatically through the terminal.
Fig. 2 is a flowchart of another embodiment that further refines the statement generation method shown in Fig. 1. As shown in Fig. 2, the method comprises the following steps:
Step S201: acquire at least one piece of data information of a terminal.
A terminal in the present invention refers to any device in a network that interfaces with an end user to realize network applications, such as a notebook computer, a tablet computer, or a mobile phone. Various kinds of data information can be collected on a terminal, including: running information of the terminal itself, such as network connection information and system process information; user operation information of the terminal, such as sensor information and microblog posts; and information received by the terminal from an external interface, such as call information, short messages, and GPS information. This data information includes text data, such as microblog posts and short messages, from which textual information can be extracted directly, and also non-text data, such as network connection information, system process information, and sensor information, which is gathered through interfaces and the like.
Step S202: detect the source of the at least one piece of data information.
The source of each acquired piece of data information is detected as follows: if the information is GPS information, its source is the GPS in the terminal; if it is sensor information, its source is a particular sensor in the terminal; if it is call information or information from an application (APP) such as a microblog client, its source can be identified from the software program.
Step S203: format the at least one piece of data information in the format corresponding to its source, to obtain at least one piece of formatted data information.
Data information acquired from different sources needs to be organized in different formats for later use.
For example:
1. Microblog information: for a microblog post published by the user at a given moment, each post can be expressed after formatting as a <time, microblog content, user ID> triple.
2. GPS information: for the position information at a given moment, each piece of GPS information can be expressed after formatting as a <time, longitude, latitude, altitude> four-tuple.
3. Acceleration information: for the acceleration information at a given moment, each piece of acceleration information can be expressed after formatting as a <time, x-axis acceleration, y-axis acceleration, z-axis acceleration> four-tuple.
4. Call information: this covers the usage of calls, short messages, and the like, specifically:
Calls: call start time, call end time, call duration, caller, callee, missed-call time.
Short messages: time a message was received, length of the received message, time a message was sent, length of the sent message.
Each piece of call information can be expressed after formatting as a <time, state of this phone, state of the other party's phone, setting state of this phone, ID of the other party's phone> five-tuple.
For example, an incoming call received by this phone can be expressed as:
<time, incoming call received, calling, ringing, ID of the other party's phone>
Many formats can be used to format the collected data information. The examples above list only tuple-based representations, and the present invention includes but is not limited to these examples.
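The tuple formats above can be sketched with Python named tuples, with the format chosen by the detected source as in steps S202 and S203. This is only an illustrative sketch: the type names, the source keys (`"microblog"`, `"gps"`, `"call"`), and the `format_record` helper are assumptions, and the raw data is assumed to arrive as a dictionary keyed by field name.

```python
from typing import NamedTuple


class MicroblogRecord(NamedTuple):
    time: str
    content: str
    user_id: str


class GpsRecord(NamedTuple):
    time: str
    longitude: float
    latitude: float
    altitude: float


class CallRecord(NamedTuple):
    time: str
    local_state: str      # state of this phone
    remote_state: str     # state of the other party's phone
    local_setting: str    # setting state of this phone, e.g. ringing
    remote_id: str        # ID of the other party's phone


# Hypothetical registry: detected source -> tuple format for that source.
FORMAT_BY_SOURCE = {
    "microblog": MicroblogRecord,
    "gps": GpsRecord,
    "call": CallRecord,
}


def format_record(source: str, raw: dict):
    """Format one raw piece of data information in its source's tuple layout."""
    record_type = FORMAT_BY_SOURCE[source]
    # Read the fields in the order the tuple type declares them.
    return record_type(*(raw[field] for field in record_type._fields))
```

Keeping the format registry separate from the acquisition code means a new source only needs one new tuple type and one registry entry.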
Step S204: for each piece of formatted data information, search a database for at least one descriptive word matching the formatted data information.
So that the user can easily read the generated statements, the collected data information should be described in language that is commonly used or that the user is accustomed to. The database stores one or more descriptive words corresponding to each piece of formatted data information, so at least one descriptive word matching each piece of formatted data information can be found in the database.
For example:
1. The collected time information is 6:50AM, and the set of descriptive words found is:
{morning, early morning, 6:50 in the morning, 6:50AM Beijing time, first thing in the morning}.
2. The collected GPS information is {longitude=22.04, latitude=114.3}, and the set of descriptive words found is:
{Shenzhen Huawei base, Bantian in Longgang District, Wuhe Avenue}
3. For the collected call record information <time, state of this phone, state of the other party's phone, setting state of this phone, ID of the other party's phone>, the set of descriptive words for the call action is {call, make a phone call, answer the phone}, and the set of descriptive words for the call parties is {I, John (contact)}.
4. For the collected acceleration information <time, x-axis acceleration, y-axis acceleration, z-axis acceleration>, the set of descriptive words can be:
{walk, take a walk, jog}.
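The lookup in step S204 can be sketched as follows. The text does not specify how the database is keyed, so this sketch assumes a hypothetical scheme: clock times are bucketed into a part of the day, and coordinates are rounded to one decimal place so that nearby fixes share a key. The dictionary `DESCRIPTIVE_WORDS` is an in-memory stand-in for the database, and all function names are illustrative.

```python
from datetime import time


def part_of_day(t: time) -> str:
    """Bucket a clock time into a coarse part of the day (assumed thresholds)."""
    if t.hour < 12:
        return "morning"
    if t.hour < 18:
        return "afternoon"
    return "evening"


def gps_key(longitude: float, latitude: float) -> str:
    """Round coordinates to one decimal place to form a database key."""
    return f"{longitude:.1f},{latitude:.1f}"


# In-memory stand-in for the database of descriptive words.
DESCRIPTIVE_WORDS = {
    ("time", "morning"): ["morning", "early morning", "6:50AM Beijing time"],
    ("gps", "22.0,114.3"): [
        "Shenzhen Huawei base",
        "Bantian in Longgang District",
        "Wuhe Avenue",
    ],
}


def find_descriptive_words(source: str, key: str) -> list:
    """Return every descriptive word stored for this source and key."""
    return list(DESCRIPTIVE_WORDS.get((source, key), []))
```

For instance, `find_descriptive_words("gps", gps_key(22.04, 114.3))` returns the three place descriptions from example 2, while an unknown position returns an empty list.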
Step S205, according at least one words of description of mating with the data message after described format, determines that the data message after each described format is being waited to form the sentence element in statement.
The each data message collected is carried out format and words of description coupling after, system basis can be determined the probability of the sentence element of this words of description or is defined as corresponding sentence element according to use habit before to these words of description, the type of sentence element comprises subject, predicate, object, attribute, complement, the adverbial modifier, predicative etc., such as the temporal information of collection terminal, this temporal information can be defined as time adverbial, for the information of collecting from GPS, this information can be identified as point adverbial etc.
Step S206: for each piece of formatted data information, select one descriptive word from the at least one descriptive word matching the formatted data information, according to the probability with which each of those descriptive words is used in the database.
Before a statement is generated, generally only one descriptive word is used for each sentence element, so one descriptive word must be selected from the multiple descriptive words matching the formatted data information. The selection can be based on the probability with which these descriptive words are used in the database, that is, the probability of being selected to generate a statement, or alternatively on the user's language habits.
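The text leaves open how the usage probability drives the selection, so the sketch below shows two plausible readings: always taking the most probable descriptive word, or drawing one at random in proportion to its usage probability. The function names and the example probabilities are assumptions, not values from the patent.

```python
import random


def most_probable(candidates: dict) -> str:
    """Pick the descriptive word with the highest usage probability."""
    return max(candidates, key=candidates.get)


def sample_by_probability(candidates: dict, rng: random.Random) -> str:
    """Draw one descriptive word at random, weighted by usage probability."""
    words = list(candidates)
    weights = [candidates[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


# Hypothetical usage probabilities for the time example in the text.
usage = {"morning": 0.5, "early morning": 0.3, "6:50AM Beijing time": 0.2}
chosen = most_probable(usage)  # "morning"
```

The deterministic variant gives repeatable diaries, while weighted sampling varies the wording between entries; which behaviour is intended is not stated in the source.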
Step S207, according at least one data message described in determining in the type waiting the sentence element formed in statement, selects the sentence structure of the type of the sentence element comprising at least one data message described from syntax structural library.
In syntactic structure storehouse, store various sentence structure, contain one or more sentence element in often kind of sentence structure, each sentence element has corresponding position in this sentence structure.The sentence structure comprising sentence element corresponding to all data messages collected is selected from syntax structural library.
Such as, the syntactic structure comprised in syntactic structure storehouse has:
[time adverbial] [subject] [point adverbial] [predicate] [object];
[subject] [predicate] [object]; Deng.
Step S208: compose the selected descriptive words matching the at least one piece of formatted data information into a statement, according to the position of the sentence element of the at least one piece of data information in the sentence structure.
After the sentence structure has been selected, the selected descriptive word matching each piece of formatted data information is filled into the position of the corresponding sentence element in the sentence structure. Once the position of every sentence element has been filled, a statement has been composed.
For example, based on the examples above, the following statements can be composed:
"In the early morning, John and I talked on the phone."
"In the morning, I was taking a walk on Wuhe Avenue, and John phoned me."
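Steps S207 and S208 can be sketched as selecting a slot sequence and filling it. In this sketch a sentence structure is an ordered list of sentence element types, and a structure is selected only when its slots are exactly the available element types; that exact-match rule, the second structure in the list, and the names `STRUCTURES`, `select_structure`, and `compose` are assumptions made for illustration. Slot order is language-specific: the first and third structures follow the text, the second is an English-friendly order added for the example.

```python
# Each sentence structure is an ordered list of sentence element types.
STRUCTURES = [
    ["time adverbial", "subject", "place adverbial", "predicate", "object"],
    ["time adverbial", "subject", "predicate", "object"],  # assumed addition
    ["subject", "predicate", "object"],
]


def select_structure(element_types: set) -> list:
    """Return the first structure whose slots are exactly these element types."""
    for structure in STRUCTURES:
        if set(structure) == element_types:
            return structure
    raise LookupError("no sentence structure covers these sentence elements")


def compose(words_by_element: dict) -> str:
    """Fill each slot of the selected structure with its descriptive word."""
    structure = select_structure(set(words_by_element))
    return " ".join(words_by_element[slot] for slot in structure) + "."


statement = compose({
    "time adverbial": "In the early morning",
    "subject": "I",
    "predicate": "called",
    "object": "John",
})
# statement == "In the early morning I called John."
```

Joining words with spaces is enough for this illustration; punctuation between clauses and inflection are not modeled here.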
With the statement generation method provided by this embodiment of the invention, statements can be generated automatically from the various data information of a terminal, and activities or events occurring on the terminal are described completely in statements, which makes it convenient for the user to record these activities or events automatically through the terminal.
Fig. 3 is a flowchart of yet another embodiment that further refines the statement generation method shown in Fig. 1. As shown in Fig. 3, the method comprises the following steps:
Step S301: acquire at least one piece of data information of a terminal.
Step S302: detect the source of the at least one piece of data information.
Step S303: format the at least one piece of data information in the format corresponding to its source, to obtain at least one piece of formatted data information.
Step S304: for each piece of formatted data information, search a database for at least one descriptive word matching the formatted data information.
Step S305: determine, according to the at least one descriptive word matching the formatted data information, the sentence element of each piece of formatted data information in the statement to be composed.
Step S306: for each piece of formatted data information, select one descriptive word from the at least one descriptive word matching the formatted data information, according to the probability with which each of those descriptive words is used in the database.
Step S307: match the selected descriptive words matching the at least one piece of formatted data information with the statements in a statement model library, according to the determined sentence element of the at least one piece of data information in the statement to be composed.
Step S308: obtain the statement after matching.
This embodiment differs from the preceding embodiment in that steps S307 and S308 differ from steps S207 and S208 of the preceding embodiment.
A language model is defined as follows: "A language model is usually constructed as a probability distribution P(s) over strings s, where P(s) attempts to reflect the probability that the string s occurs as a sentence."
In an n-gram language model, for a sentence s = W_1, W_2, …, W_n, the probability can be expressed as:
P(s) = P(W_1) P(W_2 | W_1) P(W_3 | W_1 W_2) … P(W_n | W_1 … W_{n-1})
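The product above is the full chain rule, in which each word is conditioned on its entire history. A practical n-gram model truncates that history to the previous n-1 words. The Python sketch below uses the bigram case (n = 2), so each factor becomes P(W_i | W_{i-1}), estimated from counts without smoothing. The training sentences, the start marker `<s>`, and the function names are all assumptions for illustration.

```python
from collections import Counter


def train_bigram(sentences):
    """Count histories and bigrams over tokenised sentences with a start marker."""
    histories, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens
        histories.update(padded[:-1])             # every word that has a successor
        bigrams.update(zip(padded, padded[1:]))   # adjacent word pairs
    return histories, bigrams


def sentence_probability(tokens, histories, bigrams):
    """P(s) as the product of P(W_i | W_{i-1}), the bigram form of the chain rule."""
    probability = 1.0
    previous = "<s>"
    for word in tokens:
        if histories[previous] == 0:
            return 0.0  # unseen history: unsmoothed estimate is zero
        probability *= bigrams[(previous, word)] / histories[previous]
        previous = word
    return probability


corpus = [
    ["morning", "John", "called", "me"],
    ["morning", "I", "called", "John"],
]
histories, bigrams = train_bigram(corpus)
p = sentence_probability(["morning", "John", "called", "me"], histories, bigrams)
# p == 1.0 * 0.5 * 1.0 * 0.5 == 0.25
```

Without smoothing, any sentence containing an unseen word pair gets probability zero, which is acceptable for an illustration but usually not for a deployed model.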
In this embodiment, the statement model library stores various statements. The descriptive words that match the formatted data information, and from which a statement is to be generated, are matched with the statements in the statement model library, and the statement after matching is obtained.
Specifically, suppose the statement model library stores statement 1: "In the morning, Lyn phoned me." The descriptive words and sentence elements of the statement to be generated in the examples above can be matched with statement 1, and the statement obtained after matching is "In the morning, John phoned me."
The statement model library may also store statement 2: "In the early morning, Lily and I talked on the phone." The descriptive words and sentence elements in the examples above can also be matched with statement 2. However, the probability that statement 1, composed of the descriptive words of statement 1, appears in generated diary text is 54%, while the probability that statement 2, composed of the descriptive words of statement 2, appears in generated diary text is 30%. Statement 1, which has the highest probability of appearing in generated diary text, is therefore selected for matching, and the statement after matching is obtained.
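The matching in steps S307 and S308 can be sketched by storing each model statement as a template with named slots plus its probability of appearing in generated diary text. The 54% and 30% figures come from the example above; the slot names, the third library entry with its 10% figure, and the rule that a template matches only when its slots equal the available descriptive words are assumptions for illustration.

```python
from string import Formatter
from typing import NamedTuple


class ModelStatement(NamedTuple):
    template: str        # model statement with named slots
    probability: float   # probability of appearing in generated diary text


MODEL_LIBRARY = [
    ModelStatement("In the {time}, {contact} phoned me.", 0.54),
    ModelStatement("In the {time}, {contact} and I talked on the phone.", 0.30),
    ModelStatement("I took a walk on {place}.", 0.10),  # hypothetical entry
]


def slots(template: str) -> set:
    """Return the slot names that appear in a template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def best_match(words: dict) -> str:
    """Fill the most probable model statement whose slots match the given words."""
    candidates = [m for m in MODEL_LIBRARY if slots(m.template) == set(words)]
    if not candidates:
        raise LookupError("no model statement matches these descriptive words")
    best = max(candidates, key=lambda m: m.probability)
    return best.template.format(**words)


result = best_match({"time": "morning", "contact": "John"})
# result == "In the morning, John phoned me."
```

Both of the first two templates accept a time and a contact, so the choice between them is made purely by probability, which mirrors the selection of statement 1 over statement 2 in the text.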
With the statement generation method provided by this embodiment of the invention, statements can be generated automatically from the various data information of a terminal, and activities or events occurring on the terminal are described completely in statements, which makes it convenient for the user to record these activities or events automatically through the terminal.
Fig. 4 is a structural diagram of one embodiment of the statement generating device of the present invention. As shown in Fig. 4, the device 1000 comprises:
A collection unit 11, configured to collect at least one piece of data information of a terminal, wherein the data information comprises at least one of running information of the terminal, operation information of the terminal, and information received by the terminal from an external interface.
A terminal in the present invention refers to any device in a network that interfaces with an end user to realize network applications, such as a notebook computer, a tablet computer, or a mobile phone. Various kinds of data information can be collected on a terminal, including: running information of the terminal itself, such as network connection information and system process information; user operation information of the terminal, such as sensor information and microblog posts; and information received by the terminal from an external interface, such as call information, short messages, and GPS information. This data information includes text data, such as microblog posts and short messages, from which textual information can be extracted directly, and also non-text data, such as network connection information, system process information, and sensor information, which is gathered through interfaces and the like. The collection unit 11 of the present invention can collect and organize this data information of the terminal in a unified manner.
A determining unit 12, configured to determine the sentence element that each of the at least one data message serves as in the statement to be composed.
For each collected data message, the determining unit 12 determines a corresponding sentence element. The types of sentence element include subject, predicate, object, attributive, complement, adverbial, predicative and so on. For example, time information collected from the terminal can be determined to be a time adverbial, and information collected from GPS can be determined to be a place adverbial.
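The assignment of sentence elements described above can be sketched as a simple lookup. This is only an illustration: the kind labels (`"time"`, `"gps"`), the table `SENTENCE_ELEMENT_BY_KIND` and the function `sentence_element_for` are hypothetical names, and the patent does not prescribe how the mapping is stored.

```python
# Hypothetical table: kind of data message -> sentence element.
# Only the two examples given in the text are included.
SENTENCE_ELEMENT_BY_KIND = {
    "time": "time adverbial",
    "gps": "place adverbial",
}


def sentence_element_for(kind: str) -> str:
    """Return the sentence element assumed for a data message kind.

    Kinds that are not in the table are reported as "undetermined"
    so the caller can decide how to handle them.
    """
    return SENTENCE_ELEMENT_BY_KIND.get(kind, "undetermined")
```

For instance, `sentence_element_for("gps")` yields the place adverbial role, matching the GPS example in the text.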
A composing unit 13, configured to compose the at least one data message into a statement according to the determined sentence element of the at least one data message in the statement to be composed.
After the sentence element of each collected data message has been identified, the composing unit 13 can, according to the sentence elements corresponding to these data messages, arrange them according to a certain sentence structure or match them against a language model to obtain the statement composed of these data messages, so that the content contained in these data messages is completely described by one or more statements. The accumulated statements form an automatic diary text.
According to the statement generation device provided by the embodiments of the present invention, statements can be generated automatically from the various data messages of a terminal, so that the activities or events occurring on the terminal are completely described in statements. This makes it convenient for the user to record these activities or events automatically through the terminal.
Fig. 5 is a schematic structural diagram of another embodiment that further refines the statement generation device of the present invention shown in Fig. 4. As shown in Fig. 5, the device 2000 comprises:
A collecting unit 21, configured to collect at least one data message of a terminal, wherein the data message comprises at least one of: running information of the terminal, user operation information of the terminal, and information that the terminal receives from an external interface.
In this embodiment, the collecting unit 21 comprises an acquisition subunit 211, a detection subunit 212 and a formatting subunit 213.
An acquisition subunit 211, configured to acquire at least one data message of the terminal.
The terminal in the present invention refers to any device that connects the network to the end user in order to realize network applications, such as a notebook computer, a tablet computer or a mobile phone. The acquisition subunit 211 can collect various data messages on a terminal, including: running information of the terminal itself, such as network connection information and system process information; user operation information of the terminal, such as sensor information and microblog posts; and information that the terminal receives from an external interface, such as call information, short messages and GPS information. These data messages include text data, such as microblog posts and short messages, from which textual information can be extracted directly, and non-text data, such as network connection information, system process information and sensor information, which are collected through interfaces and the like.
A detection subunit 212, configured to detect the source of the at least one data message.
The detection subunit 212 detects the sources of the collected data messages as follows: if the information is GPS information, the source is the GPS in the terminal; if it is sensor information, the source is a particular sensor in the terminal; if it is call information or application (APP) information such as a microblog post, the source can be identified from the software program that produced it.
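The three source-detection cases above can be sketched as a small dispatch function. The message layout (a dict with `kind`, `sensor_name` and `app` keys) and the returned labels are assumptions made for illustration only; the patent does not define a concrete message format.

```python
def detect_source(message: dict) -> str:
    """Return a label for the source of a raw data message.

    Assumed (hypothetical) message layout:
      {"kind": "gps" | "sensor" | "call" | "app", ...}
    """
    kind = message.get("kind")
    if kind == "gps":
        # GPS information comes from the GPS in the terminal.
        return "GPS"
    if kind == "sensor":
        # Sensor information comes from a particular sensor.
        return "sensor:" + message.get("sensor_name", "unknown")
    # Call information and application information (for example a
    # microblog post) are identified by the originating program.
    return "program:" + message.get("app", "unknown")
```

A call record would be passed as, say, `{"kind": "call", "app": "dialer"}`, where `"dialer"` is likewise a made-up program name.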
A formatting subunit 213, configured to format the at least one data message, according to its source, in the format corresponding to that source, to obtain at least one formatted data message.
For data messages collected from different sources, the formatting subunit 213 needs to organize them in different formats for subsequent use.
The collected data messages can be formatted in a variety of forms, such as a tuple representation; the present invention includes but is not limited to this example.
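As a rough illustration of source-dependent formatting into tuples, the sketch below normalizes raw readings into a named tuple. The field names, the source labels (`"GPS"`, `"call"`) and the raw dictionary keys are all hypothetical; the text only states that a tuple representation is one possible format.

```python
from typing import NamedTuple


class FormattedMessage(NamedTuple):
    """Tuple representation of one formatted data message (assumed layout)."""
    source: str
    timestamp: str
    payload: tuple


def format_message(source: str, timestamp: str, raw: dict) -> FormattedMessage:
    """Format a raw data message according to its source."""
    if source == "GPS":
        # Assumed raw keys "lat" and "lon"; coordinates become floats.
        payload = (float(raw["lat"]), float(raw["lon"]))
    elif source == "call":
        # Assumed raw keys "caller" and "callee".
        payload = (raw["caller"], raw["callee"])
    else:
        # Fallback: keep the raw content as a single text field.
        payload = (str(raw),)
    return FormattedMessage(source, timestamp, payload)
```

Keeping one tuple type for every source means later steps can treat all formatted data messages uniformly while the payload stays source-specific.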
A determining unit 22, configured to determine the sentence element that each of the at least one data message serves as in the statement to be composed.
In this embodiment, the determining unit 22 comprises a lookup subunit 221 and a determining subunit 222.
A lookup subunit 221, configured to look up in a database, for each formatted data message, at least one description word matching the formatted data message.
To make the generated statement easy for the user to read, the collected data messages need to be described in commonly used descriptive language or in language the user is accustomed to. The database stores one or more description words corresponding to each formatted data message; therefore, for each formatted data message, the lookup subunit 221 can find in this database at least one description word matching it.
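The database lookup can be sketched with Python's built-in `sqlite3` module. The table name `description_words`, its columns, the string keys such as `"time:08:00"` and the sample rows are assumptions for illustration; the patent does not specify a schema.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up description words."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE description_words (message_key TEXT, word TEXT)"
    )
    conn.executemany(
        "INSERT INTO description_words VALUES (?, ?)",
        [
            ("time:08:00", "morning"),
            ("time:08:00", "early morning"),
            ("call:incoming", "phoned me"),
        ],
    )
    return conn


def lookup_description_words(conn: sqlite3.Connection, message_key: str) -> list:
    """Return every description word stored for a formatted data message."""
    rows = conn.execute(
        "SELECT word FROM description_words "
        "WHERE message_key = ? ORDER BY rowid",
        (message_key,),
    ).fetchall()
    return [word for (word,) in rows]
```

A key with no stored words simply yields an empty list, which a caller could treat as "no description available".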
A determining subunit 222, configured to determine, according to the at least one description word matching the formatted data message, the sentence element of each formatted data message in the statement to be composed.
After each collected data message has been formatted and matched with description words, the determining subunit 222 can determine the corresponding sentence element according to the probability of these description words serving as a given sentence element, or according to how these description words have previously been used. The types of sentence element include subject, predicate, object, attributive, complement, adverbial, predicative and so on. For example, time information collected from the terminal can be determined to be a time adverbial, and information collected from GPS can be determined to be a place adverbial.
A selecting unit 23, configured to select, for each formatted data message, one description word from the at least one description word matching the formatted data message, according to the probability with which the at least one description word is used in the database.
Before a statement is generated, generally only one description word is chosen for each sentence element. The selecting unit 23 therefore needs to select one description word from the multiple description words matching the formatted data message. The selection can be based on the probability with which these description words are used in the database, that is, the probability of being chosen for generating a statement, or it can be based on the user's language habits.
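Selection by usage probability can be sketched as picking the candidate with the largest probability. The function name `select_description_word` and the probability values in the usage note are invented for illustration; how the probabilities are estimated is not covered here.

```python
from typing import Mapping


def select_description_word(candidates: Mapping[str, float]) -> str:
    """Pick the description word with the highest usage probability.

    `candidates` maps each matching description word to the probability
    with which it is used (assumed to be available from the database).
    """
    if not candidates:
        raise ValueError("no description words to choose from")
    # max() over the keys, ranked by their probability values.
    return max(candidates, key=lambda word: candidates[word])
```

With the made-up values `{"morning": 0.6, "early morning": 0.4}`, the word "morning" is chosen. A variant based on the user's language habits could replace the probabilities with per-user counts.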
A composing unit 24, configured to compose the at least one data message into a statement according to the determined sentence element of the at least one data message in the statement to be composed.
In this embodiment, the composing unit 24 comprises a choosing subunit 241 and a composing subunit 242.
A choosing subunit 241, configured to select from a syntactic structure library, according to the types of the sentence elements of the at least one data message, a sentence structure that contains those types of sentence element.
The syntactic structure library stores various sentence structures. Each sentence structure contains one or more sentence elements, and each sentence element has a corresponding position in the sentence structure. The choosing subunit 241 selects from the syntactic structure library a sentence structure that contains the sentence elements corresponding to all of the collected data messages.
A composing subunit 242, configured to compose the selected description words matching the at least one formatted data message into a statement, according to the position of the sentence element of the at least one data message in the sentence structure.
After the sentence structure has been selected, the composing subunit 242 fills each selected description word into the position that the sentence element of its formatted data message occupies in the sentence structure. Once the position of every sentence element has been filled in turn, a statement is composed.
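The two steps above (choosing a sentence structure, then filling its positions) can be sketched as follows. The contents of `SYNTAX_LIBRARY` are invented, and this sketch makes the simplifying assumption that a structure must contain exactly the sentence elements that are available; the patent only requires that the structure contain them.

```python
# Hypothetical syntactic structure library: each structure is an ordered
# tuple of sentence elements, so the index of an element is its position.
SYNTAX_LIBRARY = [
    ("time adverbial", "subject", "predicate", "object"),
    ("time adverbial", "place adverbial", "subject", "predicate"),
]


def choose_structure(elements):
    """Return the first structure made up of exactly the given elements."""
    wanted = set(elements)
    for structure in SYNTAX_LIBRARY:
        if set(structure) == wanted:
            return structure
    return None


def compose(words_by_element: dict) -> str:
    """Fill each position of the chosen structure with its description word."""
    structure = choose_structure(words_by_element)
    if structure is None:
        raise LookupError("no sentence structure matches these elements")
    return " ".join(words_by_element[element] for element in structure)
```

For example, passing `{"subject": "John", "predicate": "phoned", "object": "me", "time adverbial": "In the morning,"}` produces the statement `In the morning, John phoned me`, regardless of the order in which the elements were supplied.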
According to the statement generation device provided by the embodiments of the present invention, statements can be generated automatically from the various data messages of a terminal, so that the activities or events occurring on the terminal are completely described in statements. This makes it convenient for the user to record these activities or events automatically through the terminal.
Fig. 6 is a schematic structural diagram of yet another embodiment that further refines the statement generation device of the present invention shown in Fig. 4. As shown in Fig. 6, the device 3000 comprises:
A collecting unit 31, configured to collect at least one data message of a terminal, wherein the data message comprises at least one of: running information of the terminal, user operation information of the terminal, and information that the terminal receives from an external interface.
In this embodiment, the collecting unit 31 comprises an acquisition subunit 311, a detection subunit 312 and a formatting subunit 313.
An acquisition subunit 311, configured to acquire at least one data message of the terminal.
A detection subunit 312, configured to detect the source of the at least one data message.
A formatting subunit 313, configured to format the at least one data message, according to its source, in the format corresponding to that source, to obtain at least one formatted data message.
A determining unit 32, configured to determine the sentence element that each of the at least one data message serves as in the statement to be composed.
In this embodiment, the determining unit 32 comprises a lookup subunit 321 and a determining subunit 322.
A lookup subunit 321, configured to look up in a database, for each formatted data message, at least one description word matching the formatted data message.
A determining subunit 322, configured to determine, according to the at least one description word matching the formatted data message, the sentence element of each formatted data message in the statement to be composed.
A selecting unit 33, configured to select, for each formatted data message, one description word from the at least one description word matching the formatted data message, according to the probability with which the at least one description word is used in the database.
A composing unit 34, configured to compose the at least one data message into a statement according to the determined sentence element of the at least one data message in the statement to be composed.
In this embodiment, the composing unit 34 comprises a matching subunit 341 and an obtaining subunit 342.
A matching subunit 341, configured to match the selected description words matching the at least one formatted data message against the statements in a statement model library, according to the determined sentence element of the at least one data message in the statement to be composed.
An obtaining subunit 342, configured to obtain the matched statement.
This embodiment differs from the above embodiment in that the composing unit 34 is different from the composing unit 24 of the above embodiment.
A language model is defined as follows: "A language model is usually constructed as a probability distribution P(s) over strings s, where P(s) attempts to reflect the probability that the string s occurs as a sentence."
In an n-gram language model, for a sentence s = W_1 W_2 … W_n, the probability can be expressed as:
$$P(s) = P(W_1)\,P(W_2 \mid W_1)\,P(W_3 \mid W_1 W_2) \cdots P(W_n \mid W_1 \cdots W_{n-1})$$
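The product above is the chain-rule factorization of a sentence probability, and it can be evaluated directly once the conditional probabilities are known. The sketch below uses a hypothetical table `cond_prob`, keyed by (history, word), with made-up values; a practical n-gram model would additionally truncate each history to the preceding n-1 words, which this sketch does not do.

```python
import math


def sentence_probability(words, cond_prob) -> float:
    """Compute P(s) as the product of P(W_i | W_1 ... W_{i-1}).

    `cond_prob` maps (history_tuple, word) to a conditional probability.
    Logarithms are summed to avoid underflow on long sentences.
    """
    log_p = 0.0
    for i, word in enumerate(words):
        history = tuple(words[:i])  # all words before position i
        log_p += math.log(cond_prob[(history, word)])
    return math.exp(log_p)


# Made-up conditional probabilities for a three-word sentence.
toy_model = {
    ((), "John"): 0.5,
    (("John",), "phoned"): 0.4,
    (("John", "phoned"), "me"): 0.5,
}
```

With these toy values, `sentence_probability(["John", "phoned", "me"], toy_model)` evaluates the product 0.5 × 0.4 × 0.5.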
In this embodiment, the statement model library stores various statements. The matching subunit 341 matches the description words that match the formatted data messages of the statement to be generated against the statements in the statement model library, and the obtaining subunit 342 obtains the matched statement.
Specifically, for example, the statement model library stores statement 1: "Morning, Lyn phoned me". The description words and sentence elements of the statement to be generated in the above example can be matched with statement 1, and the statement obtained after matching is "Morning, John phoned me".
The statement model library may also store statement 2, "Early morning, I and Lily call", in which case the description words and sentence elements of the statement to be generated in the above example can also be matched with statement 2. However, the probability that statement 1, composed of the description words of statement 1, occurs in the generated diary text is 54%, whereas the probability that statement 2, composed of the description words of statement 2, occurs in the generated diary text is 30%. Statement 1, which has the highest probability of occurring in the generated diary text, is therefore selected for matching, and the matched statement is obtained.
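The choice between the two stored statements can be sketched as taking the candidate with the highest probability of occurring in the generated diary text. The probabilities 0.54 and 0.30 come from the example above; the list layout and the function name `best_statement` are assumptions.

```python
def best_statement(candidates):
    """Return the candidate statement with the highest occurrence probability.

    `candidates` is a list of (statement, probability) pairs for the
    statements in the model library that match the description words.
    """
    if not candidates:
        raise ValueError("no matching statement in the model library")
    statement, _probability = max(candidates, key=lambda pair: pair[1])
    return statement


matching_statements = [
    ("Morning, Lyn phoned me", 0.54),          # statement 1
    ("Early morning, I and Lily call", 0.30),  # statement 2
]
```

Here `best_statement(matching_statements)` selects statement 1, whose slots would then be filled with the selected description words (for example "John" in place of "Lyn").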
According to the statement generation device provided by the embodiments of the present invention, statements can be generated automatically from the various data messages of a terminal, so that the activities or events occurring on the terminal are completely described in statements. This makes it convenient for the user to record these activities or events automatically through the terminal.
What is disclosed above is merely a preferred embodiment of the present invention and certainly cannot be used to limit the scope of rights of the present invention; equivalent changes made according to the claims of the present invention therefore still fall within the scope covered by the present invention.

Claims (10)

1. A statement generation method, characterized by comprising:
collecting at least one data message of a terminal, wherein the data message comprises at least one of: running information of the terminal, operation information of the terminal, and information that the terminal receives from an external interface;
determining the sentence element of each of the at least one data message in the statement to be composed;
composing the at least one data message into a statement according to the determined sentence element of the at least one data message in the statement to be composed.
2. The method of claim 1, characterized in that collecting at least one data message of the terminal comprises:
acquiring at least one data message of the terminal;
detecting the source of the at least one data message;
formatting the at least one data message, according to its source, in the format corresponding to that source, to obtain at least one formatted data message;
and determining the sentence element of each of the at least one data message in the statement to be composed comprises:
for each formatted data message, looking up in a database at least one description word matching the formatted data message;
determining, according to the at least one description word matching the formatted data message, the sentence element of each formatted data message in the statement to be composed.
3. The method of claim 2, characterized in that, after determining, according to the at least one description word matching the formatted data message, the sentence element of each formatted data message in the statement to be composed, and before composing the at least one data message into a statement according to the determined sentence element of the at least one data message in the statement to be composed, the method further comprises:
for each formatted data message, selecting one description word from the at least one description word matching the formatted data message, according to the probability with which the at least one description word is used in the database.
4. The method of claim 3, characterized in that composing the at least one data message into a statement according to the determined sentence element of the at least one data message in the statement to be composed comprises:
selecting from a syntactic structure library, according to the determined type of the sentence element of the at least one data message in the statement to be composed, a sentence structure that contains the types of the sentence elements of the at least one data message;
composing the selected description words matching the at least one formatted data message into a statement according to the position of the sentence element of the at least one data message in the sentence structure.
5. The method of claim 3, characterized in that composing the at least one data message into a statement according to the determined sentence element of the at least one data message in the statement to be composed comprises:
matching the selected description words matching the at least one formatted data message against the statements in a statement model library, according to the determined sentence element of the at least one data message in the statement to be composed;
obtaining the matched statement.
6. A statement generation device, characterized by comprising:
a collecting unit, configured to collect at least one data message of a terminal, wherein the data message comprises at least one of: running information of the terminal, operation information of the terminal, and information that the terminal receives from an external interface;
a determining unit, configured to determine the sentence element of each of the at least one data message in the statement to be composed;
a composing unit, configured to compose the at least one data message into a statement according to the determined sentence element of the at least one data message in the statement to be composed.
7. The device of claim 6, characterized in that the collecting unit comprises:
an acquisition subunit, configured to acquire at least one data message of the terminal;
a detection subunit, configured to detect the source of the at least one data message;
a formatting subunit, configured to format the at least one data message, according to its source, in the format corresponding to that source, to obtain at least one formatted data message;
and the determining unit comprises:
a lookup subunit, configured to look up in a database, for each formatted data message, at least one description word matching the formatted data message;
a determining subunit, configured to determine, according to the at least one description word matching the formatted data message, the sentence element of each formatted data message in the statement to be composed.
8. The device of claim 7, characterized by further comprising:
a selecting unit, configured to select, for each formatted data message, one description word from the at least one description word matching the formatted data message, according to the probability with which the at least one description word is used in the database.
9. The device of claim 8, characterized in that the composing unit comprises:
a choosing subunit, configured to select from a syntactic structure library, according to the types of the sentence elements of the at least one data message, a sentence structure that contains those types of sentence element;
a composing subunit, configured to compose the selected description words matching the at least one formatted data message into a statement according to the position of the sentence element of the at least one data message in the sentence structure.
10. The device of claim 8, characterized in that the composing unit comprises:
a matching subunit, configured to match the selected description words matching the at least one formatted data message against the statements in a statement model library, according to the determined sentence element of the at least one data message in the statement to be composed;
an obtaining subunit, configured to obtain the matched statement.
CN201310440040.3A 2013-09-24 2013-09-24 A kind of sentence generation method and device Active CN104462145B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310440040.3A CN104462145B (en) 2013-09-24 2013-09-24 A kind of sentence generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310440040.3A CN104462145B (en) 2013-09-24 2013-09-24 A kind of sentence generation method and device

Publications (2)

Publication Number Publication Date
CN104462145A true CN104462145A (en) 2015-03-25
CN104462145B CN104462145B (en) 2018-04-10

Family

ID=52908200

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310440040.3A Active CN104462145B (en) 2013-09-24 2013-09-24 A kind of sentence generation method and device

Country Status (1)

Country Link
CN (1) CN104462145B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107484038A (en) * 2017-08-22 2017-12-15 北京奇艺世纪科技有限公司 A kind of generation method of video subject, device and electronic equipment
CN110399499A (en) * 2019-07-18 2019-11-01 珠海格力电器股份有限公司 A kind of corpus generation method, device, electronic equipment and readable storage medium storing program for executing

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090234840A1 (en) * 2005-12-26 2009-09-17 Sony Computer Entertainment Inc. Information Processing Method, Information Processing System, And Server
CN103118182A (en) * 2013-01-17 2013-05-22 广东欧珀移动通信有限公司 Method to record application diaries of movable terminal and device

Also Published As

Publication number Publication date
CN104462145B (en) 2018-04-10

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant