CN104951508A - Time information identification method and device - Google Patents

Info

Publication number
CN104951508A
Authority
CN
China
Prior art keywords
character
time
character string
absolute
string
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510263225.0A
Other languages
Chinese (zh)
Other versions
CN104951508B (en)
Inventor
王涛
易薇
李斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201510263225.0A
Publication of CN104951508A
Application granted
Publication of CN104951508B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/903: Querying
    • G06F16/90335: Query processing
    • G06F16/90344: Query processing by using string matching techniques

Abstract

The invention relates to a time information identification method and device. The method comprises the steps that a first character sequence is acquired; according to the mapping relation between preset time correlation characters and preset standard characters, the preset time correlation characters in the first character sequence are mapped into the corresponding preset standard characters in sequence, and a second character sequence is acquired; according to the mapping relation between preset relative time words and characteristic character strings with a preset characteristic format, preset relative time words in the second character sequence are mapped into the corresponding characteristic character strings in sequence, and a third character sequence is acquired; character strings matched with a preset absolute time format and character strings matched with the preset characteristic format are searched for in the third character sequence respectively; time information is determined according to the matched character strings. Through the time information identification method and device, the time information can be identified from natural linguistic words, various operations can be conducted according to the identified time information, and the method and device can be applied to various kinds of scenes where the time information is needed.

Description

Time information identification method and device
Technical field
The present invention relates to the technical field of information processing, and in particular to a time information identification method and device.
Background
At present, the SDK (Software Development Kit) provided by the Android operating system usually uses TextView (a text view control) to display text content. Setting the android:autoLink attribute of a TextView causes link text to be marked automatically, and a user can click the link text to trigger a specific action.
However, the automatic marking in Android and other operating systems currently supports only telephone numbers, e-mail addresses and web addresses. It cannot identify time information such as dates and times of day, and in particular cannot identify time information expressed in words. Telephone numbers, e-mail addresses and web addresses have strict formats, so they are easy to identify and mark as link text. Time expressions are more complex and often include words such as "evening", "today", "tomorrow" and "next week", so the techniques used for telephone numbers, e-mail addresses and web addresses are difficult to apply to time information.
For example, an instant messaging application currently provides automatic marking of link text, implemented with Linkify, which supports the identification of telephone numbers and e-mail addresses. As shown in Fig. 1, a user receives through this application a message whose content is "Notice: meeting at 9 tomorrow morning; Thu Jan 15, 2015 9am-9:30am (name@xxx.com)". In this message "2015 9" is identified as a telephone number and "name@xxx.com" as an e-mail address, and after the link text is marked the message is displayed as in Fig. 1. Because identification of time information is not supported, part of the time information is mistakenly identified as a telephone number. In addition, the time information contains the written expression "am", meaning morning, which makes it harder to identify.
Summary of the invention
In view of this, and to address the current difficulty of identifying time information effectively, a time information identification method and device are provided.
A time information identification method, the method comprising:
obtaining a first character sequence;
mapping, in order and according to a mapping between preset time-related characters and preset standard characters, the preset time-related characters in the first character sequence to the corresponding preset standard characters, to obtain a second character sequence;
mapping, in order and according to a mapping between preset relative time words and feature strings having a preset feature format, the preset relative time words in the second character sequence to the corresponding feature strings, to obtain a third character sequence;
searching the third character sequence, in order, for character strings that match a preset absolute time format and for character strings that match the preset feature format; and
determining time information according to the matched character strings.
A time information identification device, the device comprising:
a first character sequence acquisition module, configured to obtain a first character sequence;
a first mapping module, configured to map, in order and according to a mapping between preset time-related characters and preset standard characters, the preset time-related characters in the first character sequence to the corresponding preset standard characters, to obtain a second character sequence;
a second mapping module, configured to map, in order and according to a mapping between preset relative time words and feature strings having a preset feature format, the preset relative time words in the second character sequence to the corresponding feature strings, to obtain a third character sequence;
a match search module, configured to search the third character sequence, in order, for character strings that match a preset absolute time format and for character strings that match the preset feature format; and
a time information determination module, configured to determine time information according to the matched character strings.
In the above time information identification method and device, the preset time-related characters in the first character sequence are first mapped to the corresponding preset standard characters to obtain the second character sequence. In this way the many near-equivalent ways of expressing a time are unified into preset standard characters, which simplifies subsequent identification. The preset relative time words in the second character sequence are then mapped to the corresponding feature strings in the preset feature format to obtain the third character sequence. All time information in the third character sequence has thereby been converted into formatted data, so character strings that match the preset absolute time format and the preset feature format can be found by format matching, and the time information can be determined from the matched strings. Time information can thus be identified from natural-language text, various operations can be performed according to the identified time information, and the method and device can be applied in any scenario that requires time information.
Brief description of the drawings
Fig. 1 is a schematic diagram of the link-text marking function provided by a conventional instant messaging application in one embodiment;
Fig. 2 is a structural diagram of a terminal for implementing the time information identification method in one embodiment;
Fig. 3 is a flow diagram of the time information identification method in one embodiment;
Fig. 4 is a schematic diagram of the data structure of the first dictionary tree in one embodiment;
Fig. 5 is a schematic diagram of the data structure of the second dictionary tree in one embodiment;
Fig. 6 is a flow diagram of the step of generating link text in one embodiment;
Fig. 7 is a flow diagram of the step of determining the time start position and time end position in the first character sequence to which the time information corresponds, in one embodiment;
Fig. 8 is a flow diagram of the step of determining the time start position and time end position in the first character sequence according to the character type of each character in the first character sequence and the corresponding priority, in one embodiment;
Fig. 9 is a flow diagram of the step of configuring a preset time-related application according to the time information in one embodiment;
Fig. 10 is an interface diagram in which time information in several short messages received by the terminal through a phone book application is identified and marked as link text, in one embodiment;
Fig. 11 is an interface diagram of the operation menu entered in response to a trigger operation in one embodiment;
Fig. 12 is a schematic diagram of the configuration page of a to-do application in one embodiment;
Fig. 13 is an interface diagram in which the terminal displays the to-do list after configuration is complete, in one embodiment;
Fig. 14 is a structural block diagram of the time information identification device in one embodiment;
Fig. 15 is a structural block diagram of the time information identification device in another embodiment;
Fig. 16 is a structural block diagram of the position determination module in one embodiment;
Fig. 17 is a structural block diagram of the execution module in one embodiment;
Fig. 18 is a structural block diagram of the time information identification device in yet another embodiment;
Fig. 19 is a flow diagram of the time information identification method as applied in the Android operating system in one embodiment.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
Time can be expressed in many ways in Chinese. Known Chinese time expressions are sorted and classified, and combination conditions for identification are given, as shown in Table 1 below. On this basis, a method and device for identifying time information are provided.
In Table 1, a date is a time whose smallest unit is one day, and a time of day is a time within a day. The time classes are relative date, absolute date, relative time of day and absolute time of day. Relative dates and relative times of day are collectively called relative time, which represents a time range. Absolute dates and absolute times of day are collectively called absolute time, which represents a specific time. Relative time can modify absolute time so that the expressed time is more precise.
The combination condition in Table 1 is the condition under which expressions of different time classes are combined for identification. For example, a relative date is not identified when it occurs alone, where "occurs alone" means that no other time-related character is adjacent to the relative date either before or after it. "Before" and "after" are directions determined by reading order. The text column in Table 1 lists the specific time words of each time class. X, yy, mm, dd and so on in the text are generic placeholders, each standing for the common form shared by several time words; their value ranges are given in the corresponding remarks in Table 1. The default start time is the starting point in time to which a time word corresponds by default.
Table 1:
As shown in Fig. 2, in one embodiment a terminal 200 is provided. The terminal 200 includes a processor, an internal memory, a non-volatile storage medium, a network interface, a display device and an input device connected by a system bus. The processor provides computing capability and controls the operation of the terminal 200, and is configured to execute a time information identification method. The non-volatile storage medium stores a time information identification device, which implements a time information identification method. The display device may be a liquid crystal display or an electronic ink display. The input device includes at least one of a touch pad, a physical button, a trackball, a mouse, and a touch layer that forms a touch screen together with the display device. The terminal 200 may be a desktop computer or a mobile terminal, and mobile terminals include mobile phones, tablet computers and personal digital assistants (PDAs).
As shown in Fig. 3, in one embodiment a time information identification method is provided. This embodiment is described with the method applied to the terminal of Fig. 2. The method specifically includes the following steps:
Step 302: obtain a first character sequence.
A character sequence is data formed by multiple characters arranged in order. The terminal 200 may run an instant messaging application and take the text of a message received in that application as the first character sequence, or it may take a short message received by a mobile phone as the first character sequence. The terminal 200 may also receive a selection instruction and select a character sequence as the first character sequence according to that instruction. For example, a user may select a passage of text displayed in a browser or word processor running on the terminal 200, and the terminal 200 then takes the selected text as the first character sequence. The first character sequence is the input character sequence to be processed.
For example, on January 16, 2015 the terminal 200 receives a short message whose content is "Department annual meeting next Monday evening, from half past eight to eleven o'clock." The terminal 200 takes the content of the short message as the first character sequence.
Step 304: according to the mapping between preset time-related characters and preset standard characters, map the preset time-related characters in the first character sequence to the corresponding preset standard characters in order, to obtain a second character sequence.
A preset time-related character is a character, set in advance, that relates to time. A preset standard character is a unified character, set in advance, that is used to express time. The mapping from preset time-related characters to preset standard characters may be many-to-one. "In order" means following the order of the characters in the first character sequence.
In one embodiment, the mapping between preset time-related characters and preset standard characters includes the mappings shown in Table 2. The sequence numbers in Table 2 only distinguish the individual mappings and do not impose an order. The content of Table 2 may be extended or changed as needed.
Table 2:
No.  Preset time-related character  Preset standard character    No.  Preset time-related character  Preset standard character
1    time                           :                            18   .                              -
2    Three                          3                            19   .                              -
3    Point                          :                            20   Next                           Under
4    O'clock                        Point                        21   Next                           Under
5    Ten                            10                           22   Next week                      Next week
6    Point                          ::                           23   :                              :
7    Month                          -                            24   Tomorrow morning               Tomorrow morning
8    Seven                          7                            25   Tomorrow morning               Tomorrow morning
9    Eight                          8                            26   /                              -
10   Five                           5                            27   Two                            2
11   Year                           -                            28   Extremely                      ~
12   Week                           Week                         29   Or                             ~
13   Four                           4                            30   Two                            2
14   Number                         #                            31   Zero                           0
15   Nine                           9                            32   Six                            6
16   One                            1                            33   Day                            #
17   Arrive                         ~
In one embodiment, a dictionary tree may be used to represent the mapping between preset time-related characters and preset standard characters. A dictionary tree, also called a trie or word lookup tree, is a tree structure that uses the common prefixes of strings to reduce query time. It minimizes unnecessary string comparisons and gives high query efficiency.
In one embodiment, the mapping between preset time-related characters and preset standard characters is represented by a first dictionary tree, specifically as follows:
In the mapping represented by the first dictionary tree, "|INTERNAL" denotes a node that has child nodes, "|AVAILABLE" denotes a child node under the node it belongs to, and indentation denotes the node level. The symbol "->" denotes a mapping. The character before "->", prefixed with the characters on the path from the root node to the preceding node, forms the preset time-related character, and what follows "->" is the preset standard character.
For example, "INTERNAL:#" denotes the root node, and every indented "|INTERNAL" or "|AVAILABLE" entry that follows is a descendant of the root node. "|INTERNAL: point->:" indicates that this node has child nodes and also carries a mapping: "point" is mapped to ":". The indented "|AVAILABLE: clock-> point" is a child of the "point" node; the path from the root to this child spells "o'clock", and this child node also carries a mapping, from "o'clock" to the standard character given after "->". The data structure of the first dictionary tree may be represented as shown in Fig. 4.
Using Table 2 or the first dictionary tree, the terminal 200 can easily map the preset time-related characters in the first character sequence to the corresponding preset standard characters and obtain the second character sequence. The second character sequence is the character sequence after near-equivalent expressions have been normalized.
For example, the first character sequence is "Department annual meeting next Monday evening, from half past eight to eleven o'clock." According to Table 2, the expression for "next week" in the first character sequence is mapped to its standard form "next week", "one" is mapped to "1", "eight" is mapped to "8", "point" is mapped to ":", "to" is mapped to "~", "eleven" is mapped to "11", the character "year" (in "annual meeting") is mapped to "-", and the full stop is mapped to "-". The second character sequence, written character by character, is therefore "next week 1 evening 8: half ~ 11: hold department-meeting-".
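The dictionary-tree lookup described above can be sketched as a longest-prefix replacement. This is only a minimal illustration: the names `TrieNode`, `build_trie`, `normalize` and `HYPOTHETICAL_MAP` are assumptions, and the Chinese entries in the map are illustrative stand-ins, not the actual content of Table 2 or Fig. 4.

```python
class TrieNode:
    """One node of the dictionary tree."""
    def __init__(self):
        self.children = {}   # next character -> TrieNode
        self.value = None    # standard character(s) if a mapping ends at this node


def build_trie(mapping):
    """Build a trie whose paths spell the time-related characters."""
    root = TrieNode()
    for key, value in mapping.items():
        node = root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.value = value
    return root


def normalize(text, root):
    """Replace, left to right, the longest mapped prefix at each position."""
    out = []
    i = 0
    while i < len(text):
        node, j = root, i
        best_len, best_val = 0, None
        while j < len(text) and text[j] in node.children:
            node = node.children[text[j]]
            j += 1
            if node.value is not None:
                best_len, best_val = j - i, node.value
        if best_val is None:
            out.append(text[i])   # unmapped character is copied unchanged
            i += 1
        else:
            out.append(best_val)
            i += best_len
    return "".join(out)


# Hypothetical entries for illustration only (not taken from Table 2).
HYPOTHETICAL_MAP = {"八": "8", "点": ":", "点钟": ":", "到": "~", "十": "10", "十一": "11"}

print(normalize("八点到十一点钟", build_trie(HYPOTHETICAL_MAP)))  # 8:~11:
```

Preferring the longest match means a multi-character entry wins over the single-character entry that is its prefix, which is the reason a node can both carry a mapping and have children.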
Step 306: according to the mapping between preset relative time words and feature strings having a preset feature format, map the preset relative time words in the second character sequence to the corresponding feature strings in order, to obtain a third character sequence.
A preset relative time word is a word used to express relative time. It is not limited to text and may also include digits or designated symbols, for example "week 1" and "week #". A feature string is a feature value that represents the corresponding preset relative time word. Every feature string has the preset feature format, so feature strings can be identified easily by that format. The preset feature format may be, for example, "%****$", where each "*" represents a digit; strings of this form can easily be found with a regular expression. The preset feature format may also take other forms, such as "$****%". The mapping between preset relative time words and feature strings having the preset feature format is one-to-one.
In one embodiment, the mapping between preset relative time words and feature strings having the preset feature format includes the mappings shown in Table 3. The sequence numbers in Table 3 only distinguish the individual mappings and do not impose an order. The content of Table 3 may be extended or changed as needed.
Table 3:
No.  Preset relative time word  Feature string    No.  Preset relative time word  Feature string
1    Noon                       %4012$            18   Week 4                     %2004$
2    Midnight                   %4124$            19   Week 3                     %2003$
3    The day after tomorrow     %1003$            20   Week 2                     %2002$
4    Next year                  %3012$            21   Evening                    %4020$
5    My god                     %1001$            22   Under-                     %3001$
6    At dusk                    %4018$            23   Afternoon                  %4014$
7    The morning                %4010$            24   Next week 1                %2008$
8    Morning                    %4006$            25   Next week 6                %2013$
9    The day after tomorrow     %1002$            26   Next week sky              %2014$
10   Morning                    %4008$            27   Lower weekend              %2013$
11   High noon                  %4112$            28   Next week #                %2014$
12   Week 1                     %2001$            29   Next week 5                %2012$
13   Week 6                     %2006$            30   Next week 4                %2011$
14   Zhou Tian                  %2007$            31   Next week 3                %2010$
15   Weekend                    %2006$            32   Next week 2                %2009$
16   Week #                     %2007$            33   Come off duty              %4018$
17   Week 5                     %2005$            34   Next month                 %3001$
In one embodiment, a second dictionary tree may be used to represent the mapping between preset relative time words and feature strings having the preset feature format. The data structure of the second dictionary tree may specifically be expressed as follows:
In the mapping represented by the second dictionary tree, "|INTERNAL" denotes a node that has child nodes, "|AVAILABLE" denotes a child node under the node it belongs to, and indentation denotes the node level. The symbol "->" denotes a mapping. The character before "->", prefixed with the characters on the path from the root node to the preceding node, forms the preset relative time word, and what follows "->" is the feature string. The data structure of the second dictionary tree may be represented as shown in Fig. 5.
Using Table 3 or Fig. 5, the terminal 200 can easily map the preset relative time words in the second character sequence to the corresponding feature strings and obtain the third character sequence. The third character sequence is the character sequence after normalization to feature values.
For example, the second character sequence is "next week 1 evening 8: half ~ 11: hold department-meeting-". According to Table 3, "next week 1" in the second character sequence is mapped to "%2008$" and "evening" is mapped to "%4020$", so the third character sequence is "%2008$ %4020$ 8: half ~ 11: hold department-meeting-".
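The second mapping step can be sketched with a plain dictionary instead of a dictionary tree. In this minimal illustration the keys are the English glosses used in Table 3 (standing in for the Chinese words), and the names `FEATURE_MAP` and `to_feature_strings` are assumptions.

```python
import re

# A few entries from Table 3, keyed by their English glosses (an assumption
# made for readability; the real keys would be the Chinese relative time words).
FEATURE_MAP = {
    "week 1": "%2001$",
    "next week 1": "%2008$",
    "evening": "%4020$",
    "afternoon": "%4014$",
}


def to_feature_strings(text, mapping=FEATURE_MAP):
    """Replace each relative time word with its feature string, longest word first."""
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], text)


print(to_feature_strings("next week 1 evening 8: half ~ 11:"))
# %2008$ %4020$ 8: half ~ 11:
```

Sorting the keys by length keeps "next week 1" from being partially consumed by the shorter "week 1", which plays the same role as following the longest path in the second dictionary tree.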
Step 308: search the third character sequence, in order, for character strings that match the preset absolute time format and for character strings that match the preset feature format.
Here the terminal 200 uses format matching. Specifically, it takes the preset absolute time format and the preset feature format as templates and searches the third character sequence, in order, for character strings that match each format. "In order" means following the order of the characters in the third character sequence. An absolute time can determine a specific time value on its own, and absolute time information is the characters that express such an absolute time. A relative time cannot determine a specific value on its own but can be combined with an absolute time to determine one, and relative time information is the characters that express such a relative time.
The preset absolute time format is based on the characters after mapping. It includes "*#", "*-*#" (corresponding to month and day), "*-*-*#" (corresponding to year, month and day), "*:" (corresponding to the hour) and "*:*::" (corresponding to hour and minute), where each "*" represents a number written in digits or in words. Format matching can be implemented with regular expressions.
For example, the third character sequence is "%2008$ %4020$ 8: half ~ 11: hold department-meeting-". Matching against the preset absolute time format finds the strings "8: half" and "11:", which are joined by the character "~" that denotes a time range. Matching against the preset feature format finds the strings "%2008$" and "%4020$".
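The format matching in step 308 can be sketched with regular expressions. The patterns below are one possible reading of the listed formats, not the patent's own expressions; the names `FEATURE_RE`, `DATE_RE` and `CLOCK_RE`, and the ASCII word "half" standing in for the written half-hour, are assumptions.

```python
import re

# "%****$": a percent sign, four digits, a dollar sign.
FEATURE_RE = re.compile(r"%\d{4}\$")
# "*#", "*-*#", "*-*-*#": up to two "number-" groups, then a day number and "#".
DATE_RE = re.compile(r"(?:\d+-){0,2}\d+#")
# "*:" and "*:*::", plus an optional written "half" after the hour (assumption).
CLOCK_RE = re.compile(r"\d+:(?:\d+::|half)?")

third_sequence = "%2008$%4020$8:half~11:"

print(FEATURE_RE.findall(third_sequence))  # ['%2008$', '%4020$']
print(CLOCK_RE.findall(third_sequence))    # ['8:half', '11:']
print(DATE_RE.findall("2015-1-19#"))       # ['2015-1-19#']
```

Because the feature strings end in "$" rather than ":", their digits are not picked up by the clock pattern, which is one reason a distinctive delimiter pair is convenient for the feature format.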
Step 310: determine time information according to the matched character strings.
Specifically, the terminal 200 can determine absolute time information from the strings that match the preset absolute time format. For example, if the strings "8: half" and "11:" are matched, the absolute times of day can be determined to be "8:30" and "11:00". The terminal 200 can determine relative time information from the strings that match the preset feature format. For example, if the strings "%2008$" and "%4020$" are matched, the relative time can be determined to be next Monday and evening. The time information can therefore be determined to be next Monday evening from 8:30 to 11:00.
In one embodiment, the time information identification method further includes: judging whether the time information satisfies time information rules, and discarding the time information if it does not. For example, if the identified time information includes February 29, 2015, then because 2015 is not a leap year the identified February 29, 2015 is discarded. Likewise, time information such as April 31, month 13 or 26 o'clock does not satisfy the time information rules and is discarded.
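The rule check can be sketched by letting the standard library reject impossible calendar dates. The helper names `is_valid_date` and `is_valid_hour` are assumptions, and treating 0 to 23 as the valid hour range is also an assumption, since the source only gives 26 o'clock as an invalid example.

```python
from datetime import date


def is_valid_date(year, month, day):
    """Return True if the calendar date exists (handles leap years and month lengths)."""
    try:
        date(year, month, day)
    except ValueError:
        return False
    return True


def is_valid_hour(hour):
    """Return True for hours 0..23 (assumed valid range)."""
    return 0 <= hour <= 23


print(is_valid_date(2015, 2, 29))  # False: 2015 is not a leap year
print(is_valid_date(2015, 4, 31))  # False: April has 30 days
print(is_valid_hour(26))           # False
```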
In one embodiment, the time information identification method further includes: judging whether the time information has expired, and discarding the time information if it has. This embodiment suits scenarios with stricter timeliness requirements, such as setting to-do items.
In the above time information identification method, the preset time-related characters in the first character sequence are first mapped to the corresponding preset standard characters to obtain the second character sequence. In this way the many near-equivalent ways of expressing a time are unified into preset standard characters, which simplifies subsequent identification. The preset relative time words in the second character sequence are then mapped to the corresponding feature strings in the preset feature format to obtain the third character sequence. All time information in the third character sequence has thereby been converted into formatted data, so character strings that match the preset absolute time format and the preset feature format can be found by format matching, and the time information can be determined from the matched strings. Time information can thus be identified from natural-language text, various operations can be performed according to the identified time information, and the method can be applied in any scenario that requires time information.
In one embodiment, step 310 includes: obtaining absolute time information and relative time information, and performing time offset processing on the absolute time information according to the relative time information, where at least one of the absolute time information and the relative time information is determined from the matched character strings. This embodiment imitates the way the human brain processes time information: it first attends to the absolute time-of-day information and/or absolute date information, and then uses the relative time-of-day information and/or relative date information to modify it, so that time information expressed in words can be analyzed accurately.
In one embodiment, step 310 includes: obtaining the absolute time-of-day information corresponding to an absolute time-of-day string that matches the preset absolute time-of-day format included in the preset absolute time format, and performing time offset processing on the absolute time-of-day information according to a feature string that matches the preset feature format, where the matched feature string is adjacent to the matched absolute time-of-day string.
Specifically, the preset absolute time format includes a preset absolute time-of-day format and a preset absolute date format. The corresponding absolute time-of-day information can be determined directly from a string that matches the preset absolute time-of-day format. For example, from "8: half" and "11:" the absolute time-of-day information can be determined directly to be "8:30" and "11:00".
The adjacency of the two kinds of strings indicates that the string expressing relative time information modifies the string expressing absolute time information. The terminal 200 performs time offset processing on the absolute time information according to the feature string that matches the preset feature format. Specifically, each feature string may correspond to its own time span, so the terminal 200 can offset the absolute time information according to the time span of the feature string. For example, if the third character sequence is "%2008$ %4020$ 8: half ~ 11: hold department-meeting-", the identified absolute time-of-day information includes "8:30". The feature string "%4020$" before this absolute time information represents evening, and its time span may be [20:00, 24:00], so "8:30" can be offset by 12 hours to obtain "20:30". The time span can be determined from the default start time in Table 1.
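The span-based offset can be sketched as follows. This is only an illustration of the single example given above: the function name `offset_into_span` and the rule "shift by 12 hours only if that moves the time into the span" are assumptions.

```python
def offset_into_span(hour, minute, span_start_hour, span_end_hour=24):
    """Shift a clock reading by 12 hours if that moves it into the given time span.

    Returns an (hour, minute) tuple; the reading is returned unchanged when it
    already lies in the span or when the shift would not bring it into the span.
    """
    minutes = hour * 60 + minute
    lo, hi = span_start_hour * 60, span_end_hour * 60
    if lo <= minutes <= hi:
        return hour, minute
    shifted = minutes + 12 * 60
    if lo <= shifted <= hi:
        return divmod(shifted, 60)
    return hour, minute


# "evening" with the span [20:00, 24:00] applied to "8:30" and "11:00".
print(offset_into_span(8, 30, 20))   # (20, 30)
print(offset_into_span(11, 0, 20))   # (23, 0)
```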
In one embodiment, the step of performing time offset processing on the absolute time information according to the feature string that matches the preset feature format specifically includes: obtaining the offset calculation type code and the offset calculation parameter in the feature string that matches the preset feature format, and performing time offset processing on the absolute time information using the offset function that corresponds to the offset calculation type code, according to the offset calculation parameter.
Specifically, a feature string in the preset feature format includes an offset calculation type code and an offset calculation parameter. The offset calculation type code indicates the offset function used for the corresponding relative time word when the offset is calculated, and the offset calculation parameter is the parameter required for the offset processing.
Specifically, if the preset feature format is "%****$", "****" represents four digits: from left to right, the first digit is the offset calculation type code and the last two digits are the offset calculation parameter. For example, if the matched feature string is "%4020$", the offset calculation type code is 4, meaning that the offset function of type 4 is used, and the offset calculation parameter indicates that the corresponding time range starts no earlier than 20:00. Because "8:30" is not within this range, offset processing is applied to obtain "20:30". In one embodiment, the values of the offset calculation type code and the corresponding offset functions may be as shown in Table 4:
Table 4:
In one embodiment, step 310 comprises: obtaining the absolute time information corresponding to an absolute moment character string that matches the preset absolute moment format included in the preset absolute time format, and performing time offset processing on the absolute time information according to current system time information. Specifically, terminal 200 obtains the current system time information from its system clock; for example, if the current system time is 16:45 and the absolute time information is "8:30", then "8:30" can be offset by 12 hours to obtain "20:30".
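The two offset variants described above (by an adjacent feature string such as "%4020$", and by the current system time) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function names are hypothetical, and the "add 12 hours" rules are inferred only from the two worked examples, since the offset functions of Table 4 are not reproduced in this text.

```python
from datetime import time


def parse_feature(feature: str) -> tuple[int, int]:
    """Split a feature string such as "%4020$" into (type code, parameter)."""
    digits = feature.strip("%$")
    # First digit: offset calculation type code; last two digits: parameter.
    return int(digits[0]), int(digits[-2:])


def shift_by_span(moment: time, span_start_hour: int) -> time:
    """Assumed rule: add 12 hours when a 12-hour-clock moment lies before the span start."""
    if moment.hour < 12 and moment.hour < span_start_hour:
        return moment.replace(hour=moment.hour + 12)
    return moment


def shift_by_now(moment: time, now: time) -> time:
    """Assumed rule: add 12 hours when the moment has already passed today."""
    if moment.hour < 12 and moment < now:
        return moment.replace(hour=moment.hour + 12)
    return moment


type_code, start_hour = parse_feature("%4020$")
evening = shift_by_span(time(8, 30), start_hour)       # "8:30" after an evening feature
later_today = shift_by_now(time(8, 30), time(16, 45))  # "8:30" seen at 16:45
```

Both calls reproduce the "8:30" to "20:30" shift from the examples; a real offset function would also need to handle moments that still fall outside the span after the shift.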
In one embodiment, step 310 comprises: obtaining current system date information as the absolute date information, and performing time offset processing on the absolute date information according to the feature string that matches the preset feature format. Specifically, terminal 200 takes the date on which the SMS message was received, 2015-1-16, as the absolute date information. The matched feature string "%2008$" corresponds to next Monday, and 2015-1-16 is a Friday, so offset processing on the absolute date information 2015-1-16 yields 2015-1-19.
In one embodiment, the step of performing time offset processing on the absolute date information according to the feature string that matches the preset feature format specifically comprises: obtaining the offset calculation type code and the offset calculation parameter from the feature string that matches the preset feature format, and performing time offset processing on the absolute date information using the offset function corresponding to the offset calculation type code, according to the offset calculation parameter.
For example, when the feature string "%2008$" (next Monday) is matched, the code snippet is as follows:
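The original snippet is not reproduced in this text. As a stand-in, the following minimal Python sketch shows one way the "next Monday" date offset could be computed. The function name is hypothetical, and because the way the parameter "08" encodes "next Monday" is not specified here, the sketch takes the target weekday directly.

```python
from datetime import date, timedelta


def next_week_weekday(base: date, iso_weekday: int) -> date:
    """Return the given ISO weekday (1 = Monday .. 7 = Sunday) in the week after base's week."""
    # Step back to the Monday of the week that contains the base date.
    monday_this_week = base - timedelta(days=base.isoweekday() - 1)
    # Move one week forward, then on to the requested weekday.
    return monday_this_week + timedelta(days=7 + iso_weekday - 1)


received = date(2015, 1, 16)  # a Friday, as in the example above
next_monday = next_week_weekday(received, 1)
```

With the receiving date 2015-1-16 this yields 2015-1-19, matching the example in the text.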
Similarly, relative time information such as "next week X", "next X month", and "next year" can be handled as in this embodiment: obtain the offset calculation type code and the offset calculation parameter, and perform time offset processing on the absolute date information using the offset function corresponding to the offset calculation type code, according to the offset calculation parameter.
As shown in Figure 6, in one embodiment, the time information identification method further comprises a step of generating link text, which specifically comprises the following steps:
Step 602: determine the time start position and the time end position in the first character string that correspond to the time information.
Specifically, the time start position is the position in the first character string of the first character used to express the time information, and the corresponding time end position is the position of the last such character. For example, suppose the identified time information corresponds to "next Monday evening at half past eight" (下星期一晚上八点半) and "eleven o'clock" (十一点) in the first character string. One time start position is then the character "下" and the corresponding time end position is "半"; another time start position is the "十" of "十一点", and the corresponding time end position is the "点" of "十一点".
Step 604: mark the link text in the first character string according to the time start position and the time end position.
Specifically, a hyperlink is created for the character string formed by the characters between the time start position and the time end position in the first character string, which generates link text used to trigger a preset operation. Terminal 200 generates the link text from the character string that runs from the time start position to the time end position. The preset operation is, for example, copying or saving the link text itself, or triggering entry into a specific application.
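Marking link text from start and end positions can be illustrated with a short Python sketch. This is only an illustration under assumptions: the function name is hypothetical, positions are treated as inclusive character indexes, and the sample sentence is an assumed reconstruction of this document's example rather than a quotation.

```python
def split_by_spans(text: str, spans: list[tuple[int, int]]) -> list[tuple[str, bool]]:
    """Split text into (segment, is_link) pieces; spans are inclusive (start, end) indexes."""
    pieces = []
    cursor = 0
    for start, end in sorted(spans):
        if start > cursor:
            pieces.append((text[cursor:start], False))  # plain text before the link
        pieces.append((text[start:end + 1], True))      # the link text itself
        cursor = end + 1
    if cursor < len(text):
        pieces.append((text[cursor:], False))           # trailing plain text
    return pieces


# Assumed reconstruction of the example sentence (20 characters).
sentence = "下星期一晚上八点半到十一点举行部门年会。"
pieces = split_by_spans(sentence, [(0, 8), (10, 12)])
```

A user interface layer would then attach a hyperlink to each piece whose flag is true and render the rest as ordinary text.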
In this embodiment, generating link text that corresponds to the time information allows developers to trigger various operations based on the identified time information, which adds interaction modes.
As shown in Figure 7, in one embodiment, step 602 comprises the following steps:
Step 702: mark the character type of each character in the third character string. In ascending order of priority, the character types are: an original character type that has not undergone mapping processing, a first character type that has undergone mapping processing, a second character type representing a relative date, a third character type representing a relative moment, a fourth character type representing an absolute date, and a fifth character type representing an absolute moment.
Specifically, the character types can be represented by character flag bits: the original character type is represented by flag bit 0, and the first to fifth character types are represented by the integer flag bits 1 to 5 respectively.
When step 304 is performed, the index mapping table of the first character string shown in Table 5 is obtained. Here idx is the index number. The inter-table mapping value map establishes the mapping between Table 5 and the next index mapping table, Table 6: each map value in Table 5 is the start index of the corresponding character in the second character string. char is each character of the first character string, and tag is the character flag bit assigned to each character while the first character string is mapped to the second character string. As can be seen, the characters of the first character string that undergo mapping are all marked with the first character type, 1, for example "next Monday", "eight o'clock", "to", "eleven", "o'clock", "year" and the full stop, while the characters that are not mapped are all marked with the original character type, 0. It is specified that all characters within the same time word share one character type and one map value.
Table 5:
idx 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
map 0 0 0 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
char under star phase one evening on eight point half arrive ten one point lift oK portion door year meeting .
tag 1 1 1 1 0 0 1 1 0 1 1 1 1 0 0 0 0 1 0 1
After step 304 is performed, the index mapping table of the second character string shown in Table 6 can be obtained from Table 5. The index number idx in Table 6 is determined from the map values in Table 5, and the character flag bit tag in Table 6 is determined from the character flag bit tag in Table 5. The inter-table mapping value map in Table 6 establishes the mapping between Table 6 and the next index mapping table, Table 7: each map value in Table 6 is the start index of the corresponding character in the third character string. It is specified that all characters within the same time word share one character type and one map value.
Table 6:
idx 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
map 0 0 0 6 6 12 13 14 15 16 17 18 19 20 21 22 23 24 25
char Under Week 1 Evening On 8 : Half ~ 1 1 : Lift OK Portion Door - Meeting -
tag 1 1 1 0 0 1 1 0 1 1 1 1 0 0 0 0 1 0 1
After step 306 is performed, the index mapping table of the third character string shown in Table 7 is obtained from Table 6. The index number idx in Table 7 is determined from the map values in Table 6, and the character flag bit tag in Table 7 is determined from the character flag bit tag in Table 6. It is specified that all characters within the same time word share one character type.
Table 7:
idx 0 1 2 3 4 5 6 7 8 9 10 11 12
char % 2 0 0 8 $ % 4 0 2 0 $ 8
tag 1 1 1 1 1 1 1 1 1 1 1 1 1
idx 13 14 15 16 17 18 19 20 21 22 23 24 25
char : Half ~ 1 1 : Lift OK Portion Door - Meeting -
tag 1 0 1 1 1 1 0 0 0 0 1 0 1
After step 308 is performed, the character types of the matched character strings are available. Table 7 is updated by marking the second character type with flag bit 2, the third character type with flag bit 3, the fourth character type with flag bit 4, and the fifth character type with flag bit 5, which yields Table 8.
Table 8:
idx 0 1 2 3 4 5 6 7 8 9 10 11 12
char % 2 0 0 8 $ % 4 0 2 0 $ 8
tag 2 2 2 2 2 2 3 3 3 3 3 3 5
idx 13 14 15 16 17 18 19 20 21 22 23 24 25
char : Half ~ 1 1 : Lift OK Portion Door - Meeting -
tag 5 5 1 5 5 5 0 0 0 0 1 0 1
Step 704: determine the character type of each character in the first character string according to the character types marked for the corresponding third character string.
Specifically, inverse mapping according to Table 8 yields Table 9, which is the index mapping table of the second character string (originally Table 6) with updated character flag bits.
Table 9:
idx 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
map 0 0 0 6 6 12 13 14 15 16 17 18 19 20 21 22 23 24 25
char Under Week 1 Evening On 8 : Half ~ 1 1 : Lift OK Portion Door - Meeting -
tag 2 2 2 3 3 5 5 5 1 5 5 5 0 0 0 0 1 0 1
Inverse mapping according to Table 9 then yields Table 10, which is the index mapping table of the first character string (originally Table 5) with updated character flag bits.
Table 10:
idx 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
map 0 0 0 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
char under star phase one evening on eight point half arrive ten one point lift oK portion door year meeting .
tag 2 2 2 2 3 3 5 5 5 1 5 5 5 0 0 0 0 1 0 1
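The inverse mapping from Table 8 to Table 9 and then to Table 10 amounts to looking up, for each character of an earlier string, the flag bit at its map index in the later string. The Python sketch below uses the map and tag rows of the tables above; the function and variable names are hypothetical and the patent does not prescribe this data layout.

```python
def inverse_map_tags(index_map: list[int], next_tags: list[int]) -> list[int]:
    """For each character, take the flag bit found at its map index in the next table."""
    return [next_tags[target] for target in index_map]


# map row of Table 6 (second string -> third string) and of Table 5 (first -> second).
MAP_6 = [0, 0, 0, 6, 6] + list(range(12, 26))
MAP_5 = [0, 0, 0] + list(range(2, 19))

# tag row of Table 8 (third character string after format matching).
TAGS_8 = [2] * 6 + [3] * 6 + [5, 5, 5, 1, 5, 5, 5, 0, 0, 0, 0, 1, 0, 1]

tags_9 = inverse_map_tags(MAP_6, TAGS_8)   # flag bits of the second character string
tags_10 = inverse_map_tags(MAP_5, tags_9)  # flag bits of the first character string
```

Because every character of a time word shares one map value, reading the flag bit at the start index is enough to propagate the type to all of its characters.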
Step 706: determine the time start position and the time end position in the first character string according to the character type of each character in the first character string and the corresponding priorities.
Specifically, as shown in Figure 8, in one embodiment, step 706 comprises the following steps:
Step 802: determine the character in the first character string whose character type has the highest priority.
Specifically, as shown in Table 10, the character types of the characters in the first character string are traversed in order and the value with the highest priority is taken. Here the maximum character flag bit, 5, is obtained; the corresponding index number is idx=6 and the corresponding character is "eight".
Step 804: search for the characters that are continuously adjacent to the highest-priority character and whose priority is higher than that of the first character type.
Specifically, multiple characters being continuously adjacent means that they join into one continuous character string. As shown in Table 10, traversing toward the start from idx=5 and toward the end from idx=7 finds the characters whose character flag bit is greater than 1, namely the characters at idx=[0,5] and idx=[7,9).
Step 806: determine the time start position and the time end position from the boundaries of the character string formed by the highest-priority character and the characters found.
Specifically, as shown in Table 10, the boundaries idx=0 and idx=8 of the character string formed by the characters at idx=[0,9) are taken as the time start position and the time end position respectively.
The process then continues on the remaining part of the first character string: determine the highest-priority character, search for the continuously adjacent characters whose priority is higher than that of the first character type, and determine a time start position and a time end position from the boundaries of the resulting character string, until the priority of every character type in the remaining part is less than or equal to that of the first character type. In this way, another time start position, idx=10, and another time end position, idx=12, are determined from the remaining part of the first character string corresponding to Table 10. The time information finally obtained is 2015-01-19 20:30 ~ 2015-01-19 23:00.
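Steps 802 to 806 and the repetition described above can be sketched in Python as follows, using the tag row of Table 10. This is a minimal sketch with hypothetical names; it assumes that characters already assigned to a span are excluded from later rounds and that positions are inclusive indexes.

```python
def find_time_spans(tags: list[int], floor: int = 1) -> list[tuple[int, int]]:
    """Return inclusive (start, end) spans grown around the highest-priority characters."""
    used = [False] * len(tags)
    spans = []
    while True:
        # Remaining characters whose priority is above the first character type.
        candidates = [i for i, t in enumerate(tags) if not used[i] and t > floor]
        if not candidates:
            break
        peak = max(candidates, key=lambda i: tags[i])  # first index with the highest flag bit
        start = end = peak
        # Grow toward the start, then toward the end, over adjacent qualifying characters.
        while start > 0 and not used[start - 1] and tags[start - 1] > floor:
            start -= 1
        while end + 1 < len(tags) and not used[end + 1] and tags[end + 1] > floor:
            end += 1
        for i in range(start, end + 1):
            used[i] = True
        spans.append((start, end))
    return sorted(spans)


TAGS_10 = [2, 2, 2, 2, 3, 3, 5, 5, 5, 1, 5, 5, 5, 0, 0, 0, 0, 1, 0, 1]
spans = find_time_spans(TAGS_10)
```

For Table 10 the first round starts at idx=6 and grows to idx=0 and idx=8, and the second round yields idx=10 to idx=12, in line with the positions given in the text.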
In this embodiment, terminal 200 can accurately parse a precise time from a natural-language expression, which enables accurate identification of time information expressed in text.
As shown in Figure 9, in one embodiment, the time information identification method further comprises a step of configuring a preset time-related application according to the time information, which specifically comprises the following steps:
Step 902: detect a trigger operation on the link text.
Specifically, as shown in Figure 10, terminal 200 receives multiple SMS messages through the phone book application, and the time information in each SMS message is identified and marked as link text. Trigger operations on the link text include a cursor click, a cursor double-click, a tap, a long press, and gesture operations.
Step 904: enter the configuration page of the preset time-related application according to the trigger operation.
After detecting the trigger operation on the link text, terminal 200 enters the configuration page used for configuration related to the time information. Preset time-related applications include calendar applications, to-do applications, alarm clock applications, timer applications, and the like.
For example, as shown in Figure 11, the content of an SMS message is "tomorrow morning 9 starts fine honest jump", in which "nine o'clock tomorrow morning" is identified and marked as link text. When the user clicks this link text, terminal 200 opens an action menu. The action menu shows a first option for copying the time information and a second option for entering the to-do application; when the user clicks the second option, terminal 200 enters the configuration page of the to-do application shown in Figure 12.
Step 906: automatically enter at least one of the time information, the first character string, and the source of the first character string into the configuration page.
Specifically, terminal 200 enters the time information into the time selection control of the configuration page, and enters the first character string and/or the source of the first character string into the text description input box. As shown in Figure 12, after the configuration page of the to-do application is entered, the time represented by the identified time information is entered directly as the start time of the to-do item, and "tomorrow morning 9 starts fine honest jump (SMS message from the phone book application)" is entered automatically. After detecting a confirmation instruction, terminal 200 saves the configured to-do item. Finally, as shown in Figure 13, terminal 200 displays the configured to-do list.
In this embodiment, the characters in the first character string that correspond to the identified time information are marked as link text, and the user can enter the preset time-related application and have it configured automatically by triggering this link text. This greatly simplifies the steps of configuring the preset time-related application and improves ease of operation.
With reference to Figure 19, in a specific embodiment, the time information identification method is applied in the Android operating system. The original link marking utility class Linkify, which identifies telephone numbers and e-mail addresses, is extended by adding a custom link marking utility class PhoneBookLinkify. A custom Chinese time identification utility class SmartCalendarUtil is also added. In this embodiment, the time information identification method specifically comprises the following steps:
1): Context (the context environment, i.e. the controller of the activity) requests the custom link marking utility class PhoneBookLinkify to create link text by calling the link adding function addLinks().
2): PhoneBookLinkify obtains the first character string through the string conversion function toString(). The first character string is the input text to be processed.
3): PhoneBookLinkify calls the Chinese time identification utility class SmartCalendarUtil through the parsing function parse().
4): SmartCalendarUtil calls the normalization function normalize() and uses the first dictionary tree NormalizerTree to perform approximate-expression normalization. Specifically, according to the mapping between preset time-related characters and preset standard characters, the preset time-related characters in the first character string are mapped in order to the corresponding preset standard characters to obtain the second character string. The second character string is the text after approximate-expression normalization.
5): SmartCalendarUtil calls the normalization function normalize() and uses the second dictionary tree NormalizerTree to perform feature-value normalization. Specifically, according to the mapping between preset relative time words and feature strings in the preset feature format, the preset relative time words in the second character string are mapped in order to the corresponding feature strings to obtain the third character string. The third character string is the text after feature-value normalization.
6): SmartCalendarUtil calls the format matching function recognize() and uses regular expressions to perform format matching. Specifically, the third character string is searched in order for character strings that match the preset absolute time format and the preset feature format respectively.
7): SmartCalendarUtil uses the absolute moment parsing function parseAbsoluteTime() to obtain the absolute time information. Specifically, the absolute moment character string that matches the preset absolute moment format is converted into absolute time information.
8): SmartCalendarUtil uses the absolute date parsing function parseAbsoluteDate() to obtain the absolute date information. Specifically, the absolute date character string that matches the preset absolute date format is converted into absolute date information.
9): SmartCalendarUtil uses the time processing function handle() to process the relative time information. Specifically, time offset processing is performed on the absolute time information and the absolute date information according to the feature string that matches the preset feature format.
10): SmartCalendarUtil uses the update function update() to update the time information and obtain the start and end index positions of the link, which yields the parse result ParserResult. Specifically, the character type of each character in the third character string is marked, the character type of each character in the first character string is determined from the character types marked for the corresponding third character string, and the time start position and the time end position in the first character string are determined according to the character type of each character in the first character string and the corresponding priorities.
11): PhoneBookLinkify obtains the parse result ParserResult, generates the link text date schema url, and wraps it with the link function LinkSpec() supported by the link marking utility class Linkify.
12): PhoneBookLinkify calls the link application function applylink() to apply the created link.
13): PhoneBookLinkify calls the link setting function setSpan() to generate the target text text:spannable carrying the link text. Specifically, the link text is set on the first character string to obtain the first character string with link text.
14): Context calls the text setting function setText() to display the target text with link text in the text view control TextView. Specifically, the first character string with link text is displayed in the text view control TextView.
15): The text view control TextView uses the click event function onclick() to detect click events on the link text URLSpan displayed in the TextView.
16): After a click event is detected, Context uses the interactive component start function startActivity() to start the interactive component Activity() corresponding to the displayed link text URLSpan, and interacts with the user through the interactive component Activity().
As shown in Figure 14, in one embodiment, a time information identification device 1400 is provided, which has the functions for implementing the time information identification method of each of the above embodiments. The time information identification device 1400 comprises: a first character string acquisition module 1401, a first mapping module 1402, a second mapping module 1403, a match lookup module 1404, and a time information determination module 1405.
The first character string acquisition module 1401 is configured to obtain the first character string.
A character string is data formed by multiple characters arranged in sequence. The first character string acquisition module 1401 may be used to obtain the text of a message received in an instant messaging application as the first character string, or to take an SMS message received by a mobile phone as the first character string. It may also be used to receive a selection instruction and select a character string as the first character string according to that instruction.
The first mapping module 1402 is configured to map, in order, the preset time-related characters in the first character string to the corresponding preset standard characters according to the mapping between preset time-related characters and preset standard characters, to obtain the second character string.
A preset time-related character is a preset character related to time, and a preset standard character is a preset uniform character used to express time. The mapping between preset time-related characters and preset standard characters may be many-to-one. "In order" means following the order of the characters in the first character string.
In one embodiment, the mapping between preset time-related characters and preset standard characters comprises the mappings shown in Table 2. The sequence numbers in Table 2 serve only to distinguish the mappings and do not impose an order. The content of Table 2 can be extended or changed as required.
In one embodiment, a dictionary tree may be used to represent the mapping between preset time-related characters and preset standard characters. A dictionary tree, also called a word lookup tree (trie), is a tree structure that uses the common prefixes of character strings to reduce query time and to minimise unnecessary string comparisons, which gives high query efficiency.
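A dictionary tree of this kind can be sketched in Python as follows. This is a minimal illustration, not the NormalizerTree of the embodiment: the class and function names are hypothetical, longest-prefix matching is an assumption, and the sample mapping entries are invented for illustration and are not the contents of Table 2.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # next character -> TrieNode
        self.value = None   # standard character(s) if a key ends at this node


def build_trie(mapping: dict[str, str]) -> TrieNode:
    root = TrieNode()
    for key, value in mapping.items():
        node = root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.value = value
    return root


def normalize(text: str, root: TrieNode) -> str:
    """Replace the longest matching key at each position; copy unmatched characters."""
    out, i = [], 0
    while i < len(text):
        node, j, best_len, best_val = root, i, 0, None
        while j < len(text) and text[j] in node.children:
            node = node.children[text[j]]
            j += 1
            if node.value is not None:
                best_len, best_val = j - i, node.value
        if best_val is None:
            out.append(text[i])
            i += 1
        else:
            out.append(best_val)
            i += best_len
    return "".join(out)


# Hypothetical entries: several week words map to one standard form (many-to-one).
SAMPLE = {"下星期": "下周", "星期": "周", "礼拜": "周", "八": "8", "九": "9", "点": ":", "到": "~"}
trie = build_trie(SAMPLE)
second = normalize("下星期三八点到九点", trie)
```

Sharing the prefix walk means each input position is examined once per matching branch, rather than being compared against every key in the mapping.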
The second mapping module 1403 is configured to map, in order, the preset relative time words in the second character string to the corresponding feature strings according to the mapping between preset relative time words and feature strings in the preset feature format, to obtain the third character string.
A preset relative time word is a word used to express relative time. It is not limited to text and may also contain digits or designated symbols, for example "week 1" or "week #". A feature string is a feature value that represents the corresponding preset relative time word. Each feature string has the preset feature format, so feature strings can be identified easily by that format. The preset feature format may be, for example, "%****$", where each "*" represents a digit; character strings in the form "%****$" can be found easily with a regular expression. The preset feature format may also take other forms, such as "$****%". The mapping between preset relative time words and feature strings in the preset feature format is one-to-one.
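Locating feature strings with a regular expression can be sketched in Python as follows. This is a minimal sketch assuming the "%****$" format with the first digit as the type code and the last two digits as the parameter; the pattern and the function name are illustrative and not taken from the embodiment.

```python
import re

# "%", one type-code digit, one further digit, two parameter digits, "$".
FEATURE = re.compile(r"%(\d)\d(\d{2})\$")


def find_features(third: str) -> list[tuple[int, int, int, int]]:
    """Return (start, end, type_code, parameter) per feature string; end is exclusive."""
    return [
        (m.start(), m.end(), int(m.group(1)), int(m.group(2)))
        for m in FEATURE.finditer(third)
    ]


features = find_features("%2008$%4020$8:half~11:")
```

Keeping the start and end offsets alongside the decoded values makes it possible to tell which feature string is adjacent to which absolute time match.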
In one embodiment, the mapping between preset relative time words and feature strings in the preset feature format comprises the mappings shown in Table 3. The sequence numbers in Table 3 serve only to distinguish the mappings and do not impose an order. The content of Table 3 can be extended or changed as required.
The match lookup module 1404 is configured to search the third character string, in order, for character strings that match the preset absolute time format and the preset feature format respectively.
Specifically, the match lookup module 1404 adopts format matching: using the preset absolute time format and the preset feature format as templates, it searches the third character string in order for character strings that match the corresponding format. "In order" means following the order of the characters in the third character string. An absolute time can determine a concrete time value by itself, and absolute time information is the characters used to express the corresponding absolute time. A relative time cannot determine a concrete value by itself but can be combined with an absolute time to determine one, and relative time information is the characters used to express the corresponding relative time.
The preset absolute time format is based on the mapped characters. It comprises "*#" (day *), "*-*#" (month * day *), "*-*-*#" (year * month * day *), "*:" (hour *), and "*:*:" (hour * minute *), where each "*" represents a number written in digits or in words. Format matching can be implemented with regular expressions.
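Matching the absolute moment format with a regular expression can be sketched in Python as follows. This sketch rests on assumptions: the pattern covers only the "*:" form and the "half" token seen in this document's examples ("8:half", "11:"), the function name is hypothetical, and the date formats with "#" are left out.

```python
import re

# Hour digits, ":", then optionally "half" or minute digits.
MOMENT = re.compile(r"(\d{1,2}):(half|\d{1,2})?")


def parse_moments(third: str) -> list[str]:
    """Return each matched absolute moment as "H:MM"."""
    moments = []
    for m in MOMENT.finditer(third):
        hour = int(m.group(1))
        token = m.group(2)
        minute = 30 if token == "half" else int(token or 0)
        if hour < 24 and minute < 60:  # drop values that cannot be a time of day
            moments.append(f"{hour}:{minute:02d}")
    return moments


moments = parse_moments("%2008$%4020$8:half~11:hold")
```

The feature strings are skipped naturally because their digits are followed by "$" rather than ":".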
The time information determination module 1405 is configured to determine the time information according to the matched character strings.
In one embodiment, the time information determination module 1405 is further configured to judge whether the time information complies with time information rules, and to discard the time information if it does not.
In one embodiment, the time information determination module 1405 is further configured to judge whether the time information has expired, and to discard the time information if it has. This embodiment suits scenarios with stricter timeliness requirements, such as setting to-do items.
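The two checks above can be sketched in Python as follows. The time information rules are not spelled out in this text, so the sketch assumes the simplest possible rule, namely that the fields must form a valid calendar date and time; the function names are hypothetical.

```python
from datetime import datetime
from typing import Optional


def build_datetime(year: int, month: int, day: int, hour: int, minute: int) -> Optional[datetime]:
    """Return a datetime, or None when the fields do not form a valid date and time."""
    try:
        return datetime(year, month, day, hour, minute)
    except ValueError:
        return None


def accept(candidate: Optional[datetime], now: datetime) -> bool:
    """Keep the candidate only if it is valid and has not expired."""
    return candidate is not None and candidate >= now


now = datetime(2015, 1, 16, 16, 45)
kept = accept(build_datetime(2015, 1, 19, 20, 30), now)
invalid = build_datetime(2015, 2, 30, 8, 0)
```

An application that does not care about timeliness could skip the comparison with the current time and keep only the validity check.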
Above-mentioned temporal information recognition device 1400, first the Preset Time relevant character in the first character string is mapped as corresponding preset standard character to obtain the second character string, the multiple approximate expression unification preset standard character of time can be expressed, to facilitate follow-up identification like this.Then the default relative time word in the second character string is mapped as the feature string of corresponding default characteristic format to obtain three-character doctrine sequence.Temporal information in such three-character doctrine sequence all transforms the data in order to format, and by format match mode, searches the character string of mating with default absolute time format and default characteristic format, and then utilizes the character string of coupling just can determine temporal information.Can realize like this identifying temporal information from natural language word, and then various operation can be carried out according to the temporal information identified, be applied to various needs in the scene of temporal information.
In one embodiment, temporal information determination module 1405 also for obtaining absolute time information and relative time information, and carries out time migration process according to relative time information to absolute time information; At least one wherein in absolute time information and relative time information is determined according to the character string of coupling.In the present embodiment, the feature of simulation human brain processing time information, first pay close attention to absolute time information and/or absolute date information, rear use relative time information and/or date and time information are modified absolute time information and/or absolute date information, can go out the temporal information of literal expression by accurate analysis.In one embodiment, temporal information determination module 1405 also for obtaining the absolute time information carved with the absolute time of the default absolute moment format match included by default absolute time format corresponding to character string, and carries out time migration process according to the feature string mated with default characteristic format to absolute time information; It is adjacent that the feature string of coupling and the absolute time mated carve character string.
In one embodiment, temporal information determination module 1405 also for obtaining the absolute time information carved with the absolute time of the default absolute moment format match included by default absolute time format corresponding to character string, and carries out time migration process according to current system time information to absolute time information.
In one embodiment, temporal information determination module 1405 also for obtaining current system date and time information as absolute date information, and carries out time migration process according to the feature string mated with default characteristic format to absolute date information.
In one embodiment, temporal information determination module 1405, also for obtaining calculations of offset type code position in the feature string that mates with default characteristic format and calculations of offset parameter, adopting this offset function corresponding to calculations of offset type code position and carrying out time migration process according to this calculations of offset parameter to absolute time information.
In one embodiment, temporal information determination module 1405, also for obtaining calculations of offset type code position in the feature string that mates with default characteristic format and calculations of offset parameter, adopting this offset function corresponding to calculations of offset type code position and carrying out time migration process according to this calculations of offset parameter to absolute date information.
As shown in Figure 15, in one embodiment, the temporal information recognition device 1400 further comprises a position determination module 1406 and a link text marking module 1407.
The position determination module 1406 is configured to determine the time starting position and the time end position in the first character string that correspond to the temporal information. Specifically, the time starting position is the position in the first character string of the first character used to express the temporal information, and the corresponding time end position is the position in the first character string of the last character used to express the temporal information. For example, suppose the recognized temporal information corresponds to "下周一晚上八点半" (half past eight next Monday evening) and "十一点" (eleven o'clock) in the first character string. One time starting position is then "下" and the corresponding time end position is "半"; the other time starting position is the "十" of "十一点", and the corresponding time end position is the "点" of "十一点".
The link text marking module 1407 is configured to mark link text in the first character string according to the time starting position and the time end position. Specifically, the link text marking module 1407 creates a hyperlink for the character string formed by the characters between the time starting position and the time end position in the first character string, generating link text that triggers a preset operation. In other words, the link text marking module 1407 turns the character string from the time starting position to the time end position into link text. The preset operation is, for example, copying or saving the link text itself, or triggering entry into a specific application.
In this embodiment, generating link text corresponding to the temporal information allows developers to trigger various operations based on the recognized temporal information, which adds further modes of interaction.
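As an illustration of marking link text from start and end positions, the sketch below wraps each span in an HTML anchor. The HTML output, the `#time-N` link target and the function name `mark_links` are assumptions for the example; the source does not specify how the hyperlink is represented.

```python
from html import escape
from typing import List, Tuple


def mark_links(text: str, spans: List[Tuple[int, int]]) -> str:
    """Wrap each (start, end) span of `text` in a hyperlink; `end` is inclusive."""
    parts = []
    cursor = 0
    for index, (start, end) in enumerate(sorted(spans)):
        # Text before the span is copied through unchanged (HTML-escaped).
        parts.append(escape(text[cursor:start]))
        # Characters from the time starting position to the time end position
        # become the link text; the href is a hypothetical placeholder target.
        parts.append(f'<a href="#time-{index}">{escape(text[start:end + 1])}</a>')
        cursor = end + 1
    parts.append(escape(text[cursor:]))
    return "".join(parts)
```

Because the positions refer to the first character string, the link can be placed on the text the user actually typed, even though recognition ran on mapped intermediate strings.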
As shown in Figure 16, in one embodiment, the position determination module 1406 comprises a character type marking module 1406a, a character type mapping module 1406b and an execution module 1406c.
The character type marking module 1406a is configured to mark the character type of each character in the third character sequence. In order of priority from low to high, the character types comprise: an original character type that has not undergone mapping, a first character type that has undergone mapping, a second character type representing a relative date, a third character type representing a relative moment, a fourth character type representing an absolute date, and a fifth character type representing an absolute moment.
The character type mapping module 1406b is configured to determine the character type of each character in the first character string according to the character types marked in the corresponding third character sequence.
The execution module 1406c is configured to determine the time starting position and the time end position in the first character string according to the character type of each character in the first character string and the corresponding priority.
As shown in Figure 17, in one embodiment, the execution module 1406c comprises a first character processing module 1406c1, a second character processing module 1406c2 and a third character processing module 1406c3.
The first character processing module 1406c1 is configured to determine the character in the first character string whose character type has the highest priority.
The second character processing module 1406c2 is configured to search for characters that are contiguously adjacent to the highest-priority character and whose priority is higher than that of the first character type.
The third character processing module 1406c3 is configured to determine the time starting position and the time end position from the boundaries of the character string formed by the highest-priority character and the characters found.
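The three steps performed by the execution module can be sketched as follows. The enum `CharType`, the function `find_time_span`, the choice of the leftmost character when several share the highest priority, and returning `None` when no character ranks above the first character type are assumptions made for this example. The sketch also locates only a single span.

```python
from enum import IntEnum
from typing import List, Optional, Tuple


class CharType(IntEnum):
    """Character types, ordered by priority from low to high."""
    ORIGINAL = 0          # not changed by mapping
    MAPPED = 1            # first character type: changed by mapping
    RELATIVE_DATE = 2     # second character type
    RELATIVE_MOMENT = 3   # third character type
    ABSOLUTE_DATE = 4     # fourth character type
    ABSOLUTE_MOMENT = 5   # fifth character type


def find_time_span(types: List[CharType]) -> Optional[Tuple[int, int]]:
    """Return the (start, end) positions, both inclusive, of the time expression."""
    if not types:
        return None
    # Step 1: the character with the highest-priority type (leftmost on ties).
    peak = max(range(len(types)), key=lambda i: types[i])
    if types[peak] <= CharType.MAPPED:
        return None  # assumption: no time-bearing character present
    # Step 2: extend over contiguously adjacent characters whose priority
    # is higher than that of the first character type.
    start = end = peak
    while start > 0 and types[start - 1] > CharType.MAPPED:
        start -= 1
    while end < len(types) - 1 and types[end + 1] > CharType.MAPPED:
        end += 1
    # Step 3: the boundaries of the run are the start and end positions.
    return start, end
```

For the type sequence `[0, 2, 2, 3, 5, 5, 0]` the peak is at index 4 and the run extends left across the relative date and relative moment characters, giving positions 1 through 5.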
As shown in Figure 18, in one embodiment, the temporal information recognition device 1400 further comprises a trigger operation detection module 1408, a page trigger module 1409 and an automatic input module 1410.
The trigger operation detection module 1408 is configured to detect a trigger operation on the link text.
The page trigger module 1409 is configured to enter the configuration page of a preset time-related application according to the trigger operation.
The automatic input module 1410 is configured to automatically enter, in the configuration page, at least one of the temporal information, the first character string, and the source of the first character string.
In this embodiment, after the temporal information is recognized, the corresponding characters in the first character string are marked as link text. By triggering this link text, the user can enter the preset time-related application and have it configured automatically, which greatly simplifies the steps of configuring the preset time-related application and makes operation more convenient.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc or a read-only memory (ROM), or a random access memory (RAM), and so on.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art may make several variations and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (16)

1. A temporal information recognition method, the method comprising:
obtaining a first character string;
mapping, in order and according to a mapping relationship between preset time-related characters and preset standard characters, the preset time-related characters in the first character string to the corresponding preset standard characters, to obtain a second character string;
mapping, in order and according to a mapping relationship between preset relative time words and feature character strings having a preset feature format, the preset relative time words in the second character string to the corresponding feature character strings, to obtain a third character sequence;
searching the third character sequence, in order, for character strings that match a preset absolute time format and the preset feature format, respectively; and
determining temporal information according to the matched character strings.
2. The method according to claim 1, characterized in that determining temporal information according to the matched character strings comprises: obtaining absolute time information and relative time information, and performing time offset processing on the absolute time information according to the relative time information, wherein at least one of the absolute time information and the relative time information is determined from the matched character strings.
3. The method according to claim 1, characterized in that determining temporal information according to the matched character strings comprises:
obtaining the absolute time information corresponding to an absolute moment character string that matches the preset absolute moment format included in the preset absolute time format, and performing time offset processing on that absolute time information according to a feature character string that matches the preset feature format, the matched feature character string being adjacent to the matched absolute moment character string; and/or
obtaining the absolute time information corresponding to an absolute moment character string that matches the preset absolute moment format included in the preset absolute time format, and performing time offset processing on that absolute time information according to current system time information; and/or
obtaining current system date information as absolute date information, and performing time offset processing on the absolute date information according to a feature character string that matches the preset feature format.
4. The method according to claim 3, characterized in that performing time offset processing on the absolute time information according to the feature character string that matches the preset feature format comprises:
obtaining the offset calculation type code bit and the offset calculation parameter from the feature character string that matches the preset feature format, and performing time offset processing on the absolute time information using the offset function corresponding to that offset calculation type code bit and according to that offset calculation parameter; and/or
performing time offset processing on the absolute date information according to the feature character string that matches the preset feature format comprises:
obtaining the offset calculation type code bit and the offset calculation parameter from the feature character string that matches the preset feature format, and performing time offset processing on the absolute date information using the offset function corresponding to that offset calculation type code bit and according to that offset calculation parameter.
5. The method according to claim 1, characterized in that the method further comprises:
determining the time starting position and the time end position in the first character string that correspond to the temporal information; and
marking link text in the first character string according to the time starting position and the time end position.
6. The method according to claim 5, characterized in that determining the time starting position and the time end position in the first character string that correspond to the temporal information comprises:
marking the character type of each character in the third character sequence, the character types comprising, in order of priority from low to high: an original character type that has not undergone mapping, a first character type that has undergone mapping, a second character type representing a relative date, a third character type representing a relative moment, a fourth character type representing an absolute date, and a fifth character type representing an absolute moment;
determining the character type of each character in the first character string according to the character types marked in the corresponding third character sequence; and
determining the time starting position and the time end position in the first character string according to the character type of each character in the first character string and the corresponding priority.
7. The method according to claim 6, characterized in that determining the time starting position and the time end position in the first character string according to the character type of each character in the first character string and the corresponding priority comprises:
determining the character in the first character string whose character type has the highest priority;
searching for characters that are contiguously adjacent to the highest-priority character and whose priority is higher than that of the first character type; and
determining the time starting position and the time end position from the boundaries of the character string formed by the highest-priority character and the characters found.
8. The method according to claim 5, characterized in that the method further comprises:
detecting a trigger operation on the link text;
entering the configuration page of a preset time-related application according to the trigger operation; and
automatically entering, in the configuration page, at least one of the temporal information, the first character string, and the source of the first character string.
9. A temporal information recognition device, characterized in that the device comprises:
a first character string obtaining module, configured to obtain a first character string;
a first mapping module, configured to map, in order and according to a mapping relationship between preset time-related characters and preset standard characters, the preset time-related characters in the first character string to the corresponding preset standard characters, to obtain a second character string;
a second mapping module, configured to map, in order and according to a mapping relationship between preset relative time words and feature character strings having a preset feature format, the preset relative time words in the second character string to the corresponding feature character strings, to obtain a third character sequence;
a matching and searching module, configured to search the third character sequence, in order, for character strings that match a preset absolute time format and the preset feature format, respectively; and
a temporal information determination module, configured to determine temporal information according to the matched character strings.
10. The device according to claim 9, characterized in that the temporal information determination module is further configured to obtain absolute time information and relative time information, and to perform time offset processing on the absolute time information according to the relative time information, wherein at least one of the absolute time information and the relative time information is determined from the matched character strings.
11. The device according to claim 9, characterized in that the temporal information determination module is further configured to obtain the absolute time information corresponding to an absolute moment character string that matches the preset absolute moment format included in the preset absolute time format, and to perform time offset processing on that absolute time information according to a feature character string that matches the preset feature format, the matched feature character string being adjacent to the matched absolute moment character string; and/or
the temporal information determination module is further configured to obtain the absolute time information corresponding to an absolute moment character string that matches the preset absolute moment format included in the preset absolute time format, and to perform time offset processing on that absolute time information according to current system time information; and/or
the temporal information determination module is further configured to obtain current system date information as absolute date information, and to perform time offset processing on the absolute date information according to a feature character string that matches the preset feature format.
12. The device according to claim 11, characterized in that the temporal information determination module is further configured to obtain the offset calculation type code bit and the offset calculation parameter from the feature character string that matches the preset feature format, and to perform time offset processing on the absolute time information using the offset function corresponding to that offset calculation type code bit and according to that offset calculation parameter; and/or
the temporal information determination module is further configured to obtain the offset calculation type code bit and the offset calculation parameter from the feature character string that matches the preset feature format, and to perform time offset processing on the absolute date information using the offset function corresponding to that offset calculation type code bit and according to that offset calculation parameter.
13. The device according to claim 9, characterized in that the device further comprises:
a position determination module, configured to determine the time starting position and the time end position in the first character string that correspond to the temporal information; and
a link text marking module, configured to mark link text in the first character string according to the time starting position and the time end position.
14. The device according to claim 13, characterized in that the position determination module comprises:
a character type marking module, configured to mark the character type of each character in the third character sequence, the character types comprising, in order of priority from low to high: an original character type that has not undergone mapping, a first character type that has undergone mapping, a second character type representing a relative date, a third character type representing a relative moment, a fourth character type representing an absolute date, and a fifth character type representing an absolute moment;
a character type mapping module, configured to determine the character type of each character in the first character string according to the character types marked in the corresponding third character sequence; and
an execution module, configured to determine the time starting position and the time end position in the first character string according to the character type of each character in the first character string and the corresponding priority.
15. The device according to claim 14, characterized in that the execution module comprises:
a first character processing module, configured to determine the character in the first character string whose character type has the highest priority;
a second character processing module, configured to search for characters that are contiguously adjacent to the highest-priority character and whose priority is higher than that of the first character type; and
a third character processing module, configured to determine the time starting position and the time end position from the boundaries of the character string formed by the highest-priority character and the characters found.
16. The device according to claim 13, characterized in that the device further comprises:
a trigger operation detection module, configured to detect a trigger operation on the link text;
a page trigger module, configured to enter the configuration page of a preset time-related application according to the trigger operation; and
an automatic input module, configured to automatically enter, in the configuration page, at least one of the temporal information, the first character string, and the source of the first character string.
CN201510263225.0A 2015-05-21 2015-05-21 Temporal information recognition methods and device Active CN104951508B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510263225.0A CN104951508B (en) 2015-05-21 2015-05-21 Temporal information recognition methods and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510263225.0A CN104951508B (en) 2015-05-21 2015-05-21 Temporal information recognition methods and device

Publications (2)

Publication Number Publication Date
CN104951508A true CN104951508A (en) 2015-09-30
CN104951508B CN104951508B (en) 2017-11-21

Family

ID=54166166

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510263225.0A Active CN104951508B (en) 2015-05-21 2015-05-21 Temporal information recognition methods and device

Country Status (1)

Country Link
CN (1) CN104951508B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030051211A1 (en) * 2001-08-06 2003-03-13 Hitachi, Ltd. Time information display system
JP2004252861A (en) * 2003-02-21 2004-09-09 Canon Inc Information processing apparatus
CN1901711A (en) * 2005-07-20 2007-01-24 乐金电子(中国)研究开发中心有限公司 Mobile communication terminal with message-based calendar management function and acting method thereof
CN101140570A (en) * 2006-09-04 2008-03-12 富士施乐株式会社 Translating device, translating method and computer readable medium
CN102955832A (en) * 2011-08-31 2013-03-06 深圳市华傲数据技术有限公司 Correspondence address identifying and standardizing system
CN103093334A (en) * 2011-11-04 2013-05-08 周超然 Method of activity notice text recognition and transforming automatically into calendar term
CN104268157A (en) * 2014-09-03 2015-01-07 乐视网信息技术(北京)股份有限公司 Device and method for error correction in data search

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
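周小甲 et al.: "Automatic annotation of time information in Chinese medical record texts" (中文病历文本中时间信息自动标注), 《中国生物医学工程学报》 *
左亚尧 et al.: "Rule-based recognition and normalization of Chinese temporal expressions" (基于规则的中文时间表达式识别与规范化), 《广东工业大学学报》 *
徐永东 et al.: "Acquisition and semantic computation of time information in Chinese text" (中文文本时间信息获取及语义计算), 《哈尔滨工业大学学报》 *
王伟 et al.: "C-TERN: a CFSA-based algorithm for processing time information in military news text" (C-TERN:一种基于CFSA的军事新闻文本时间信息处理算法), 《北京大学学报(自然科学版)》 *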

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107622045A (en) * 2017-08-09 2018-01-23 联动优势科技有限公司 A kind of information processing method and equipment
CN107622045B (en) * 2017-08-09 2021-02-23 联动优势科技有限公司 Information processing method and device
CN107729314B (en) * 2017-09-29 2021-10-26 东软集团股份有限公司 Chinese time identification method and device, storage medium and program product
CN107729314A (en) * 2017-09-29 2018-02-23 东软集团股份有限公司 A kind of Chinese time recognition methods, device and storage medium, program product
CN109586830A (en) * 2018-11-22 2019-04-05 中电科技扬州宝军电子有限公司 A kind of high-speed rail sync identification method and device based on Beidou Navigation System and PTP
CN109871242A (en) * 2019-02-01 2019-06-11 天津字节跳动科技有限公司 Task rebuilding method and device
CN110688398A (en) * 2019-08-21 2020-01-14 西藏自治区藏医院(西藏自治区藏医药研究院) Method and system for demonstrating Tibetan astronomical calendar
CN110688398B (en) * 2019-08-21 2023-10-13 西藏自治区藏医院(西藏自治区藏医药研究院) Demonstration method and system for Tibetan calendar astronomical calendar
CN111177418A (en) * 2019-12-25 2020-05-19 深圳市优必选科技股份有限公司 Method and device for acquiring time text and storage medium
CN111222324A (en) * 2019-12-27 2020-06-02 南京医睿科技有限公司 Time identification method and device, computer readable storage medium and electronic equipment
CN111639491A (en) * 2020-05-18 2020-09-08 华青融天(北京)软件股份有限公司 Time data extraction method and device and electronic equipment
CN113297826A (en) * 2020-06-28 2021-08-24 上海交通大学 Method for marking on natural language text
CN113297826B (en) * 2020-06-28 2022-06-10 上海交通大学 Method for marking on natural language text
CN115878924A (en) * 2021-09-27 2023-03-31 小沃科技有限公司 Data processing method, device, medium and electronic equipment based on double dictionary trees
CN115878924B (en) * 2021-09-27 2024-03-12 小沃科技有限公司 Data processing method, device, medium and electronic equipment based on double dictionary trees

Also Published As

Publication number Publication date
CN104951508B (en) 2017-11-21

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant