CN109190119B - Time extraction method and device, storage medium and electronic device - Google Patents


Info

Publication number
CN109190119B
CN109190119B (application CN201810960868.4A)
Authority
CN
China
Prior art keywords
time
segmentation
instruction
word
subset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810960868.4A
Other languages
Chinese (zh)
Other versions
CN109190119A (en)
Inventor
包恒耀
赵学敏
苏可
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201810960868.4A priority Critical patent/CN109190119B/en
Publication of CN109190119A publication Critical patent/CN109190119A/en
Application granted granted Critical
Publication of CN109190119B publication Critical patent/CN109190119B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/279 - Recognition of textual entities
    • G06F 40/289 - Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a time extraction method and apparatus, a storage medium, and an electronic device. The method includes: acquiring an instruction text matching an input query instruction; performing word segmentation and tagging on the instruction text to obtain an instruction word segmentation set, where each instruction word segment in the set is assigned a part-of-speech tag; determining a time segmentation subset from the instruction word segmentation set according to the part-of-speech tags; matching the time segmentation subset against pre-configured time type templates to obtain the time type matching the time segments contained in the subset; and converting the time segments in the subset according to the time type to obtain a machine-recognizable target time field extracted from the instruction text. The invention solves the technical problem of low time-extraction accuracy in the related art.

Description

Time extraction method and device, storage medium and electronic device
Technical Field
The invention relates to the field of computers, in particular to a time extraction method and device, a storage medium and an electronic device.
Background
Instructions that a user inputs to a hardware device often carry time information, such as time strings representing the year, month, day, hour, minute, and second. To allow the hardware device to perform the corresponding machine processing on this information, the time string usually needs to be extracted from the instruction.
At present, after a hardware device obtains the instruction text corresponding to an instruction, a common extraction method is to match the instruction text against a regular expression and extract the time string it carries. However, special time strings, such as Chinese-character time strings with no literal numeric value, often appear in instruction text; the time in such strings can only be recovered through further rule-based computation before it can be converted into a standard time format. That is, the time extraction method adopted in the related art suffers from low extraction accuracy.
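As a minimal, hypothetical illustration of this limitation (not the patented method; the pattern and inputs are invented for illustration), the following Python sketch shows a plain regular-expression match catching only explicit digit patterns and missing a Chinese-character time string entirely:

```python
import re

# A naive extractor: the pattern only recognizes digit-based times,
# so Chinese-character time strings yield no match at all.
NAIVE_TIME = re.compile(r"\d{1,2}:\d{2}")

def naive_extract(text):
    """Return every substring the simple pattern recognizes."""
    return NAIVE_TIME.findall(text)

print(naive_extract("meeting at 8:30"))   # -> ['8:30']
print(naive_extract("八点到十点的闹钟"))  # -> []
```

This is exactly the gap that the tagging-plus-template approach below is meant to close.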
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides a time extraction method and device, a storage medium and an electronic device, and aims to at least solve the technical problem of low accuracy of time extraction in the related art.
According to an aspect of an embodiment of the present invention, there is provided a time extraction method including: acquiring an instruction text matched with an input query instruction; performing word segmentation labeling processing on the instruction text to obtain an instruction word segmentation set, wherein each instruction word in the instruction word segmentation set is respectively configured with a part-of-speech tag; determining a time word segmentation subset from the instruction word segmentation set according to the part of speech tag; matching the time segmentation word subset with a pre-configured time type template to obtain a time type matched with the time segmentation words contained in the time segmentation word subset; and converting the time segmentation words in the time segmentation word subset according to the time types to obtain a target time field extracted from the instruction text, wherein the target time field allows machine identification.
According to another aspect of the embodiments of the present invention, there is also provided a time extraction apparatus, including: the acquisition unit is used for acquiring an instruction text matched with the input query instruction; the processing unit is used for performing word segmentation labeling processing on the instruction text to obtain an instruction word segmentation set, wherein each instruction word in the instruction word segmentation set is respectively provided with a part-of-speech tag; a first determining unit, configured to determine a time segmentation subset from the instruction segmentation set according to the part-of-speech tag; the matching unit is used for matching the time segmentation word subsets with a pre-configured time type template so as to obtain time types matched with the time segmentation words contained in the time segmentation word subsets; and the converting unit is used for converting the time segmentation words in the time segmentation word subset according to the time types to obtain a target time field extracted from the instruction text, wherein the target time field allows machine identification.
As an optional example, the matching unit includes: an obtaining module, configured to obtain the positional relationship between the time segments in the time segmentation subset; a first determining module, configured to determine, using the positional relationship, the time segmentation combinations and the independent time segments contained in the subset; and a first matching module, configured to match the time segmentation combinations and the independent time segments against the time type templates to obtain the time type.
As an optional example, the first determining module includes: a merging submodule, configured to merge at least two time segmentations when positions of at least two time segmentations in the time segmentation subset are continuous positions and the at least two time segmentations belong to different time units, so as to obtain the time segmentation combination; and the first determining sub-module is used for taking the time participle at the discrete position as the independent time participle under the condition that the position of the time participle in the time participle subset is the discrete position.
As an alternative example, the first matching module includes: a first matching sub-module, configured to match the time segmentation combinations and the independent time segmentation in the time segmentation subsets with the stage time templates when the time type templates include stage time templates; a second determining sub-module, configured to determine that, when the time segmentation subsets include a phase time pair matched with the phase time template, the time types corresponding to the time segmentation subsets include a phase time type, where the phase time pair includes one of: the time segmentation combination and the time segmentation combination, the independent time segmentation and the independent time segmentation, and the time segmentation combination and the independent time segmentation.
As an optional example, the first matching module includes: a third determining sub-module, configured to determine, when a target condition is met, that the time segmentation combinations and the independent time segments are designated times, and that the time type corresponding to the time segmentation subset includes a designated time type, where the target condition includes one of: the time type templates contain neither a stage time template nor a repetition time template; the time segmentation subset contains no stage time pair matching the stage time template and no repetition time segment matching the repetition time template; or matching is performed among the time segmentation combinations and independent time segments remaining after excluding the stage time pairs matching the stage time template and the repetition time segments matching the repetition time template.
As an alternative example, the conversion unit includes: a completion module for completing the time information corresponding to the default time unit in the time participle combination and the independent time participle to obtain a complete time field; and the conversion module is used for converting the complete time field according to the time type to obtain the target time field.
As an optional example, the completion module includes: the analysis submodule is used for carrying out semantic analysis on the instruction text; a fourth determining submodule, configured to determine the time information to be completed according to a result of the semantic analysis; and the completion submodule is used for completing the time participle combination and the independent time participle by utilizing the time information to obtain the complete time field.
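The completion step above can be sketched as follows; this is a hedged illustration in which the field names and the use of the current time as the completion reference are assumptions, not the patent's implementation:

```python
from datetime import datetime

# Time units the user left out (e.g. the date for "8 o'clock") are
# filled from a reference "now" to form a complete time field.
def complete(partial, now=None):
    now = now or datetime.now()
    defaults = {"year": now.year, "month": now.month, "day": now.day,
                "hour": 0, "minute": 0, "second": 0}
    # Explicit user-supplied units override the defaults.
    return {**defaults, **partial}

print(complete({"hour": 8}, now=datetime(2018, 8, 22)))
```

In a real system the reference values would come from the semantic analysis of the instruction text rather than simply from the clock.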
As an optional example, the first determining unit includes: a second determining module, configured to determine, from the instruction word segmentation set, first instruction word segments whose part-of-speech tags indicate a time tag, take the first instruction word segments as time segments, and store them into the time segmentation subset; and a third determining module, configured to determine, from the instruction word segmentation set, second instruction word segments whose part-of-speech tags indicate a number tag, match the second instruction word segments against a basic time template, obtain time segments from the second instruction word segments that match the basic time template, and store them into the time segmentation subset.
As an optional example, the third determining module includes: a second matching sub-module, configured to match a second instruction word segment against the basic time template when the number contained in the second instruction word segment is of integer type, where the basic time template includes time unit fields matching different time units.
As an optional example, the first determining unit further includes: a second matching module, configured to, after determining a time segmentation subset from the instruction segmentation set according to the part-of-speech tag, match the time segmentation subset with a repetition time template when the basic time template includes the repetition time template; and a fourth determining module, configured to determine that the time type corresponding to the time segmentation subset includes a repetition time type when the time segmentation subset includes a repetition time segmentation matched with the repetition time template.
According to a further aspect of the embodiments of the present invention, there is also provided a storage medium having a computer program stored therein, wherein the computer program is configured to execute the above time extraction method when running.
According to another aspect of the embodiments of the present invention, there is also provided an electronic apparatus, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the time extraction method through the computer program.
In the embodiment of the invention, an instruction text matching the input query instruction is acquired; word segmentation and tagging are performed on the instruction text to obtain an instruction word segmentation set, in which each instruction word segment is assigned a part-of-speech tag; a time segmentation subset is determined from the instruction word segmentation set according to the part-of-speech tags; the subset is matched against the time type templates to obtain the time type; and the time segments are converted according to the time type to obtain the target time field. In this way, after the instruction text matching the query instruction is obtained, the tagging operation makes it possible to isolate a time segmentation subset, match it against the time type templates, and convert the time segments according to the matched time type. This improves the accuracy of extracting the target time field and solves the technical problem of low time-extraction accuracy in the related art.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a schematic diagram of an application environment of an alternative time extraction method according to an embodiment of the invention;
FIG. 2 is a schematic flow diagram of an alternative time extraction method according to an embodiment of the invention;
FIG. 3 is a schematic diagram of an alternative time extraction method according to an embodiment of the invention;
FIG. 4 is a schematic diagram of another alternative time extraction method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of yet another alternative time extraction method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of yet another alternative time extraction method according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of yet another alternative time extraction method according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of yet another alternative time extraction method according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of an alternative time extraction apparatus according to an embodiment of the present invention;
fig. 10 is a schematic structural diagram of an alternative electronic device according to an embodiment of the invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an aspect of the embodiments of the present invention, there is provided a time extraction method, which may be applied to, but not limited to, the environment shown in fig. 1 as an optional implementation manner.
The user 102 interacts with the user equipment 104, which includes a memory 106 and a processor 108. The user equipment 104 obtains a query instruction input by the user and derives the matching instruction text from it. After obtaining the instruction text, the user equipment 104 sends it over the network to the server 112, which includes a database 114, a word segmentation engine 116, a matching engine 118, and a conversion engine 120. After receiving the instruction text, the server 112 may store it in the database 114. The word segmentation engine 116 then segments and tags the instruction text to obtain a time segmentation subset; the matching engine 118 obtains the time type from the result of matching the time segmentation subset against the time type templates; the conversion engine 120 converts the time segmentation subset according to the time type to obtain the target time field; and the server 112 returns the target time field to the user equipment 104.
It should be noted that, in the related art, when the target time field is extracted from a text, the result is often inaccurate because the field usually contains special character strings and time information with no literal numeric value. In this embodiment, during extraction, word segmentation and tagging are performed on the instruction text to obtain an instruction word segmentation set, and a part-of-speech tag is assigned to each instruction word segment so that a time segmentation subset can be determined. The subset is matched against the time type templates to obtain the time type, and the time segments in it are converted according to the time type to obtain the target time field. The target time field can thus be extracted accurately and efficiently.
Optionally, the time extraction method may be applied to, but not limited to, a terminal capable of calculating data, for example, a notebook computer, a PC, a smart phone, a smart speaker, a smart home, a head-mounted device, and the like, where the network may include, but is not limited to, a wireless network or a wired network. Wherein, this wireless network includes: WIFI and other networks that enable wireless communication. Such wired networks may include, but are not limited to: wide area networks, metropolitan area networks, and local area networks. The server may include, but is not limited to, any hardware device capable of performing computations.
Optionally, as an optional implementation manner, as shown in fig. 2, the time extraction method includes:
s202, acquiring an instruction text matched with the input query instruction;
s204, performing word segmentation labeling processing on the instruction text to obtain an instruction word segmentation set, wherein each instruction word in the instruction word segmentation set is respectively configured with a part-of-speech tag;
s206, determining a time word segmentation subset from the instruction word segmentation set according to the part of speech tag;
s208, matching the time participle subset with a pre-configured time type template to obtain a time type matched with the time participles contained in the time participle subset;
s210, converting the time segmentation words in the time segmentation word subset according to the time types to obtain a target time field extracted from the instruction text, wherein the target time field allows machine identification.
Optionally, the time extraction method may be applied to, but not limited to, a scene in which time information is extracted from a text after the text is acquired, for example, extracting the occurrence time of a news item or extracting the time recorded in a memo.
Alternatively, the time extraction method may be applied to, but not limited to, an application installed on the terminal, for example, a payment application.
For example, consider applying the above time extraction method to extracting the occurrence time of a news item, in which the time of the event is recorded. After the instruction text is obtained, word segmentation and tagging are performed on it to obtain an instruction word segmentation set, and a time segmentation subset is determined from that set according to the part-of-speech tag of each instruction word segment. The time segmentation subset is matched against the time type templates to obtain the time type matching the time segments, and the time segments are converted according to the time type to obtain a machine-recognizable target time field.
In the method, the instruction text is subjected to word segmentation labeling processing to obtain an instruction word segmentation set, and each instruction word segmentation is configured with a part-of-speech tag in the instruction word segmentation set, so that a time word segmentation subset can be determined. And matching the time word subsets according to the time type template to obtain time types, and converting the time words in the time word subsets according to the time types to obtain target time fields. Therefore, the target time field can be accurately and efficiently extracted, and the efficiency of extracting the target time field is improved.
Optionally, the obtaining of the instruction text matching the input query instruction may be, but is not limited to, by the following means:
(1) An input box is displayed on the display interface of the terminal; when content input through the input box is received, the received content is used as the instruction text.
For example, an input box is displayed on the display interface of the terminal; after "an alarm clock from eight to ten tomorrow morning" is entered in the input box, that text is used as the instruction text.
(2) A picture carrying instruction text is received, character information is recognized from the picture, and the recognized characters are used as the instruction text.
For example, the terminal receives a picture carrying the words "an alarm clock from eight to ten tomorrow morning", recognizes the characters in the picture, and uses the recognized text "an alarm clock from eight to ten tomorrow morning" as the instruction text.
(3) After a selection instruction is received, the selected text is used as the instruction text.
Alternatively, the selection instruction may be received by, but is not limited to, pressing a button on the display interface of the terminal, receiving a voice instruction input by the user, and the like.
For example, a button and text content may be displayed on the display interface of the terminal. When the button is pressed, the text content selected by the user is used as the instruction text, and the subsequent time extraction process is executed.
(4) Voice input information is acquired and used as the instruction text.
For example, voice input such as "an alarm clock from eight to ten tomorrow morning" is received, converted into text information, and used as the instruction text.
This is explained below with reference to fig. 3. As shown in fig. 3, two buttons are displayed on the display interface of the terminal. One is an input button: after it is pressed, voice input is collected and converted into text for display. The other is an extraction button: after it is pressed, the collected voice input is used as the instruction text and the target time field in it is extracted. Alternatively, a selection instruction may be received when the instruction text is acquired, and the selected voice input is used as the instruction text. As shown in fig. 4, the underlined voice input in fig. 4 is the selected input; after the extraction button is detected as pressed, "an alarm clock from eight to ten tomorrow morning" is used as the instruction text.
Optionally, the word segmentation tagging processing performed on the instruction text may be, but is not limited to, splitting the obtained instruction text into a plurality of separate fields, and adding a part-of-speech tag to each field.
Optionally, adding a part-of-speech tag to each field may be, but is not limited to, determining the part of speech of each field: when the part of speech is a noun, a noun tag is added to the field; when it is a number word, a number tag is added; when it is a verb, a verb tag is added; when it is an adjective, an adjective tag is added; when it is an adverb, an adverb tag is added; and when the field is a plain character, a character tag is added.
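The tag assignment described above can be sketched as a simple lookup. The single-letter tag names below are an assumption in the style of common Chinese taggers, not the patent's actual tag set:

```python
# Illustrative mapping from detected part of speech to the tag added
# to each field (tag names are assumptions).
TAG_FOR_POS = {"noun": "n", "number": "m", "verb": "v",
               "adjective": "a", "adverb": "d", "time": "t", "character": "x"}

def add_tags(fields_with_pos):
    """fields_with_pos: list of (field, [part_of_speech, ...]) pairs;
    a field may carry several parts of speech and thus several tags."""
    return [(field, [TAG_FOR_POS[p] for p in poses])
            for field, poses in fields_with_pos]

print(add_tags([("point", ["time", "verb"]), ("8", ["number"])]))
```

Note that "point" receives two tags, matching the multi-tag case described for the running example.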
Optionally, each field may correspond to one or more part-of-speech tags.
Optionally, for the instruction text "an alarm clock from eight to ten tomorrow morning", FIG. 5 shows a possible word segmentation result, in which "point" (o'clock) carries two part-of-speech tags, time word and verb. Adding a part-of-speech tag to each field makes the fields distinguishable.
Optionally, after the instruction text is obtained, the Chinese-character numerals in it may be, but are not limited to being, converted into Arabic numerals; for example, "eight points" becomes "8 points".
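A minimal sketch of this numeral conversion, covering only the simple 0-99 forms that appear in times (a full converter would handle larger values; this is an illustration, not the patent's code):

```python
# Chinese digit characters mapped to integer values.
DIGITS = {"零": 0, "一": 1, "二": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}

def cn_to_int(s):
    """Convert a 0-99 Chinese numeral string to an int ("八" -> 8)."""
    if s == "十":
        return 10
    if "十" in s:                       # forms like 二十一 (21) or 十五 (15)
        tens, _, ones = s.partition("十")
        return (DIGITS[tens] if tens else 1) * 10 + (DIGITS[ones] if ones else 0)
    return DIGITS[s]

print(cn_to_int("八"), cn_to_int("十"), cn_to_int("二十一"))  # -> 8 10 21
```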
Optionally, determining the time segmentation subset from the instruction word segmentation set according to the part-of-speech tags may be, but is not limited to: obtaining, from the instruction word segmentation set, the instruction word segments whose part-of-speech tags indicate a number word and which contain valid numeric information, and adding them to the time segmentation subset.
Optionally, an instruction word segment may be obtained and added to the time segmentation subset when at least one of its part-of-speech tags is a number word.
For example, continuing with the instruction text "an alarm clock from eight to ten tomorrow morning": after the Chinese-numeral conversion and the word segmentation shown in fig. 5, the segments "tomorrow", "morning", "8", "point", "to", "10", "point", "alarm clock" are obtained. The instruction word segments whose part of speech is a time word are added to the time segmentation subset, and the positional relationship between the time segments is recorded.
Optionally, after the time segmentation subset is obtained, time segmentation combinations and independent time segments can be determined according to, but not limited to, the positional relationship between the time segments.
Optionally, determining the time segmentation combinations and independent time segments according to the positional relationship may be, but is not limited to: combining at least two time segments in the subset when their positions are continuous and they belong to different time units, to obtain a time segmentation combination; and, when a time segment occupies a discrete position in the subset, taking it as an independent time segment.
Alternatively, the time unit may be the time hierarchy of each time segment whose part-of-speech tag is a time word. For example, the hierarchy may be century, year, quarter, month, ten-day period, week, day, half-day, hour, minute, second, and so on; "tomorrow", for instance, has the time hierarchy of a day.
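The merge rule just described (adjacent segments of different time units are combined; isolated segments stay independent) can be sketched as follows. The unit assignments and inputs are illustrative assumptions:

```python
# Toy time-unit hierarchy for the running example (assumption).
UNIT = {"tomorrow": "day", "morning": "half-day",
        "8 point": "hour", "10 point": "hour"}

def group(segments):
    """segments: list of (position, word). Adjacent segments with
    different units are merged; discrete ones stay independent."""
    groups, current, prev_pos = [], [], None
    for pos, word in segments:
        adjacent = prev_pos is not None and pos == prev_pos + 1
        different_unit = bool(current) and UNIT[word] != UNIT[current[-1]]
        if adjacent and different_unit:
            current.append(word)
        else:
            if current:
                groups.append(current)
            current = [word]
        prev_pos = pos
    if current:
        groups.append(current)
    combinations = [g for g in groups if len(g) > 1]
    independents = [g[0] for g in groups if len(g) == 1]
    return combinations, independents

print(group([(0, "tomorrow"), (1, "morning"), (2, "8 point"), (5, "10 point")]))
```

Here "tomorrow", "morning", and "8 point" sit at consecutive positions with different units and merge into one combination, while "10 point" at a discrete position remains an independent time segment.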
Optionally, after the independent time segments and time segmentation combinations are obtained, they may be, but are not limited to being, matched against the time type templates to obtain the time type.
Optionally, the time type templates may be stored in a time type template set, in which templates can be added, deleted, queried, and modified at any time.
Optionally, when the time type template is a phase time template and the time segmentation subset includes a phase time pair matching the phase time template, it is determined that the time type corresponding to the time segmentation subset includes the phase time type, where the phase time pair is one of: a time segmentation combination paired with a time segmentation combination, an independent time segmentation paired with an independent time segmentation, and a time segmentation combination paired with an independent time segmentation.
Alternatively, the phase time template may be a template for identifying a period of time, for example "from X o'clock to Y o'clock", "from month X to month Y", and the like.
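For illustration only, such a phase time template could be written as a regular expression; the concrete pattern syntax below is an assumption, not the patent's template format.

```python
import re

# Hypothetical phase time templates: "X点到Y点" (X o'clock to Y o'clock)
# and "X月到Y月" (month X to month Y).
STAGE_TEMPLATES = [
    re.compile(r"(\d{1,2})点到(\d{1,2})点"),
    re.compile(r"(\d{1,2})月到(\d{1,2})月"),
]

def match_stage(text):
    """Return the (start, end) pair of a phase time, or None."""
    for pattern in STAGE_TEMPLATES:
        m = pattern.search(text)
        if m:
            return int(m.group(1)), int(m.group(2))
    return None
```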
Optionally, the determining the time-segmented word subset according to the part-of-speech tag includes: determining a first instruction word segmentation indicated as a time label by the part of speech label from the instruction word segmentation set; taking the first instruction word segmentation as the time word segmentation, and storing the time word segmentation into the time word segmentation subset; determining a second instruction participle indicated as a digital label by the part of speech label from the instruction participle set; matching the second instruction participle with a basic time template; and acquiring the time participle according to the second instruction participle matched with the basic time template, and storing the time participle into the time participle subset.
Alternatively, some time segmentation words may be missed after the instruction text is segmented. For example, for the instruction text "8 hours", word segmentation may yield "8" and "hour" as separate words, so that "hour" is labeled as a time word while "8" is labeled as a number word. Therefore, a plurality of basic time templates needs to be preset so that "8 hours" can be obtained without omission when this situation is encountered.
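The recovery step might, purely as a sketch, re-join an adjacent number word and time-unit word; the set of unit words below is an assumption.

```python
# Hypothetical time-unit words that a basic time template would pair
# with a preceding number ("点" = o'clock, "月" = month, "号" = day).
TIME_UNIT_WORDS = {"点", "时", "分", "月", "号", "日"}

def recover_time_participles(participles):
    """Re-join a number word with the following time-unit word."""
    recovered, i = [], 0
    while i < len(participles):
        word = participles[i]
        if word.isdigit() and i + 1 < len(participles) and participles[i + 1] in TIME_UNIT_WORDS:
            recovered.append(word + participles[i + 1])  # "8" + "点" -> "8点"
            i += 2
        else:
            recovered.append(word)
            i += 1
    return recovered
```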
Optionally, after the determining a time segmentation subset from the instruction segmentation set according to the part-of-speech tag, the method further includes: matching the time segmentation subsets with a repetition time template when the basic time template comprises the repetition time template; and under the condition that the time segmentation subsets comprise repeated time segmentation matched with the repeated time template, determining that the time types corresponding to the time segmentation subsets comprise repeated time types.
Alternatively, the repetition time type may be, but is not limited to, a time type representing a plurality of time points or time periods, for example "every weekday", "12 noon every day", and the like.
Optionally, converting the time segmentation words in the time segmentation subset according to the time type may include, but is not limited to: completing the time information corresponding to the defaulted time units in the time segmentation combinations and the independent time segmentations to obtain a complete time field; and converting the complete time field according to the time type to obtain the target time field.
Optionally, the time information corresponding to the defaulted time units to be completed may include, but is not limited to, that of defaulted small time units and defaulted large time units.
Optionally, the small time unit may be, but is not limited to, a time unit having a time hierarchy smaller than a minimum time hierarchy of the acquired time segmentation word combinations or independent time segmentation words, and the large time unit may be, but is not limited to, a time unit having a time hierarchy larger than a maximum time hierarchy of the acquired time segmentation word combinations or independent time segmentation words.
Optionally, when the small time unit and the large time unit are complemented, the method may further include, but is not limited to: and determining the precedence relationship between the time in the acquired instruction text and the current time, and completing the default time unit according to the precedence relationship.
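A minimal sketch of completing a defaulted large time unit from the precedence relation to the current time; the roll-to-next-day policy is an illustrative assumption.

```python
from datetime import datetime, timedelta

def complete_hour(hour, now):
    """Fill in the date for a bare hour: if that hour has already
    passed relative to the current time, assume the next day."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate
```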
Alternatively, a special symbol may be used, but is not limited to being used, as a separator when completing the defaulted time units and performing the conversion.
Alternatively, the special symbol may be, but is not limited to, a letter or a character.
For example, taking the instruction text "an alarm clock at eight to ten am tomorrow" as an example, after the target time field "8 o'clock to 10 o'clock" is acquired, the current time is acquired, for example June 1, 2018. The completed time is "8:00 am to 10:00 am on June 2, 2018". After the above time is obtained, it can be converted into "2018.06.02 8:00am-10:00am".
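The final formatting into a machine-readable layout can be sketched as follows; the output layout simply mirrors "2018.06.02 8:00am-10:00am" from this example and is not a format mandated by the patent.

```python
from datetime import datetime

def to_target_field(start, end):
    """Render a completed time range as 'YYYY.MM.DD h:mmam-h:mmam'."""
    clock = lambda t: t.strftime("%I:%M%p").lstrip("0").lower()
    return f"{start.strftime('%Y.%m.%d')} {clock(start)}-{clock(end)}"
```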
According to this embodiment, after the instruction text matching the query instruction is obtained, word segmentation and tagging are performed on the instruction text to obtain an instruction segmentation set, a time segmentation subset is determined from the instruction segmentation set according to the tags, the time segmentation subset is then matched with the time type template, and the time segmentation words are converted according to the time type obtained by matching to obtain the target time field. The accuracy of extracting the target time field is thereby improved.
As an alternative embodiment, the matching the time segmentation subset with a pre-configured time type template to obtain a time type matching the time segmentation included in the time segmentation subset includes:
s1, acquiring the position relation among the time participles in the time participle subset;
s2, determining the time word segmentation combination and the independent time word segmentation contained in the time word segmentation subset by using the position relation;
s3, matching the time word combination and the independent time word in the time word subset with the time type template to obtain the time type.
For example, continuing with the example in which the instruction text is "an alarm clock at eight to ten am tomorrow": after the Chinese-numeral conversion and the word segmentation shown in fig. 5, the segmentation words "tomorrow", "morning", "8", "point", "to", "10", "point" and "alarm clock" can be obtained, and the instruction segmentation words whose part of speech is a time word are added to the time segmentation subset.
According to the embodiment, the time segmentation combination and the independent time segmentation in the time segmentation subset are determined according to the position relation by obtaining the position relation among the time segmentation in the time segmentation subset, so that the time type obtaining efficiency is improved.
As an optional implementation, the determining, by using the position relationship, a time segmentation combination and an independent time segmentation included in the time segmentation subset includes:
s1, merging at least two time segments in the time segment subset when the positions of the at least two time segments are continuous positions and the at least two time segments belong to different time units, so as to obtain the time segment combination;
s2, when the position of the time segmentation word in the time segmentation word subset is a discrete position, the time segmentation word at the discrete position is used as the independent time segmentation word.
Alternatively, the time unit may be the time hierarchy of a time segmentation word whose part-of-speech tag is a time word. For example, the time hierarchy may be century, year, quarter, month, ten-day period, week, day, half day, hour, minute, second, and so on. For example, "tomorrow" has a time hierarchy of day.
For example, continuing with the example in which the instruction text is "an alarm clock at eight to ten am tomorrow": as shown in fig. 6, after the instruction text is converted and segmented, a time segmentation set is obtained that includes a plurality of time segmentation words such as "tomorrow", "morning", "8", "point", "10" and "point". The positions of the time segmentation words "tomorrow", "morning", "8" and "point" are continuous, and these words belong to different time units, so they may be merged to obtain "tomorrow at 8 am", while "10" and "point" may be merged into "10 o'clock".
According to this embodiment, whether time segmentation words are merged into a time segmentation combination or kept as independent time segmentations is decided according to positional continuity, which improves the efficiency of partitioning the time segmentation words and thus the efficiency of acquiring the independent time segmentations and the time segmentation combinations.
As an optional implementation, the matching the time segmentation combinations and the independent time segmentation in the time segmentation subset with the time type template to obtain the time types includes:
s1, matching the time participle combinations and the independent time participles in the time participle subset with the stage time template when the time type template includes the stage time template;
s2, when the time participle subset includes a phase time pair matching the phase time template, determining that the time type corresponding to the time participle subset includes a phase time type, where the phase time pair is one of: a time pair formed by two time participle combinations, a time pair formed by two independent time participles, and a time pair formed by a time participle combination and an independent time participle.
Alternatively, the phase time template may be a template for identifying a period of time, for example "from X o'clock to Y o'clock", "from month X to month Y", and the like.
For example, continuing with the example in which the instruction text is "an alarm clock at eight to ten am tomorrow", and taking a phase time template of the form "from X o'clock to Y o'clock" as an example: after the plurality of time participles "tomorrow", "morning", "8", "point", "10" and "point" in the time participle set are obtained, the time participle combination and the independent time participle are obtained and then matched with the preset phase time template, whereby the time type is obtained.
As shown in fig. 7, after "March 6 to March 8" is acquired, the time type of the period from March 6 to March 8 can be obtained by matching with the preset time templates.
According to the embodiment, under the condition that the time type template comprises the stage time template, the time word set is matched with the independent time word and the time type template, so that the time type is obtained, the target time field is further obtained according to the time type, and the accuracy of obtaining the target time field is improved.
As an alternative embodiment, the determining a time segmentation subset from the instruction segmentation set according to the part-of-speech tag includes:
s1, determining a first instruction word segmentation indicated as a time label by the part of speech label from the instruction word segmentation set; taking the first instruction word segmentation as the time word segmentation, and storing the time word segmentation into the time word segmentation subset;
s2, determining a second instruction participle with part-of-speech labels as digital labels from the instruction participle set; matching the second instruction participle with a basic time template; and acquiring the time participle according to the second instruction participle matched with the basic time template, and storing the time participle into the time participle subset.
Alternatively, the second instruction participle may be an instruction participle with part-of-speech tags as numbers.
Alternatively, some time segmentation words may be missed after the instruction text is segmented. For example, for the instruction text "8 hours", word segmentation may yield "8" and "hour" as separate words, so that "hour" is labeled as a time word while "8" is labeled as a number word. Therefore, a plurality of basic time templates needs to be preset so that "8 hours" can be obtained without omission when this situation is encountered.
According to this embodiment, the first instruction participle whose part-of-speech tag is a time tag and the second instruction participle whose part-of-speech tag is a number tag are determined in the instruction participle set, and the time participles are obtained from the first and second instruction participles, which improves the efficiency and accuracy of obtaining the time participles.
As an alternative embodiment, the matching with the basic time template using the second instruction participle includes:
s1, matching the second instruction participle with the basic time template under the condition that the number contained in the second instruction participle is an integer type, wherein the basic time template comprises time unit fields matched with different time units.
For example, continuing with the case where the instruction text is "8 hours": after the instruction text is obtained, the second instruction participle "8" is extracted from it. Since "8" is an integer, the instruction participle is compared with the basic time template so as to be matched to a time unit field.
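The integer gate itself is only a few lines; `is_time_candidate` is a hypothetical helper name, not a name from the patent.

```python
def is_time_candidate(token):
    """True when the token parses as an integer-valued number."""
    try:
        return float(token).is_integer()
    except ValueError:
        return False
```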
Through this embodiment, the second instruction participle is matched with the basic time template only when it is of integer type, since most decimal numbers represent mathematical quantities rather than times. This improves the efficiency of acquiring the time unit field for number-type participles.
As an optional implementation, after determining the time segmentation subset from the instruction segmentation set according to the part-of-speech tag, the method further includes:
s1, matching the time word segmentation subsets with the repeated time templates under the condition that the basic time templates comprise the repeated time templates;
s2, under the condition that the time participle subset comprises the repeated time participle matched with the repeated time template, determining that the time type corresponding to the time participle subset comprises a repeated time type.
Alternatively, the repetition time type may be, but is not limited to, a time type representing a plurality of time points or time periods, for example "every weekday", "12 noon every day", and the like.
For example, taking the instruction text "12 noon every day" and the repetition time template "X o'clock every day" as an example: after the instruction text is acquired and the time participle subset "every day", "noon", "12", "point" is obtained, the time participle combination is obtained and matched with "X o'clock every day", whereby the type of the time participles in the time participle subset is determined to be the repetition time type.
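A repetition time template in the spirit of "X o'clock every day" might be expressed as a regular expression; the pattern below is an illustrative assumption.

```python
import re

# Hypothetical repetition template: "每天[中午|上午|下午]X点"
# ("X o'clock [noon/am/pm] every day").
REPEAT_TEMPLATE = re.compile(r"每天(?:中午|上午|下午)?(\d{1,2})点")

def match_repeat(text):
    """Return the repeated hour, or None if no repetition matches."""
    m = REPEAT_TEMPLATE.search(text)
    return int(m.group(1)) if m else None
```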
For example, taking the instruction text "12 noon every day" as an example, as shown in fig. 8, 12 noon of each day is marked on the time axis, thereby indicating that each daily 12 noon falls within the range expressed by the instruction text.
According to the embodiment, the time word segmentation subsets are matched with the repetition time template under the condition that the repetition time template is obtained, so that the efficiency and the accuracy of obtaining the repetition time type are improved.
As an optional implementation, the matching the time segmentation combinations and the independent time segmentation in the time segmentation subset with the time type template to obtain the time types includes:
s1, determining the time participle combination and the independent time participle as a designated time when a target condition is satisfied, in which case the time type corresponding to the time participle subset includes a designated time type, where the target condition includes one of: the matching is performed with a time type template that includes neither a phase time template nor a repetition time template; the matching is performed when the time participle subset includes neither a phase time pair matching the phase time template nor a repeated time participle matching the repetition time template; and the matching is performed on the time participle combinations and independent time participles in the time participle subset other than the phase time pairs matching the phase time template and the repeated time participles matching the repetition time template.
For example, a plurality of time templates are pre-stored, and the time templates do not include the phase time template and the repetition time template. And matching the acquired time segmentation combination or the independent time segmentation with a time template so as to determine whether the time segmentation combination and the independent time segmentation are determined as the designated time.
According to the embodiment, the time segmentation combination and the independent time segmentation are determined to be the designated time under the condition that the condition is met, so that the time type matching efficiency is improved.
As an optional implementation, before the determining, according to the part-of-speech tag, a subset of time segments from the instruction segment set, the method further includes:
s1, determining target instruction participles carrying effective digital information from the instruction participle set by using the part-of-speech tags;
and S2, extracting a target number matched with the effective digital information according to the position relation among the target instruction word segmentation, wherein the target number is a number allowing machine identification.
For example, taking the instruction text "the airplane takes off at 2018-6-6 12:00:00" as an example, after the instruction text is obtained, the target instruction participle "2018-6-6 12:00:00" can be directly extracted, and the target number is obtained for the machine to read.
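Pulling out such an already machine-readable timestamp can be sketched with a pattern plus direct parsing; the regular expression is an illustrative assumption.

```python
import re
from datetime import datetime

# Hypothetical pattern for explicit timestamps like "2018-6-6 12:00:00".
TIMESTAMP = re.compile(r"\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{2}:\d{2}")

def extract_timestamp(text):
    """Extract and parse an explicit timestamp from the text."""
    m = TIMESTAMP.search(text)
    return datetime.strptime(m.group(0), "%Y-%m-%d %H:%M:%S") if m else None
```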
According to the embodiment, the target instruction word segmentation is determined from the instruction word segmentation set according to the part-of-speech tag, and the target number is acquired according to the target instruction word segmentation, so that the efficiency of acquiring the target number is improved, and further, the accuracy and the efficiency of acquiring the target time field are improved.
As an optional implementation, the converting the time segments in the time segment subset according to the time type to obtain a target time field extracted from the instruction text includes:
s1, complementing the time information corresponding to the default time unit in the time participle combination and the independent time participle to obtain a complete time field;
and S2, converting the complete time field according to the time type to obtain the target time field.
For example, taking the instruction text "an alarm clock at eight to ten am tomorrow" as an example, after the target time field "8 o'clock to 10 o'clock" is acquired, the current time is acquired, for example June 1, 2018. The completed time is "8:00 am to 10:00 am on June 2, 2018". After the above time is obtained, it can be converted into "2018.06.02 8:00am-10:00am".
According to the embodiment, the complete time field is obtained by complementing the time information corresponding to the default time unit in the time segmentation combination, so that the target time field obtained after conversion is complemented, and the time integrity of the target time field is improved.
As an optional implementation, the completing the time information corresponding to the default time unit in the time segmentation combination and the independent time segmentation to obtain a complete time field includes:
s1, performing semantic analysis on the instruction text;
s2, determining the time information to be completed according to the result of semantic analysis;
and S3, complementing the time word combination and the independent time word by using the time information to obtain the complete time field.
For example, continuing with the example in which the instruction text is "an alarm clock at eight to ten am tomorrow", and taking the current time as 2018-6-1 as an example: after the target time field "8 am to 10 am tomorrow" is acquired, the instruction text is analyzed, showing that it specifies the day and the hour. The year, month, minute and second are therefore completed, and "eight to ten am tomorrow" is completed as "8:00:00-10:00:00 on June 2, 2018".
According to the embodiment, the instruction text is subjected to semantic analysis, and the time information to be completed is determined according to the semantic analysis result, so that the complete time field is obtained, and the efficiency of obtaining the complete time field is improved.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
According to another aspect of the embodiment of the present invention, there is also provided a time extraction apparatus for implementing the above time extraction method. As shown in fig. 9, the apparatus includes:
(1) an obtaining unit 902, configured to obtain an instruction text matching an input query instruction;
(2) a processing unit 904, configured to perform word segmentation tagging processing on the instruction text to obtain an instruction word segmentation set, where each instruction word in the instruction word segmentation set is configured with a part-of-speech tag;
(3) a first determining unit 906, configured to determine a time segmentation subset from the instruction segmentation set according to the part of speech tag;
(4) a matching unit 908, configured to match the time segmentation subsets with a preconfigured time type template, so as to obtain time types matched with the time segmentation included in the time segmentation subsets;
(5) a converting unit 910, configured to convert the time segmentation words in the time segmentation word subset according to the time type to obtain a target time field extracted from the instruction text, where the target time field allows machine identification.
Optionally, the time extraction method may be applied to, but not limited to, a scene in which time information is extracted from a text after the text is acquired. For example, during the process of extracting the news occurrence time, or during the process of extracting the time in the memo.
Alternatively, the time extraction method may be applied to, but is not limited to, an application installed on a terminal, for example a payment application such as PayPal.
For example, the following description takes the case where the time extraction method is applied to extracting the occurrence time of news. A news item records the time at which the news occurred. After the instruction text is obtained, word segmentation and tagging are performed on it to obtain an instruction participle set, and a time participle subset is determined from the instruction participle set according to the part-of-speech tag of each instruction participle. The time participle subset is matched with the time type template to obtain the time types matching the time participles, and the time participles are converted according to the time types to obtain a target time field for machine identification.
In the method, the instruction text is subjected to word segmentation labeling processing to obtain an instruction word segmentation set, and each instruction word segmentation is configured with a part-of-speech tag in the instruction word segmentation set, so that a time word segmentation subset can be determined. And matching the time word subsets according to the time type template to obtain time types, and converting the time words in the time word subsets according to the time types to obtain target time fields. Therefore, the target time field can be accurately and efficiently extracted, and the efficiency of extracting the target time field is improved.
Optionally, the obtaining of the instruction text matching the input query instruction may be, but is not limited to, by the following means:
(1) and displaying an input box on a display interface of the terminal, and taking the received content as the instruction text when receiving the content input by the input box.
For example, an input box is displayed on the display interface of the terminal, and after the characters "an alarm clock at eight to ten am tomorrow" are input in the input box, those characters are used as the instruction text.
(2) Receiving a picture carrying an instruction text, identifying character information from the picture, and taking the identified character information as the instruction text.
For example, the terminal receives a picture carrying the characters "an alarm clock at eight to ten am tomorrow", recognizes the picture, identifies the characters "an alarm clock at eight to ten am tomorrow", and uses the collected characters as the instruction text.
(3) And after receiving the selection instruction, taking the selected text as an instruction text.
Alternatively, the selection instruction may be received by, but is not limited to, pressing a button on the display interface of the terminal, receiving a voice instruction input by the user, and the like.
For example, a button and text content may be displayed on the display interface of the terminal. When the button is pressed, the text content selected by the user is used as the instruction text, and the subsequent time extraction process is executed.
(4) And acquiring voice input information, and taking the acquired voice input information as an instruction text.
For example, voice input information input by the user, such as "an alarm clock at eight to ten am tomorrow", is received; the obtained voice information is converted into text information and used as the instruction text.
Optionally, the word segmentation tagging processing performed on the instruction text may be, but is not limited to, splitting the obtained instruction text into a plurality of separate fields, and adding a part-of-speech tag to each field.
Optionally, adding a part-of-speech tag to each field may include, but is not limited to, determining the part of speech of each field: when the part of speech is a noun, a noun part-of-speech tag is added to the field; when the part of speech is a number word, a number-word part-of-speech tag is added; when the part of speech is a verb, a verb part-of-speech tag is added; when the part of speech is an adjective, an adjective part-of-speech tag is added; when the part of speech is an adverb, an adverb part-of-speech tag is added; and when the field is a character, a character tag is added.
Optionally, each field may correspond to one or more part-of-speech tags.
Optionally, taking the instruction text "an alarm clock at eight to ten am tomorrow" as an example, fig. 5 shows one possible word segmentation result, in which "point" carries two part-of-speech tags, time word and verb. A part-of-speech tag is added to each field so that the fields can be distinguished.
Optionally, after the instruction text is obtained, the Chinese-character numerals in the obtained instruction text may be, but are not limited to being, converted into Arabic numerals. For example, "eight o'clock" is converted into "8 o'clock", and so on.
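The Chinese-numeral conversion might, for small numbers, be sketched as follows; the patent does not specify the conversion algorithm, so this sketch handles only 0-99.

```python
CN_DIGITS = {"零": 0, "一": 1, "二": 2, "三": 3, "四": 4,
             "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}

def cn_to_int(text):
    """Convert a simple Chinese numeral (0-99), e.g. "八" -> 8."""
    if text == "十":
        return 10
    if "十" in text:
        tens, _, ones = text.partition("十")
        return (CN_DIGITS[tens] if tens else 1) * 10 + (CN_DIGITS[ones] if ones else 0)
    return CN_DIGITS[text]
```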
Optionally, determining the time segmentation subset from the instruction segmentation set according to the part-of-speech tags may include, but is not limited to: obtaining, from the instruction segmentation set, the instruction segmentation words whose part-of-speech tags are number words and which contain effective digital information, and adding those instruction segmentation words to the time segmentation subset.
Optionally, an instruction participle may be, but is not limited to being, obtained and added to the time participle subset when at least one of its part-of-speech tags is a number word.
For example, continuing with the example in which the instruction text is "an alarm clock at eight to ten am tomorrow": after the Chinese-numeral conversion and the word segmentation shown in fig. 5, the segmentation words "tomorrow", "morning", "8", "point", "to", "10", "point" and "alarm clock" can be obtained. The instruction segmentation words whose part of speech is a time word are added to the time segmentation subset, and the position relationship among the time segmentation words is recorded.
Optionally, after the time segmentation set is obtained, a time segmentation combination and an independent time segmentation can be determined according to, but not limited to, a position relationship between the instruction segmentation words.
Optionally, determining the time segmentation combinations and the independent time segmentations according to the position relationship between the instruction segmentation words may include, but is not limited to: merging at least two time segmentation words in the time segmentation subset into a time segmentation combination when their positions are continuous and they belong to different time units; and taking a time segmentation word at a discrete position in the time segmentation subset as an independent time segmentation word.
Alternatively, the time unit may be the time hierarchy of a time segmentation word whose part-of-speech tag is a time word. For example, the time hierarchy may be century, year, quarter, month, ten-day period, week, day, half day, hour, minute, second, and so on. For example, "tomorrow" has a time hierarchy of day.
Optionally, after the independent time segmentations and the time segmentation combinations are obtained, they may be, but are not limited to being, matched with the time type template, thereby obtaining the time type.
Optionally, the time type templates may be stored in a time type template set, and the time type templates in the time type template set may be added, deleted, checked and modified at any time.
Optionally, when the time type template is a phase time template and the time-word subset contains a phase time pair matching that template, it is determined that the time type corresponding to the time-word subset includes the phase time type, where the phase time pair is one of: a time-word combination paired with a time-word combination, an independent time word paired with an independent time word, or a time-word combination paired with an independent time word.
Alternatively, the phase time template may be a template for identifying a period of time, for example "X o'clock to Y o'clock", "month X to month Y", and the like.
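A minimal sketch of such a phase time template, assuming an English rendering of the "X o'clock to Y o'clock" pattern; the regex is an illustrative assumption, not the patent's actual template format.

```python
import re

# Hypothetical phase (period) time template: "X o'clock to Y o'clock".
PHASE_TEMPLATE = re.compile(r"(\d{1,2}) o'clock to (\d{1,2}) o'clock")

def match_phase(text):
    """Return (start, end) hours when the text contains a phase time pair."""
    m = PHASE_TEMPLATE.search(text)
    return (int(m.group(1)), int(m.group(2))) if m else None
```

For instance, "8 o'clock to 10 o'clock tomorrow" would match and yield the pair (8, 10), while text without a period expression would not match.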
Optionally, determining the time-word subset according to the part-of-speech tags includes: determining, from the instruction word set, first instruction words whose part-of-speech tag is a time tag, taking each first instruction word as a time word and storing it in the time-word subset; determining, from the instruction word set, second instruction words whose part-of-speech tag is a number tag; matching each second instruction word against a basic time template; and obtaining time words from the second instruction words that match the basic time template and storing them in the time-word subset.
Alternatively, a second instruction word may be an instruction word whose part-of-speech tag is a number.
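The two-branch selection above (time-tagged words enter the subset directly; number-tagged words enter only via the basic time template) can be sketched as follows; the tag names, unit words, and template rule are assumptions for illustration.

```python
# Hypothetical sketch: "t" marks a time tag, "m" a number tag. A number
# word enters the subset only if it is an integer immediately followed by
# a time-unit word (a stand-in for the basic time template).
UNIT_WORDS = {"o'clock", "point", "minute", "second"}

def time_word_subset(tagged):
    """tagged: list of (word, pos_tag). Returns list of (index, word)."""
    subset = []
    for i, (word, tag) in enumerate(tagged):
        if tag == "t":                       # first kind: time-tagged word
            subset.append((i, word))
        elif tag == "m" and word.isdigit():  # second kind: integer number...
            nxt = tagged[i + 1][0] if i + 1 < len(tagged) else ""
            if nxt in UNIT_WORDS:            # ...matching the basic template
                subset.append((i, word))
    return subset

tagged = [("tomorrow", "t"), ("morning", "t"), ("8", "m"), ("o'clock", "q"),
          ("to", "p"), ("10", "m"), ("o'clock", "q"), ("alarm", "n")]
```

Under these assumptions the subset keeps "tomorrow", "morning", "8" and "10" while discarding the connective "to" and the noun "alarm".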
Optionally, after the time-word subset is determined from the instruction word set according to the part-of-speech tags, the method further includes: matching the time-word subset against a repetition time template when the basic time template includes the repetition time template; and when the time-word subset contains a repeated time word matching the repetition time template, determining that the time type corresponding to the time-word subset includes the repetition time type.
Alternatively, the repetition time type may be, but is not limited to, a time type representing multiple time points or time periods, for example "every weekday", "12 noon every day", and the like.
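A minimal sketch of repetition-time matching, assuming English phrasings; the patterns are illustrative assumptions rather than the patent's template set.

```python
import re

# Hypothetical repetition time template: phrases such as "every day",
# "every weekday", "every Monday" indicate a repeating time.
REPEAT_TEMPLATE = re.compile(
    r"\bevery (day|weekday|week|month|year|monday|noon)\b", re.IGNORECASE)

def is_repetition(text):
    """True when the text contains a repeated time word."""
    return REPEAT_TEMPLATE.search(text) is not None
```

"wake me every weekday" would be classified as the repetition time type; "tomorrow morning at 8" would not.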
Optionally, converting the time words in the time-word subset according to the time type may include, but is not limited to: completing the time information corresponding to the time units omitted from the time-word combinations and independent time words to obtain a complete time field; and converting the complete time field according to the time type to obtain the target time field.
Optionally, completing the time information corresponding to the omitted time units may include, but is not limited to, completing omitted small time units and completing omitted large time units.
Optionally, a small time unit may be, but is not limited to, a time unit whose time hierarchy is smaller than the minimum time hierarchy present in the acquired time-word combinations or independent time words, and a large time unit may be, but is not limited to, a time unit whose time hierarchy is larger than the maximum time hierarchy present in the acquired time-word combinations or independent time words.
Optionally, when completing the small and large time units, the method may further include, but is not limited to: determining the order relationship between the time in the acquired instruction text and the current time, and completing the omitted time units according to that relationship.
Alternatively, a special symbol may be used, but is not limited to being used, as a separator when completing the omitted time units and performing the conversion.
Alternatively, the special symbol may be, but is not limited to, a letter or another character.
For example, taking the instruction text "set an alarm clock from eight to ten tomorrow morning": after the time field "8 o'clock to 10 o'clock" is obtained, the current time is acquired, for example June 1, 2018. The completed time is "8:00 am to 10:00 am on June 2, 2018". After completion, this time can be converted to "2018.06.02 8:00am-10:00am".
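The completion and conversion in this example can be sketched as follows; the day-offset input and the output format are illustrative assumptions standing in for the patent's completion rules.

```python
from datetime import date, timedelta

def complete_and_convert(current, day_offset, start_hour, end_hour):
    """Complete omitted large units (year/month/day) from the current date
    and omitted small units (minutes) with zero, then format the field.
    The output format '%Y.%m.%d H:00am-H:00am' is an illustrative choice."""
    target = current + timedelta(days=day_offset)
    return "{} {}:00am-{}:00am".format(target.strftime("%Y.%m.%d"),
                                       start_hour, end_hour)

# "tomorrow 8 to 10", with the current date assumed to be June 1, 2018
field = complete_and_convert(date(2018, 6, 1), 1, 8, 10)
```

With these inputs, `field` is the target time field "2018.06.02 8:00am-10:00am" from the worked example.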
In this embodiment, after the instruction text matching the query instruction is obtained, word segmentation and tagging are performed on the instruction text to obtain an instruction word set; a time-word subset is determined from the instruction word set according to the tags; the time-word subset is then matched against the time type templates; and the time words are converted according to the matched time type to obtain the target time field. This improves the accuracy of extracting the target time field.
As an alternative embodiment, the matching unit comprises:
(1) an obtaining module, configured to obtain the positional relationship among the time words in the time-word subset;
(2) a first determining module, configured to determine, using the positional relationship, the time-word combinations and independent time words contained in the time-word subset;
(3) a first matching module, configured to match the time-word combinations and independent time words in the time-word subset against the time type templates to obtain the time type.
For example, continue with the instruction text "set an alarm clock from eight to ten tomorrow morning". After Chinese-numeral conversion and word segmentation as shown in fig. 5, the words "tomorrow", "morning", "8", "o'clock", "to", "10", "o'clock" and "alarm clock" are obtained, and the instruction words whose part of speech is a time word are added to the time-word subset.
In this embodiment, the positional relationship among the time words in the time-word subset is obtained, and the time-word combinations and independent time words in the subset are determined from that relationship, which improves the efficiency of obtaining the time type.
As an alternative implementation, the first determining module includes:
(1) a merging submodule, configured to merge at least two time words in the time-word subset when their positions are consecutive and they belong to different time units, so as to obtain a time-word combination;
(2) a first determining submodule, configured to take a time word at a discrete position in the time-word subset as an independent time word.
Alternatively, the time unit may be the time hierarchy of each word whose part-of-speech tag is a time word. For example, the time hierarchy may be century, year, quarter, month, ten-day period, week, day, half-day, hour, minute, second, and so on. For example, "tomorrow" has a time hierarchy of day.
In this embodiment, whether time words are merged into a time-word combination or treated as independent time words is decided according to positional continuity, which improves the efficiency of partitioning the time words and thus the efficiency of obtaining the independent time words and time-word combinations.
As an alternative embodiment, the first matching module comprises:
(1) a first matching submodule, configured to match the time-word combinations and independent time words in the time-word subset against the phase time template when the time type templates include the phase time template;
(2) a second determining submodule, configured to determine that the time type corresponding to the time-word subset includes the phase time type when the time-word subset contains a phase time pair matching the phase time template, where the phase time pair is one of: a time-word combination paired with a time-word combination, an independent time word paired with an independent time word, or a time-word combination paired with an independent time word.
Alternatively, the phase time template may be a template for identifying a period of time, for example "X o'clock to Y o'clock", "month X to month Y", and the like.
In this embodiment, when the time type templates include the phase time template, the time-word combinations and independent time words are matched against it to obtain the time type, and the target time field is then obtained according to the time type, which improves the accuracy of obtaining the target time field.
As an alternative embodiment, the first determining unit includes:
(1) a second determining module, configured to determine, from the instruction word set, first instruction words whose part-of-speech tag is a time tag, take each first instruction word as a time word, and store it in the time-word subset;
(2) a third determining module, configured to determine, from the instruction word set, second instruction words whose part-of-speech tag is a number tag, match each second instruction word against a basic time template, obtain time words from the second instruction words matching the basic time template, and store them in the time-word subset.
Alternatively, a second instruction word may be an instruction word whose part-of-speech tag is a number.
In this embodiment, the first instruction words whose part-of-speech tag is a time tag and the second instruction words whose part-of-speech tag is a number tag are determined from the instruction word set, and the time words are obtained from them, which improves the efficiency and accuracy of obtaining the time words.
As an alternative implementation, the third determining module includes:
(1) a second matching submodule, configured to match a second instruction word against the basic time template when the number contained in the second instruction word is an integer, where the basic time template includes time unit fields matching different time units.
For example, continue with the case where the instruction text is "8 o'clock". After the instruction text is obtained, the second instruction word "8" is extracted from it. Since "8" is an integer, this instruction word is compared against the basic time template to match a time unit field.
In this embodiment, a second instruction word is matched against the basic time template only when it is of integer type, because most decimal numbers represent ordinary numeric values rather than times. This improves the efficiency of obtaining time unit fields for number-type words.
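A sketch of the integer gate described above; the unit-field table and matching rule are assumptions for illustration.

```python
# Hypothetical sketch: only integer-valued number words are matched against
# the basic time template; decimals are treated as ordinary numbers.
TIME_UNIT_FIELDS = {"o'clock": "hour", "minute": "minute", "second": "second"}

def match_basic(number_word, unit_word):
    """Return (value, unit) when the number word is an integer and the
    following word matches a time unit field; otherwise None."""
    if not number_word.isdigit():   # rejects "8.5", "-3", "eight", etc.
        return None
    unit = TIME_UNIT_FIELDS.get(unit_word)
    return (int(number_word), unit) if unit else None
```

So "8" followed by "o'clock" yields (8, "hour"), while "8.5" is rejected as a non-time number.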
As an optional implementation, the first determining unit further includes:
(1) a second matching module, configured to, after the time-word subset is determined from the instruction word set according to the part-of-speech tags, match the time-word subset against the repetition time template when the basic time template includes the repetition time template;
(2) a fourth determining module, configured to determine that the time type corresponding to the time-word subset includes the repetition time type when the time-word subset contains a repeated time word matching the repetition time template.
Alternatively, the repetition time type may be, but is not limited to, a time type representing multiple time points or time periods, for example "every weekday", "12 noon every day", and the like.
In this embodiment, the time-word subset is matched against the repetition time template when that template is available, which improves the efficiency and accuracy of obtaining the repetition time type.
As an alternative embodiment, the first matching module comprises:
(1) a third determining submodule, configured to determine, when a target condition is met, that the time-word combinations and independent time words are specified times, in which case the time type corresponding to the time-word subset includes the specified time type, where the target condition is one of: the matching is performed without a phase time template or a repetition time template; the time-word subset contains neither a phase time pair matching the phase time template nor a repeated time word matching the repetition time template; or the matching is performed on the time-word combinations and independent time words in the subset other than the phase time pairs matching the phase time template and the repeated time words matching the repetition time template.
For example, multiple time templates are pre-stored, none of which is a phase time template or a repetition time template. The acquired time-word combinations or independent time words are matched against these templates to determine whether they are specified times.
In this embodiment, the time-word combinations and independent time words are determined to be specified times when the target condition is met, which improves the efficiency of time type matching.
As an alternative embodiment, the above apparatus further comprises:
(1) a second determining unit, configured to determine, using the part-of-speech tags and before the time-word subset is determined from the instruction word set, target instruction words carrying valid numeric information from the instruction word set;
(2) an extraction unit, configured to extract, according to the positional relationship among the target instruction words, a target number matching the valid numeric information, where the target number is a machine-recognizable number.
In this embodiment, the target instruction words are determined from the instruction word set according to the part-of-speech tags, and the target number is obtained from them, which improves the efficiency of obtaining the target number and thus the accuracy and efficiency of obtaining the target time field.
As an alternative embodiment, the conversion unit comprises:
(1) a completion module, configured to complete the time information corresponding to the time units omitted from the time-word combinations and independent time words to obtain a complete time field;
(2) a conversion module, configured to convert the complete time field according to the time type to obtain the target time field.
In this embodiment, the complete time field is obtained by completing the time information corresponding to the omitted time units in the time-word combinations, so that the target time field obtained after conversion is complete, improving the temporal integrity of the target time field.
As an alternative embodiment, the completion module includes:
(1) an analysis submodule, configured to perform semantic analysis on the instruction text;
(2) a fourth determining submodule, configured to determine the time information to be completed according to the result of the semantic analysis;
(3) a completion submodule, configured to complete the time-word combinations and independent time words using that time information to obtain the complete time field.
In this embodiment, semantic analysis is performed on the instruction text and the time information to be completed is determined from the analysis result, so that the complete time field is obtained efficiently.
According to yet another aspect of the embodiments of the present invention, there is also provided an electronic device for implementing the above time extraction method, as shown in fig. 10, the electronic device includes a memory 1002 and a processor 1004, the memory 1002 stores therein a computer program, and the processor 1004 is configured to execute the steps in any one of the above method embodiments through the computer program.
Optionally, in this embodiment, the electronic device may be located in at least one of a plurality of network devices of a computer network.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
s1, acquiring an instruction text matching an input query instruction;
s2, performing word segmentation and tagging on the instruction text to obtain an instruction word set, wherein each instruction word in the set is configured with a part-of-speech tag;
s3, determining a time-word subset from the instruction word set according to the part-of-speech tags;
s4, matching the time-word subset against preconfigured time type templates to obtain the time type matching the time words contained in the subset;
s5, converting the time words in the time-word subset according to the time type to obtain a target time field extracted from the instruction text, wherein the target time field is machine-recognizable.
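The steps s1 to s5 can be sketched end to end as follows; the toy tokenizer, templates, and output format are illustrative assumptions, not the patented implementation.

```python
import re
from datetime import date, timedelta

def extract_time_field(text, today):
    """Toy s1-s5 pipeline: take the instruction text (s1), stand in for
    segmentation and POS-based selection (s2/s3), match a phase time
    template (s4), and complete and convert to a target field (s5)."""
    # s2/s3: a crude stand-in for word segmentation + time-word selection
    day_offset = 1 if "tomorrow" in text else 0
    # s4: hypothetical phase time template "X to Y"
    m = re.search(r"(\d{1,2}) to (\d{1,2})", text)
    if not m:
        return None
    start, end = int(m.group(1)), int(m.group(2))
    # s5: complete omitted year/month/day from the current date and format
    target = today + timedelta(days=day_offset)
    return "{} {}:00-{}:00".format(target.isoformat(), start, end)

field = extract_time_field("alarm clock from 8 to 10 tomorrow morning",
                           date(2018, 6, 1))
```

Under these assumptions the pipeline yields the machine-recognizable field "2018-06-02 8:00-10:00" for the running example.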
Alternatively, as those skilled in the art will understand, the structure shown in fig. 10 is only illustrative, and the electronic device may also be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 10 does not limit the structure of the electronic device. For example, the electronic device may also include more or fewer components (e.g., network interfaces, etc.) than shown in fig. 10, or have a different configuration from that shown in fig. 10.
The memory 1002 may be used to store software programs and modules, such as the program instructions/modules corresponding to the time extraction method and apparatus in the embodiments of the present invention; the processor 1004 executes various functional applications and data processing by running the software programs and modules stored in the memory 1002, thereby implementing the time extraction method described above. The memory 1002 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 1002 may further include memory located remotely from the processor 1004, which may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 1002 may be, but is not limited to being, specifically configured to store the instruction text, the target time field, and other information. As an example, as shown in fig. 10, the memory 1002 may include, but is not limited to, the obtaining unit 902, the processing unit 904, the first determining unit 906, the matching unit 908, and the converting unit 910 of the time extraction apparatus. The memory may further include, but is not limited to, other module units of the time extraction apparatus, which are not described again in this example.
Optionally, the above-mentioned transmission device 1006 is used for receiving or sending data via a network. Examples of the network may include a wired network and a wireless network. In one example, the transmission device 1006 includes a Network adapter (NIC) that can be connected to a router via a Network cable and other Network devices so as to communicate with the internet or a local area Network. In one example, the transmission device 1006 is a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
In addition, the electronic device further includes: a display 1008 for displaying the contents of the target time field, etc.; and a connection bus 1010 for connecting the respective module parts in the above-described electronic apparatus.
According to a further aspect of embodiments of the present invention, there is also provided a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above-mentioned method embodiments when executed.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, acquiring an instruction text matching an input query instruction;
s2, performing word segmentation and tagging on the instruction text to obtain an instruction word set, wherein each instruction word in the set is configured with a part-of-speech tag;
s3, determining a time-word subset from the instruction word set according to the part-of-speech tags;
s4, matching the time-word subset against preconfigured time type templates to obtain the time type matching the time words contained in the subset;
s5, converting the time words in the time-word subset according to the time type to obtain a target time field extracted from the instruction text, wherein the target time field is machine-recognizable.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, obtaining the positional relationship among the time words in the time-word subset;
s2, determining, using the positional relationship, the time-word combinations and independent time words contained in the time-word subset;
s3, matching the time-word combinations and independent time words in the time-word subset against the time type templates to obtain the time type.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, merging at least two time words in the time-word subset when their positions are consecutive and they belong to different time units, so as to obtain a time-word combination;
s2, when a time word in the time-word subset occupies a discrete position, taking the time word at that position as an independent time word.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, matching the time-word combinations and independent time words in the time-word subset against the phase time template when the time type templates include the phase time template;
s2, when the time-word subset contains a phase time pair matching the phase time template, determining that the time type corresponding to the time-word subset includes the phase time type, where the phase time pair is one of: a time-word combination paired with a time-word combination, an independent time word paired with an independent time word, or a time-word combination paired with an independent time word.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, determining, from the instruction word set, first instruction words whose part-of-speech tag is a time tag; taking each first instruction word as a time word and storing it in the time-word subset;
s2, determining, from the instruction word set, second instruction words whose part-of-speech tag is a number tag; matching each second instruction word against a basic time template; and obtaining time words from the second instruction words matching the basic time template and storing them in the time-word subset.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, matching a second instruction word against the basic time template when the number contained in the second instruction word is an integer, where the basic time template includes time unit fields matching different time units.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, matching the time-word subset against the repetition time template when the basic time template includes the repetition time template;
s2, when the time-word subset contains a repeated time word matching the repetition time template, determining that the time type corresponding to the time-word subset includes the repetition time type.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, determining, when a target condition is met, that the time-word combinations and independent time words are specified times, in which case the time type corresponding to the time-word subset includes the specified time type, where the target condition is one of: the matching is performed without a phase time template or a repetition time template; the time-word subset contains neither a phase time pair matching the phase time template nor a repeated time word matching the repetition time template; or the matching is performed on the time-word combinations and independent time words in the subset other than the phase time pairs matching the phase time template and the repeated time words matching the repetition time template.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, determining, using the part-of-speech tags, target instruction words carrying valid numeric information from the instruction word set;
s2, extracting, according to the positional relationship among the target instruction words, a target number matching the valid numeric information, where the target number is a machine-recognizable number.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, completing the time information corresponding to the time units omitted from the time-word combinations and independent time words to obtain a complete time field;
s2, converting the complete time field according to the time type to obtain the target time field.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, performing semantic analysis on the instruction text;
s2, determining the time information to be completed according to the result of the semantic analysis;
s3, completing the time-word combinations and independent time words using that time information to obtain the complete time field.
Alternatively, in this embodiment, a person skilled in the art may understand that all or part of the steps in the methods of the foregoing embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software functional unit.
The foregoing descriptions are merely preferred embodiments of the present invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the principle of the present invention, and such modifications and improvements shall also fall within the protection scope of the present invention.

Claims (17)

1. A method of time extraction, comprising:
acquiring an instruction text matched with an input query instruction;
performing word segmentation and labeling processing on the instruction text to obtain an instruction word segmentation set, wherein each instruction word in the instruction word segmentation set is configured with a part-of-speech tag;
determining a time word segmentation subset from the instruction word segmentation set according to the part-of-speech tag;
matching the time segmentation words in the time word segmentation subset with a time type template to obtain a time type;
and converting the time segmentation words in the time word segmentation subset according to the time type to obtain a target time field extracted from the instruction text, wherein the target time field allows machine identification.
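The claimed pipeline — POS-tag the instruction text, select the time tokens by tag, match them against time-type templates, and convert the matches into a machine-readable field — can be sketched as below. This is a minimal illustration in Python; the tag names, the template patterns, and the output field format are assumptions made for the example, not the patent's actual implementation.

```python
import re

def extract_time_field(tagged_tokens):
    """Sketch of the claimed pipeline: select time tokens by POS tag,
    match them against time-type templates, and convert the matches
    into a machine-readable time field."""
    # Step 1: the time-token subset, selected by part-of-speech tag
    # (the "TIME" tag name is an assumption).
    time_tokens = [word for word, tag in tagged_tokens if tag == "TIME"]
    # Step 2: match against simple time-type templates (patterns assumed).
    hour_pat = re.compile(r"^(\d{1,2})点$")      # e.g. "8点" = 8 o'clock
    day_pat = re.compile(r"^(今天|明天|后天)$")   # relative-day words
    day_offsets = {"今天": 0, "明天": 1, "后天": 2}
    # Step 3: convert matched tokens into a machine-readable field.
    field = {}
    for tok in time_tokens:
        m = hour_pat.match(tok)
        if m:
            field["hour"] = int(m.group(1))
        elif day_pat.match(tok):
            field["day_offset"] = day_offsets[tok]
    return field

tagged = [("明天", "TIME"), ("8点", "TIME"), ("提醒", "v"), ("我", "r")]
print(extract_time_field(tagged))  # {'day_offset': 1, 'hour': 8}
```

For the instruction "明天8点提醒我" ("remind me at 8 tomorrow"), the sketch yields a field that a downstream scheduler can consume directly.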
2. The method of claim 1, wherein matching the time segmentation words in the time word segmentation subset with the time type template to obtain the time type comprises:
obtaining a position relationship between the time segmentation words in the time word segmentation subset;
determining a time word segmentation combination and an independent time word segmentation contained in the time word segmentation subset by using the position relationship;
and matching the time word segmentation combination and the independent time word segmentation in the time word segmentation subset with the time type template to obtain the time type.
3. The method of claim 2, wherein determining the time word segmentation combination and the independent time word segmentation contained in the time word segmentation subset by using the position relationship comprises:
combining at least two time segmentation words in the time word segmentation subset to obtain the time word segmentation combination, in a case that positions of the at least two time segmentation words are continuous and the at least two time segmentation words belong to different time units;
and in a case that a position of a time segmentation word in the time word segmentation subset is a discrete position, taking the time segmentation word at the discrete position as the independent time word segmentation.
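The position-based grouping of claims 2 and 3 can be sketched as follows, assuming (hypothetically) that each time token arrives with its position in the token sequence and that a mapping from token to time unit is available:

```python
def group_time_tokens(positioned_tokens, unit_of):
    """Split a time-token subset into combinations (tokens at consecutive
    positions that belong to different time units) and independent tokens
    (tokens at discrete positions). Input shapes are assumptions:
    positioned_tokens is a list of (position, token) pairs sorted by
    position, and unit_of maps each token to its time unit."""
    combinations, independents = [], []
    run, prev_pos = [], None
    for pos, tok in positioned_tokens:
        contiguous = (prev_pos is not None and pos == prev_pos + 1
                      and unit_of[tok] != unit_of[run[-1]])
        if contiguous:
            run.append(tok)          # extend the current combination
        else:
            if len(run) > 1:         # flush the finished run
                combinations.append(run)
            elif run:
                independents.append(run[0])
            run = [tok]
        prev_pos = pos
    if len(run) > 1:                 # flush the final run
        combinations.append(run)
    elif run:
        independents.append(run[0])
    return combinations, independents
```

With tokens "明天" (day) and "8点" (hour) at adjacent positions and "每周" (week) farther away, the first two form a combination and "每周" stays independent.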
4. The method of claim 2, wherein matching the time word segmentation combination and the independent time word segmentation in the time word segmentation subset with the time type template to obtain the time type comprises:
matching the time word segmentation combination and the independent time word segmentation in the time word segmentation subset with a stage time template in a case that the time type template comprises the stage time template;
and in a case that the time word segmentation subset comprises a stage time pair matching the stage time template, determining that the time type corresponding to the time word segmentation subset comprises a stage time type, wherein the stage time pair comprises one of the following: two time word segmentation combinations, two independent time word segmentations, and a time word segmentation combination and an independent time word segmentation.
5. The method of claim 2, wherein matching the time word segmentation combination and the independent time word segmentation in the time word segmentation subset with the time type template to obtain the time type comprises:
determining the time word segmentation combination and the independent time word segmentation as a specified time in a case that a target condition is met, wherein the time type corresponding to the time word segmentation subset comprises a specified time type, and the target condition comprises one of the following: the time type template comprises neither a stage time template nor a repetition time template; the time word segmentation subset comprises neither a stage time pair matching the stage time template nor a repetition time segmentation word matching the repetition time template; and the time word segmentation combination and the independent time word segmentation remain after the stage time pair matching the stage time template and the repetition time segmentation word matching the repetition time template are excluded from the time word segmentation subset.
6. The method of claim 2, wherein converting the time segmentation words in the time word segmentation subset according to the time type to obtain the target time field extracted from the instruction text comprises:
completing time information corresponding to a default time unit in the time word segmentation combination and the independent time word segmentation to obtain a complete time field;
and converting the complete time field according to the time type to obtain the target time field.
7. The method of claim 6, wherein completing the time information corresponding to the default time unit in the time word segmentation combination and the independent time word segmentation to obtain the complete time field comprises:
performing semantic analysis on the instruction text;
determining the time information to be completed according to the result of the semantic analysis;
and completing the time word segmentation combination and the independent time word segmentation by using the time information to obtain the complete time field.
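The completion step of claims 6 and 7 can be illustrated with a simple default-filling policy — missing date units fall back to the current date, missing clock units to zero. This policy is an assumption for the sketch; the patent derives the missing information from semantic analysis of the instruction text.

```python
from datetime import datetime

def complete_time_field(partial, now=None):
    """Fill in the time units the user left to default: missing date units
    take the current date, missing clock units take zero. (An assumed
    completion policy for illustration only.)"""
    now = now or datetime.now()
    defaults = [("year", now.year), ("month", now.month), ("day", now.day),
                ("hour", 0), ("minute", 0)]
    return {unit: partial.get(unit, default) for unit, default in defaults}
```

Completing "8点" (hour only) on 2018-08-22 fills in that date and a zero minute field, yielding a complete, convertible time field.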
8. The method of claim 1, wherein determining the time word segmentation subset from the instruction word segmentation set according to the part-of-speech tag comprises:
determining, from the instruction word segmentation set, a first instruction word whose part-of-speech tag indicates a time tag; taking the first instruction word as a time segmentation word, and storing the time segmentation word into the time word segmentation subset;
and determining, from the instruction word segmentation set, a second instruction word whose part-of-speech tag indicates a number tag; matching the second instruction word with a basic time template; and acquiring a time segmentation word according to the second instruction word matched with the basic time template, and storing the time segmentation word into the time word segmentation subset.
9. The method of claim 8, wherein the matching the second instruction word with the basic time template comprises:
matching the second instruction word with the basic time template in a case that the number contained in the second instruction word is of an integer type, wherein the basic time template comprises time unit fields matched with different time units.
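A basic time template of the kind described in claims 8 and 9 — an integer number token followed by a time-unit field — might look like this; the unit list is illustrative, not the patent's actual template:

```python
import re

# Assumed basic time template: an integer immediately followed by one of
# several time-unit fields (year/month/day/o'clock/minute/second).
BASE_TIME = re.compile(r"^(\d+)(年|月|日|号|点|分|秒)$")

def match_base_time(token):
    """Return (value, unit) when a number token is an integer followed by
    a recognised time-unit field; return None otherwise. Because \\d+ only
    matches integer digit runs, non-integer numbers like "3.5点" fail."""
    m = BASE_TIME.match(token)
    if m is None:
        return None
    return int(m.group(1)), m.group(2)
```

So "8点" matches as (8, "点") and becomes a time segmentation word, while a non-time token like "苹果" is rejected.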
10. The method of claim 8, further comprising, after determining the time word segmentation subset from the instruction word segmentation set according to the part-of-speech tag:
matching the time word segmentation subset with a repetition time template in a case that the basic time template comprises the repetition time template;
and in a case that the time word segmentation subset comprises a repetition time segmentation word matching the repetition time template, determining that the time type corresponding to the time word segmentation subset comprises a repetition time type.
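The repetition-time matching of claim 10 reduces, in the simplest case, to a template for "每" ("every") plus a time unit; the pattern below is an assumed stand-in for the patent's repetition time template:

```python
import re

# Assumed repetition time template: "每" (every) followed by a time unit.
REPEAT_TIME = re.compile(r"^每(天|周|月|年)$")

def repetition_tokens(time_tokens):
    """Return the time tokens in the subset that match the repetition
    template; a non-empty result marks the subset's time type as a
    repetition time type."""
    return [t for t in time_tokens if REPEAT_TIME.match(t)]
```

"每天8点" ("every day at 8") then carries both a repetition time token and a clock-time token, so its time type includes the repetition type.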
11. The method of claim 1, further comprising, before the determining the time word segmentation subset from the instruction word segmentation set according to the part-of-speech tag:
determining, by using the part-of-speech tags, target instruction words carrying valid digital information from the instruction word segmentation set;
and extracting a target number matched with the valid digital information according to a position relationship among the target instruction words, wherein the target number is a number that allows machine identification.
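The number extraction of claim 11 — merging number tokens that sit at consecutive positions into one machine-readable number — can be sketched as follows; the input format and the "m" (number) tag are assumptions for the example:

```python
def extract_target_numbers(tagged_positions):
    """Merge number tokens at consecutive positions into machine-readable
    integers, e.g. a phone-style sequence read out digit by digit.
    tagged_positions is an assumed list of (position, token, tag) triples
    sorted by position."""
    digit_runs, run, prev = [], [], None
    for pos, tok, tag in tagged_positions:
        if tag == "m" and tok.isdigit():
            if prev is not None and pos == prev + 1 and run:
                run.append(tok)          # extend the consecutive digit run
            else:
                if run:                  # flush the previous run
                    digit_runs.append(run)
                run = [tok]
            prev = pos
    if run:
        digit_runs.append(run)
    return [int("".join(r)) for r in digit_runs]
```

Digits spoken one by one ("1", "0", "0" at adjacent positions) merge into 100, while digits separated by other words stay separate numbers.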
12. A time extraction apparatus, comprising:
an acquisition unit, configured to acquire an instruction text matched with an input query instruction;
a processing unit, configured to perform word segmentation and labeling processing on the instruction text to obtain an instruction word segmentation set, wherein each instruction word in the instruction word segmentation set is configured with a part-of-speech tag;
a first determining unit, configured to determine a time word segmentation subset from the instruction word segmentation set according to the part-of-speech tag;
a matching unit, configured to match the time segmentation words in the time word segmentation subset with a time type template to obtain a time type matched with the time segmentation words contained in the time word segmentation subset;
and a converting unit, configured to convert the time segmentation words in the time word segmentation subset according to the time type to obtain a target time field extracted from the instruction text, wherein the target time field allows machine identification.
13. The apparatus of claim 12, wherein the matching unit comprises:
an obtaining module, configured to obtain a position relationship between the time segmentation words in the time word segmentation subset;
a first determining module, configured to determine, by using the position relationship, a time word segmentation combination and an independent time word segmentation contained in the time word segmentation subset;
and a first matching module, configured to match the time word segmentation combination and the independent time word segmentation in the time word segmentation subset with the time type template to obtain the time type.
14. The apparatus of claim 12, wherein the first determining unit comprises:
a second determining module, configured to determine, from the instruction word segmentation set, a first instruction word whose part-of-speech tag indicates a time tag, take the first instruction word as a time segmentation word, and store the time segmentation word into the time word segmentation subset;
and a third determining module, configured to determine, from the instruction word segmentation set, a second instruction word whose part-of-speech tag indicates a number tag, match the second instruction word with a basic time template, acquire a time segmentation word according to the second instruction word matched with the basic time template, and store the time segmentation word into the time word segmentation subset.
15. The apparatus of claim 12, further comprising:
a second determining unit, configured to determine, by using the part-of-speech tags and before the time word segmentation subset is determined from the instruction word segmentation set according to the part-of-speech tag, target instruction words carrying valid digital information from the instruction word segmentation set;
and an extraction unit, configured to extract a target number matched with the valid digital information according to a position relationship among the target instruction words, wherein the target number is a number that allows machine identification.
16. A storage medium comprising a stored program, wherein the program when executed performs the method of any one of claims 1 to 11.
17. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 11 by means of the computer program.
CN201810960868.4A 2018-08-22 2018-08-22 Time extraction method and device, storage medium and electronic device Active CN109190119B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810960868.4A CN109190119B (en) 2018-08-22 2018-08-22 Time extraction method and device, storage medium and electronic device


Publications (2)

Publication Number Publication Date
CN109190119A CN109190119A (en) 2019-01-11
CN109190119B true CN109190119B (en) 2020-11-10

Family

ID=64919147


Country Status (1)

Country Link
CN (1) CN109190119B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027319A (en) * 2019-10-30 2020-04-17 平安科技(深圳)有限公司 Method and device for analyzing natural language time words and computer equipment
CN111581963B (en) * 2020-03-30 2022-09-20 深圳壹账通智能科技有限公司 Method and device for extracting time character string, computer equipment and storage medium
CN111639491B (en) * 2020-05-18 2024-05-03 华青融天(北京)软件股份有限公司 Time data extraction method and device and electronic equipment
CN112051985B (en) * 2020-07-23 2023-07-25 北京奇艺世纪科技有限公司 Event triggering method, device, electronic equipment and readable storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8302030B2 (en) * 2005-09-14 2012-10-30 Jumptap, Inc. Management of multiple advertising inventories using a monetization platform
CN101655862A (en) * 2009-08-11 2010-02-24 华天清 Method and device for searching information object
CN102831188A (en) * 2012-08-02 2012-12-19 北京百纳威尔科技有限公司 Promoting message setting method and terminal
CN105224601B (en) * 2015-08-31 2018-09-04 小米科技有限责任公司 A kind of method and apparatus of extracting time information
CN105472580B (en) * 2015-11-17 2019-08-06 小米科技有限责任公司 Processing method, device, terminal and the server of information
CN108305050B (en) * 2018-02-08 2023-04-07 贵州小爱机器人科技有限公司 Method, device, equipment and medium for extracting report information and service demand information


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant