JPWO2020263675A5 - Google Patents
- Publication number
- JPWO2020263675A5 (application JP2021539860A)
- Authority
- JP
- Japan
- Prior art keywords
- character sequences
- selection
- data
- embedded
- negative
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Claims (13)
A method of generating a regular expression, the method comprising:
receiving, by a regular expression generator comprising one or more processors, a first selection including one or more positive character sequences, each of the one or more positive character sequences corresponding to a positive example to be matched by a regular expression generated by the regular expression generator;
generating, by the regular expression generator, a first regular expression that matches the positive examples;
receiving, by the regular expression generator, a second selection including one or more negative character sequences, each of the one or more negative character sequences corresponding to a negative example that should not be matched by the regular expression generated by the regular expression generator;
in response to receiving the second selection, determining a context of the one or more negative character sequences corresponding to the negative examples; and
updating the first regular expression based on the determined context of the one or more negative character sequences.
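The flow recited in this claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the generalization rule (digit runs and letter runs become character classes), the use of a negative lookbehind to encode the left-hand context of a negative example, and the names `generalize`, `first_regex`, and `update_regex` are all assumptions made for illustration.

```python
import re

# Hypothetical tokenizer: a run of ASCII digits, a run of ASCII letters,
# or any other single character.
_TOKEN = re.compile(r"(?P<digits>[0-9]+)|(?P<letters>[A-Za-z]+)|(?P<other>.)", re.S)


def generalize(example: str) -> str:
    """Turn one positive example into a coarse pattern."""
    parts = []
    for m in _TOKEN.finditer(example):
        if m.lastgroup == "digits":
            parts.append("[0-9]+")
        elif m.lastgroup == "letters":
            parts.append("[A-Za-z]+")
        else:
            parts.append(re.escape(m.group()))
    return "".join(parts)


def first_regex(positives):
    """Build a first regular expression that matches every positive example."""
    alternatives = sorted({generalize(p) for p in positives})
    return "(?:" + "|".join(alternatives) + ")"


def update_regex(regex, negative_left_contexts):
    """Reject matches that are immediately preceded by a negative example's
    left-hand context (assumption: the context is a fixed literal string).

    The word boundary stops the engine from sliding one character into the
    rejected match; it assumes matches start with a letter or digit.
    """
    guards = "".join(
        f"(?<!{re.escape(ctx)})" for ctx in negative_left_contexts if ctx
    )
    return guards + r"\b" + regex


rx = first_regex(["415-555-0100"])
updated = update_regex(rx, ["Fax: "])
```

In this sketch `rx` still matches the number in `"Fax: 415-555-0199"`, while `updated` matches the number in `"Tel: 415-555-0100"` and nothing in the fax cell.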
The method of claim 6, wherein determining the context of the one or more negative character sequences corresponding to the negative examples includes:
identifying an embedded highlight position of the second selection;
determining context from data to the left of the embedded highlight position of the second selection; and
determining context from data to the right of the embedded highlight position of the highlighted second selection.
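As a rough illustration of the left and right context described here, the following Python sketch splits a data cell around a highlighted range. The representation of the embedded highlight position as a `(start, end)` character offset pair, and the names `Context` and `highlight_context`, are assumptions and do not come from the patent text.

```python
from typing import NamedTuple


class Context(NamedTuple):
    left: str   # data to the left of the highlight
    right: str  # data to the right of the highlight


def highlight_context(cell: str, start: int, end: int) -> Context:
    """Return the text on either side of the highlighted range cell[start:end]."""
    if not (0 <= start <= end <= len(cell)):
        raise ValueError("highlight range lies outside the cell")
    return Context(left=cell[:start], right=cell[end:])


# A negative example highlighted inside a larger cell value.
cell = "Fax: 415-555-0199 (home)"
ctx = highlight_context(cell, 5, 17)
```

Here the highlighted text is `cell[5:17]`, so `ctx.left` holds the label before the number and `ctx.right` holds the trailing annotation.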
8. The method of claim 7, wherein determining the context of the one or more negative character sequences corresponding to the negative examples further comprises:
filtering character sequences, within the plurality of data cells in the data set, that correspond to the first selection including the one or more negative character sequences automatically selected based on the determined context from the data to the left of the embedded highlight position and the determined context from the data to the right of the embedded highlight position; and
removing, from the selected character sequences in the plurality of data cells in the data set, the filtered character sequences corresponding to the selected one or more negative character sequences.
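The filtering and removal steps can be sketched as follows, under stated assumptions: candidate character sequences are produced by running a regular expression over each data cell, and a candidate counts as "negative-like" when its surrounding text ends and begins with the negative example's left and right context. The helper names `matches_with_context` and `remove_negative_like` are hypothetical.

```python
import re


def matches_with_context(regex, cells):
    """Yield (cell_index, matched_text, left, right) for every match in every cell."""
    for i, cell in enumerate(cells):
        for m in re.finditer(regex, cell):
            yield i, m.group(), cell[:m.start()], cell[m.end():]


def remove_negative_like(candidates, neg_left, neg_right):
    """Split candidates into (kept, filtered).

    A candidate is filtered out when the text around it shares the negative
    example's context on both sides.
    """
    kept, filtered = [], []
    for cand in candidates:
        _, _, left, right = cand
        same_context = left.endswith(neg_left) and right.startswith(neg_right)
        (filtered if same_context else kept).append(cand)
    return kept, filtered


cells = [
    "Tel: 415-555-0100 (office)",
    "Fax: 415-555-0199 (office)",
    "Fax: 415-555-0142 (home)",
]
phone = r"[0-9]{3}-[0-9]{3}-[0-9]{4}"
kept, filtered = remove_negative_like(
    matches_with_context(phone, cells), neg_left="Fax: ", neg_right=" ("
)
```

With these sample cells, only the number labeled `Tel:` remains in `kept`; both `Fax:` numbers land in `filtered` even though only one of them would have been highlighted by the user.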
9. The method of claim 8, wherein:
determining the context from the data to the left of the embedded highlight position includes identifying a first span to the left of the embedded highlight position; and
filtering the character sequences in the plurality of data cells in the data set corresponding to the selected one or more negative character sequences further comprises identifying, in those character sequences, spans that do not match the first span to the left of the embedded highlight position.
10. The method of claim 9, wherein:
determining the context from the data to the left of the embedded highlight position further includes identifying a second span to the left of the embedded highlight position; and
filtering the character sequences in the plurality of data cells in the data set corresponding to the selected one or more negative character sequences further comprises identifying, in those character sequences, spans that do not match the second span to the left of the embedded highlight position.
The method of claim 7, wherein:
determining the context from the data to the right of the embedded highlight position includes identifying a first span to the right of the embedded highlight position; and
filtering the character sequences in the plurality of data cells in the data set corresponding to the second selection including the one or more negative character sequences further comprises identifying, in those character sequences, spans that do not match the first span to the right of the embedded highlight position.
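The span-based claims above do not define what a span is. The Python sketch below assumes, purely for illustration, that a span is a maximal run of letters, digits, or punctuation, with whitespace ignored, and that spans are counted outward from the highlight. The names `spans`, `nth_left_span`, `nth_right_span`, and `differs_from_negative` are hypothetical.

```python
import re

# Assumed span definition: a run of ASCII letters, a run of ASCII digits,
# or a run of other non-whitespace characters. Whitespace is skipped.
_SPAN = re.compile(r"[A-Za-z]+|[0-9]+|[^A-Za-z0-9\s]+")


def spans(text):
    """Split context text into spans, in reading order."""
    return _SPAN.findall(text)


def nth_left_span(left, n):
    """n-th span to the left of the highlight (1 = adjacent), or None."""
    found = spans(left)
    return found[-n] if len(found) >= n else None


def nth_right_span(right, n):
    """n-th span to the right of the highlight (1 = adjacent), or None."""
    found = spans(right)
    return found[n - 1] if len(found) >= n else None


def differs_from_negative(candidate_left, negative_left, n):
    """True when the candidate's n-th left span does not match the negative's."""
    return nth_left_span(candidate_left, n) != nth_left_span(negative_left, n)
```

Under this assumed definition, the first span to the left of the highlight in both `"Tel: "` and `"Fax: "` is the colon, so it cannot separate the two cells. The second span (`Tel` versus `Fax`) can, which is one plausible reason for comparing more than one span.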
A regular expression generator server computer, comprising:
a processor;
a memory; and
a storage medium coupled to the processor, the storage medium storing instructions executable by the processor to implement the method of any one of claims 1 to 11.
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201962865797P | 2019-06-24 | 2019-06-24 | |
US62/865,797 | 2019-06-24 | | |
US16/904,298 US11941018B2 (en) | 2018-06-13 | 2020-06-17 | Regular expression generation for negative example using context |
US16/904,298 | 2020-06-17 | | |
PCT/US2020/038431 WO2020263675A1 (en) | 2019-06-24 | 2020-06-18 | Regular expression generation for negative example using context |
Publications (2)
Publication Number | Publication Date |
---|---|
JP2022538705A (en) | 2022-09-06 |
JPWO2020263675A5 (en) | 2023-06-02 |
Family
ID=71575795
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021539860A Pending JP2022538705A (en) | 2019-06-24 | 2020-06-18 | Regular expression generation for negative examples with context |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP3987407A1 (en) |
JP (1) | JP2022538705A (en) |
CN (1) | CN113424177A (en) |
WO (1) | WO2020263675A1 (en) |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10210246B2 (en) | 2014-09-26 | 2019-02-19 | Oracle International Corporation | Techniques for similarity analysis and data enrichment using knowledge sources |
US9817875B2 (en) * | 2014-10-28 | 2017-11-14 | Conduent Business Services, Llc | Methods and systems for automated data characterization and extraction |
2020
- 2020-06-18 EP application EP20739513.8A, published as EP3987407A1: Active, Pending
- 2020-06-18 WO application PCT/US2020/038431, published as WO2020263675A1: status unknown
- 2020-06-18 CN application CN202080014445.9A, published as CN113424177A: Active, Pending
- 2020-06-18 JP application JP2021539860A, published as JP2022538705A: Active, Pending