JP2019535082A - Systems and methods for language detection - Google Patents
- Publication number
- JP2019535082A (application JP2019517966A)
- Authority
- JP
- Japan
- Prior art keywords
- language
- text message
- scores
- module
- detection test
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Concepts (machine-extracted)
Concept | Type | Sections | Count |
---|---|---|---|
detection method | Methods | title, claims, abstract, description | 204 |
method | Methods | title, claims, abstract, description | 195 |
script | Methods | claims, abstract, description | 95 |
sanitization | Methods | claims, abstract, description | 59 |
testing | Methods | claims, abstract, description | 59 |
Serranidae | Species | claims, description | 26 |
processing | Methods | claims, description | 16 |
computer program | Methods | abstract, description | 12 |
support-vector machine | Methods | description | 22 |
training | Methods | description | 15 |
vector | Substances | description | 12 |
process | Effects | description | 7 |
communication | Methods | description | 6 |
function | Effects | description | 5 |
diagram | Methods | description | 4 |
translation | Methods | description | 4 |
approach | Methods | description | 3 |
interaction | Effects | description | 3 |
mixture | Substances | description | 3 |
separation method | Methods | description | 3 |
calculation algorithm | Methods | description | 2 |
decision tree | Methods | description | 2 |
dependent | Effects | description | 2 |
mapping | Methods | description | 2 |
optical | Effects | description | 2 |
propagated | Effects | description | 2 |
abnormal | Effects | description | 1 |
analytical method | Methods | description | 1 |
arrays | Methods | description | 1 |
artificial neural network | Methods | description | 1 |
augmentative | Effects | description | 1 |
benefit | Effects | description | 1 |
biological transmission | Effects | description | 1 |
distribution | Methods | description | 1 |
engineering process | Methods | description | 1 |
smoothing | Methods | description | 1 |
health | Effects | description | 1 |
inductive | Effects | description | 1 |
linear regression | Methods | description | 1 |
liquid crystal related substance | Substances | description | 1 |
progressive | Effects | description | 1 |
random forest analysis | Methods | description | 1 |
reinforcement | Effects | description | 1 |
response | Effects | description | 1 |
semiconductor | Substances | description | 1 |
sensory | Effects | description | 1 |
solid | Substances | description | 1 |
substrate | Substances | description | 1 |
transfer | Methods | description | 1 |
visual | Effects | description | 1 |
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/263—Language identification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/283,646 US10162811B2 (en) | 2014-10-17 | 2016-10-03 | Systems and methods for language detection |
US15/283,646 | 2016-10-03 | | |
PCT/US2017/054722 WO2018067440A1 (en) | 2016-10-03 | 2017-10-02 | Systems and methods for language detection |
Publications (1)
Publication Number | Publication Date |
---|---|
JP2019535082A (ja) | 2019-12-05 |
Family
ID=60162256
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2019517966A (pending; published as JP2019535082A, ja) | Systems and methods for language detection | 2016-10-03 | 2017-10-02 |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP3519984A1 (de) |
JP (1) | JP2019535082A (de) |
CN (1) | CN110023931A (de) |
AU (1) | AU2017339433A1 (de) |
CA (1) | CA3039085A1 (de) |
WO (1) | WO2018067440A1 (de) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2023511791A (ja) * | 2020-04-10 | 2023-03-22 | Canon Europa N.V. | Text classification |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113269009A (zh) * | 2020-02-14 | 2021-08-17 | Microsoft Technology Licensing, LLC | Text recognition in images |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7552045B2 (en) * | 2006-12-18 | 2009-06-23 | Nokia Corporation | Method, apparatus and computer program product for providing flexible text based language identification |
US8107671B2 (en) * | 2008-06-26 | 2012-01-31 | Microsoft Corporation | Script detection service |
US8326602B2 (en) * | 2009-06-05 | 2012-12-04 | Google Inc. | Detecting writing systems and languages |
EP3207465A1 (de) * | 2014-10-17 | 2017-08-23 | Machine Zone, Inc. | System and method for language detection |
- 2017
- 2017-10-02 JP JP2019517966A patent/JP2019535082A/ja active Pending
- 2017-10-02 EP EP17788004.4A patent/EP3519984A1/de not_active Withdrawn
- 2017-10-02 CA CA3039085A patent/CA3039085A1/en not_active Abandoned
- 2017-10-02 AU AU2017339433A patent/AU2017339433A1/en not_active Abandoned
- 2017-10-02 WO PCT/US2017/054722 patent/WO2018067440A1/en active Application Filing
- 2017-10-02 CN CN201780074219.8A patent/CN110023931A/zh active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2023511791A (ja) * | 2020-04-10 | 2023-03-22 | Canon Europa N.V. | Text classification |
JP7282989B2 (ja) | 2020-04-10 | 2023-05-29 | Canon Europa N.V. | Text classification |
Also Published As
Publication number | Publication date |
---|---|
AU2017339433A1 (en) | 2019-05-02 |
EP3519984A1 (de) | 2019-08-07 |
WO2018067440A1 (en) | 2018-04-12 |
CA3039085A1 (en) | 2018-04-12 |
CN110023931A (zh) | 2019-07-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9535896B2 (en) | | Systems and methods for language detection |
US10699073B2 (en) | | Systems and methods for language detection |
JP5475795B2 (ja) | | Custom language models |
JP5379138B2 (ja) | | Domain dictionary creation |
US8380488B1 (en) | | Identifying a property of a document |
US20190087417A1 (en) | | System and method for translating chat messages |
JP2019504413A (ja) | | Systems and methods for suggesting emoji |
JP6553180B2 (ja) | | Systems and methods for performing language detection |
CA3089001A1 (en) | | System and method for language translation |
CN109299228B (zh) | | Computer-executed text risk prediction method and device |
KR101326354B1 (ko) | | Character conversion processing device, recording medium, and method |
WO2018093926A1 (en) | | Semi-supervised training of neural networks |
KR20120042829A (ko) | | Writing system and language detection |
CN111859940A (zh) | | Keyword extraction method and device, electronic equipment, and storage medium |
Ozer et al. | | Diacritic restoration of Turkish tweets with word2vec |
US12086544B2 (en) | | Sentiment analysis |
JP2019535082A (ja) | | Systems and methods for language detection |
WO2014068293A1 (en) | | Text analysis |
JP2017151933A (ja) | | Data classification device, data classification method, and program |
JP6605997B2 (ja) | | Learning device, learning method, and program |
JP2015018372A (ja) | | Expression extraction model learning device, expression extraction model learning method, and computer program |
JP2019215876A (ja) | | Systems and methods for performing language detection |
JP2020052819A (ja) | | Information processing device, information processing method, and program |
JP2017191358A (ja) | | Place name notation determination device |