EA202092230A1 - METHOD FOR OBTAINING LOW-SIZE NUMERICAL REPRESENTATIONS OF EVENT SEQUENCES
Info
- Publication number
- EA202092230A1
- Authority
- EA
- Eurasian Patent Office
- Prior art keywords
- events
- sequences
- transactional
- representations
- variables
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/04—Payment circuits
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Business, Economics & Management (AREA)
- Computing Systems (AREA)
- Neurology (AREA)
- General Business, Economics & Management (AREA)
- Strategic Management (AREA)
- Accounting & Taxation (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the field of information technology, in particular to a method for obtaining low-dimensional numerical representations of event sequences. The technical result is more efficient feature generation for machine learning models, achieved by forming low-dimensional numerical representations of event sequences.

The claimed technical result is achieved by a computer-implemented method for obtaining low-dimensional numerical representations of event sequences, comprising the steps of: obtaining a set of input data characterizing events that are aggregated into a sequence and associated with at least one information entity, the data containing a set of attributes that includes categorical variables, numerical variables, and a timestamp; preprocessing the set of input data, during which positive pairs of transactional event sequences are formed, these being subsequences belonging to the sequence of transactional events of a single information entity, and negative pairs of transactional event subsequences are formed, these being subsequences belonging to the sequences of transactional events of different information entities; using a transactional event encoder, forming a vector representation of each transactional event from the set of attributes, wherein the encoder contains a primary set of parameters and performs the steps of encoding the categorical variables as vector representations, normalizing the numerical variables, processing the timestamps to build a time-ordered sequence of transactional events, concatenating the resulting vector representations of the categorical variables with the normalized numerical variables, and forming a single numerical vector for one transactional event from the result of the concatenation; and, using a subsequence encoder, forming a vector representation of each subsequence of transactional events, from which the low-dimensional numerical representations of the event sequences associated with one information entity are subsequently formed.
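The steps listed in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the abstract does not say how subsequences are sampled, so contiguous random slices are assumed; the embedding tables are random stand-ins for the encoder's trainable "primary set of parameters"; mean pooling stands in for the unspecified subsequence encoder; and all names (`make_pairs`, `EventEncoder`, `encode_subsequence`, the `ts`, `mcc`, and `amount` fields) are hypothetical.

```python
import random
from itertools import combinations


def sample_subsequences(events, k, min_len, max_len, rng):
    """Sample k time-ordered slices from one entity's events.

    Assumption: slices are contiguous and of random length; the abstract
    only states that they are subsequences of the entity's sequence.
    """
    events = sorted(events, key=lambda e: e["ts"])  # order by timestamp
    slices = []
    for _ in range(k):
        length = rng.randint(min_len, min(max_len, len(events)))
        start = rng.randint(0, len(events) - length)
        slices.append(events[start:start + length])
    return slices


def make_pairs(sequences, k=2, min_len=2, max_len=4, seed=0):
    """Build positive pairs (same entity) and negative pairs (different entities)."""
    rng = random.Random(seed)
    subs = {eid: sample_subsequences(ev, k, min_len, max_len, rng)
            for eid, ev in sequences.items()}
    positives = [pair for s in subs.values() for pair in combinations(s, 2)]
    negatives = [(a, b)
                 for e1, e2 in combinations(subs, 2)
                 for a in subs[e1] for b in subs[e2]]
    return positives, negatives


class EventEncoder:
    """Turns one event into a single numeric vector."""

    def __init__(self, vocab_sizes, emb_dim, num_stats, seed=0):
        rng = random.Random(seed)
        # One embedding table per categorical attribute (random placeholder
        # values; in a trained model these would be learned parameters).
        self.tables = {
            name: [[rng.gauss(0.0, 1.0) for _ in range(emb_dim)]
                   for _ in range(size)]
            for name, size in vocab_sizes.items()
        }
        self.num_stats = num_stats  # {attribute: (mean, std)} for normalization

    def encode(self, event):
        vec = []
        for name in sorted(self.tables):      # categorical -> embedding vector
            vec.extend(self.tables[name][event[name]])
        for name in sorted(self.num_stats):   # numerical -> standardized value
            mean, std = self.num_stats[name]
            vec.append((event[name] - mean) / std)
        return vec                            # concatenation of both parts


def encode_subsequence(encoder, subsequence):
    """Mean-pool event vectors (a simple stand-in for the subsequence encoder)."""
    vectors = [encoder.encode(e) for e in subsequence]
    return [sum(column) / len(vectors) for column in zip(*vectors)]
```

With one categorical attribute embedded in 4 dimensions and one numerical attribute, each event vector (and therefore each pooled subsequence vector) has length 5. In a trained system the pairs would drive a contrastive objective over the subsequence vectors; the abstract does not specify that objective, so it is left out of the sketch.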
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
RU2020107035A RU2741742C1 (en) | 2020-02-14 | 2020-02-14 | Method for obtaining low-dimensional numeric representations of sequences of events |
Publications (1)
Publication Number | Publication Date |
---|---|
EA202092230A1 (en) | 2021-08-31 |
Family
ID=74554460
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EA202092230A EA202092230A1 (en) | 2020-02-14 | 2020-10-20 | METHOD FOR OBTAINING LOW-SIZE NUMERICAL REPRESENTATIONS OF EVENT SEQUENCES |
Country Status (2)
Country | Link |
---|---|
EA (1) | EA202092230A1 (en) |
RU (1) | RU2741742C1 (en) |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0666550B1 (en) * | 1994-02-08 | 1997-05-02 | Belle Gate Investment B.V. | Data exchange system comprising portable data processing units |
US8595234B2 (en) * | 2010-05-17 | 2013-11-26 | Wal-Mart Stores, Inc. | Processing data feeds |
US8869255B2 (en) * | 2010-11-30 | 2014-10-21 | Forticom Group Ltd | Method and system for abstracted and randomized one-time use passwords for transactional authentication |
US10270642B2 (en) * | 2012-12-05 | 2019-04-23 | Origin Wireless, Inc. | Method, apparatus, and system for object tracking and navigation |
US20170228731A1 (en) * | 2016-02-09 | 2017-08-10 | Fmr Llc | Computationally Efficient Transfer Processing and Auditing Apparatuses, Methods and Systems |
US20170031963A1 (en) * | 2015-07-27 | 2017-02-02 | Mastercard International Incorporated | Systems and methods for tracking data using user provided data tags |
-
2020
- 2020-02-14 RU RU2020107035A patent/RU2741742C1/en active
- 2020-10-20 EA EA202092230A patent/EA202092230A1/en unknown
Also Published As
Publication number | Publication date |
---|---|
RU2741742C1 (en) | 2021-01-28 |