EA202092230A1 - METHOD FOR OBTAINING LOW-SIZE NUMERICAL REPRESENTATIONS OF EVENT SEQUENCES

METHOD FOR OBTAINING LOW-SIZE NUMERICAL REPRESENTATIONS OF EVENT SEQUENCES

Info

Publication number
EA202092230A1
Authority
EA
Eurasian Patent Office
Prior art keywords
events
sequences
transactional
representations
variables
Prior art date
Application number
EA202092230A
Other languages
Russian (ru)
Inventor
Дмитрий Леонидович БАБАЕВ
Никита Павлович Овсов
Иван Александрович Киреев
Original Assignee
Публичное Акционерное Общество "Сбербанк России" (Пао Сбербанк)
Priority date
2020-02-14
Filing date
2020-10-20
Publication date
2021-08-31
Application filed by Публичное Акционерное Общество "Сбербанк России" (Пао Сбербанк)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/04 Payment circuits

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • Neurology (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to information technology, in particular to a method for obtaining low-dimensional numerical representations of event sequences. The technical result is more efficient feature generation for machine learning models, achieved by forming low-dimensional numerical representations of event sequences.

The technical result is achieved by a computer-implemented method for obtaining low-dimensional numerical representations of event sequences. The method receives a set of input data characterizing events that are aggregated into a sequence and associated with at least one information entity; the data contain a set of attributes comprising categorical variables, numerical variables, and a timestamp. The input data are preprocessed to form positive pairs of subsequences of transactional events, where both subsequences belong to the sequence of transactional events of a single information entity, and negative pairs of subsequences of transactional events, where the subsequences belong to the sequences of transactional events of different information entities.

A transactional event encoder forms a vector representation of each transactional event from the set of attributes. The encoder contains a primary set of parameters and performs the following steps: it encodes the categorical variables as vector representations; normalizes the numerical variables; processes the timestamps to build a time-ordered sequence of transactional events; concatenates the vector representations of the categorical variables with the normalized numerical variables; and forms a single numerical vector for each transactional event from the result of the concatenation. A subsequence encoder then forms a vector representation of each subsequence of transactional events, which is subsequently used to form low-dimensional numerical representations of the event sequences associated with a single information entity.
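The event-encoding and pair-forming steps named in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the toy data and entity names, the embedding size, the random vectors standing in for the encoder's trainable parameters, the use of contiguous slices as subsequences, and mean pooling as a stand-in for the subsequence encoder, whose architecture the abstract does not specify.

```python
import random
from statistics import mean, pstdev

# Hypothetical toy data: entity id -> events as (category, amount, timestamp).
events_by_entity = {
    "client_a": [("grocery", 12.5, 3), ("fuel", 40.0, 1), ("grocery", 8.0, 2), ("cafe", 5.5, 4)],
    "client_b": [("fuel", 55.0, 2), ("travel", 300.0, 1), ("cafe", 4.0, 3), ("travel", 120.0, 4)],
}
EMBEDDING_DIM = 3  # assumed size of each categorical embedding


def build_embeddings(data, dim, rng):
    """Give every category a random vector (stand-in for trainable parameters)."""
    categories = sorted({event[0] for seq in data.values() for event in seq})
    return {c: [rng.uniform(-1, 1) for _ in range(dim)] for c in categories}


def encode_sequence(events, embeddings, amount_mean, amount_std):
    """Order events by timestamp, then concatenate embedding and normalised amount."""
    ordered = sorted(events, key=lambda event: event[2])
    return [embeddings[cat] + [(amount - amount_mean) / amount_std]
            for cat, amount, _ in ordered]


def sample_subsequence(encoded, length, rng):
    """Take a random contiguous slice (an assumed sampling scheme)."""
    start = rng.randrange(len(encoded) - length + 1)
    return encoded[start:start + length]


def make_pairs(encoded, length, rng):
    """Positive pair: two slices of one entity. Negative pair: slices of two entities."""
    first, second = sorted(encoded)[:2]
    positive = (sample_subsequence(encoded[first], length, rng),
                sample_subsequence(encoded[first], length, rng))
    negative = (sample_subsequence(encoded[first], length, rng),
                sample_subsequence(encoded[second], length, rng))
    return positive, negative


def mean_pool(subsequence):
    """Average the event vectors (placeholder for the subsequence encoder)."""
    return [sum(column) / len(subsequence) for column in zip(*subsequence)]


rng = random.Random(0)
amounts = [event[1] for seq in events_by_entity.values() for event in seq]
amount_mean, amount_std = mean(amounts), pstdev(amounts)
embeddings = build_embeddings(events_by_entity, EMBEDDING_DIM, rng)
encoded_by_entity = {
    entity: encode_sequence(seq, embeddings, amount_mean, amount_std)
    for entity, seq in events_by_entity.items()
}
positive, negative = make_pairs(encoded_by_entity, length=2, rng=rng)
positive_vectors = [mean_pool(s) for s in positive]
negative_vectors = [mean_pool(s) for s in negative]
```

In a trained system the pooled vectors of a positive pair would presumably be pushed closer together than those of a negative pair by some contrastive objective; the abstract does not name a loss function, so none is shown here.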

EA202092230A 2020-02-14 2020-10-20 METHOD FOR OBTAINING LOW-SIZE NUMERICAL REPRESENTATIONS OF EVENT SEQUENCES EA202092230A1 (en)

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
RU2020107035A | 2020-02-14 | 2020-02-14 | RU2741742C1 (en) Method for obtaining low-dimensional numeric representations of sequences of events

Publications (1)

Publication Number | Publication Date
EA202092230A1 (en) | 2021-08-31

Family

ID=74554460

Family Applications (1)

Application Number | Priority Date | Filing Date | Title
EA202092230A | 2020-02-14 | 2020-10-20 | EA202092230A1 (en) METHOD FOR OBTAINING LOW-SIZE NUMERICAL REPRESENTATIONS OF EVENT SEQUENCES

Country Status (2)

Country | Link
EA (1) | EA202092230A1 (en)
RU (1) | RU2741742C1 (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
EP0666550B1 (en) * | 1994-02-08 | 1997-05-02 | Belle Gate Investment B.V. | Data exchange system comprising portable data processing units
US8595234B2 (en) * | 2010-05-17 | 2013-11-26 | Wal-Mart Stores, Inc. | Processing data feeds
US8869255B2 (en) * | 2010-11-30 | 2014-10-21 | Forticom Group Ltd | Method and system for abstracted and randomized one-time use passwords for transactional authentication
US10270642B2 (en) * | 2012-12-05 | 2019-04-23 | Origin Wireless, Inc. | Method, apparatus, and system for object tracking and navigation
US20170228731A1 (en) * | 2016-02-09 | 2017-08-10 | Fmr Llc | Computationally Efficient Transfer Processing and Auditing Apparatuses, Methods and Systems
US20170031963A1 (en) * | 2015-07-27 | 2017-02-02 | Mastercard International Incorporated | Systems and methods for tracking data using user provided data tags

Also Published As

Publication number | Publication date
RU2741742C1 (en) | 2021-01-28

Similar Documents

Publication Publication Date Title
Satrio et al. Time series analysis and forecasting of coronavirus disease in Indonesia using ARIMA model and PROPHET
Yin et al. TaBERT: Pretraining for joint understanding of textual and tabular data
Dey et al. A hybrid meta-heuristic feature selection method using golden ratio and equilibrium optimization algorithms for speech emotion recognition
Fan et al. Isnet: Individual standardization network for speech emotion recognition
Au et al. Choreograph: Music-conditioned automatic dance choreography over a style and tempo consistent dynamic graph
CN114553482B (en) Heterogeneous log-based anomaly detection network generation method and anomaly detection method
Wang et al. ENPAR: Enhancing entity and entity pair representations for joint entity relation extraction
EA202092230A1 (en) METHOD FOR OBTAINING LOW-SIZE NUMERICAL REPRESENTATIONS OF EVENT SEQUENCES
CN116432125B (en) Code Classification Method Based on Hash Algorithm
Din et al. Learning high-dimensional evolving data streams with limited labels
CN113159441A (en) Prediction method and device for implementation condition of banking business project
de Moura et al. Extracting new metrics from version control system for the comparison of software developers
CN108416365B (en) Concurrent complete log mining method based on distance
CN113221551B (en) Fine-grained sentiment analysis method based on sequence generation
Nikolentzos et al. Enhancing graph kernels via successive embeddings
CN111125375B (en) Lineage graph summarization method based on node structure similarity and semantic proximity
Hafen et al. EDA and ML--A Perfect Pair for Large-Scale Data Analysis
Brambilla et al. Improving Topic Modeling for Textual Content with Knowledge Graph Embeddings.
Li et al. Knowledge graph question answering based on TE-BiLTM and knowledge graph embedding
Lee et al. Asking clarification questions to handle ambiguity in open-domain qa
Zhang et al. Research Review of Design Pattern Mining
Rahman et al. Multilingual Program Code Classification Using $n$-Layered Bi-LSTM Model With Optimized Hyperparameters
Wu et al. A novel method for human motion capture data segmentation
Zhou et al. RWKV-based Encoder-Decoder Model for Code Completion
Wei et al. Masked Contrastive Reconstruction for Cross-modal Medical Image-Report Retrieval