CN108597501A - An audio-visual speech model based on residual networks and bidirectional gated recurrent units - Google Patents

An audio-visual speech model based on residual networks and bidirectional gated recurrent units

Info

Publication number
CN108597501A
Authority
CN
China
Prior art keywords
layers
audio
training
residual network
stream
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201810383059.1A
Other languages
Chinese (zh)
Inventor
夏春秋 (Xia Chunqiu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Vision Technology Co Ltd
Original Assignee
Shenzhen Vision Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Vision Technology Co Ltd filed Critical Shenzhen Vision Technology Co Ltd
Priority to CN201810383059.1A
Publication of CN108597501A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0631 Creating reference templates; Clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The present invention proposes an audio-visual speech model based on residual networks and bidirectional gated recurrent units (BGRUs). Its main components are a visual stream, an audio stream, a classification layer, and audiovisual fusion. In the visual stream and the audio stream, temporal dynamics are modeled by a 2-layer bidirectional gated recurrent unit (BGRU); the BGRU outputs of the two streams are then concatenated and fed into the classification layer, where they are fused and their temporal dynamics are modeled jointly. The final output is produced by a softmax layer, which labels each frame; the sequence label is the class with the highest average probability. The invention not only extracts features simultaneously and directly from raw pixels and audio waveforms, but also achieves word recognition on a large, publicly available in-the-wild dataset, and under strong noise it significantly improves classification accuracy compared with traditional audio-visual speech recognition models.

Description

An audio-visual speech model based on residual networks and bidirectional gated recurrent units
Technical field
The present invention relates to the field of audio-visual speech recognition, and in particular to an audio-visual speech model based on residual networks and bidirectional gated recurrent units.
Background art
With the dramatic improvement in personal-computer performance, human-computer interaction has gradually shifted from computer-centered to human-centered modes of interaction, and audio-visual speech recognition technology has developed rapidly in this context. The technology is used mainly in telephony and communication systems, where users can conveniently query and retrieve information from remote databases by voice command; it is also widely applied in devices such as interactive user terminals, voice notebooks, and self-service business kiosks, greatly reducing labor costs. In criminal investigation, audio-visual speech recognition can help establish the identity of a suspect by combining the acquired acoustic information with facial-expression information. However, traditional audio-visual speech recognition is mainly based on mel-frequency cepstral coefficient (MFCC) features and models temporal dynamics with long short-term memory (LSTM) networks, so its recognition accuracy under strong noise is low.
In the audio-visual speech model of the present invention, based on residual networks and bidirectional gated recurrent units, the temporal dynamics within the visual stream and the audio stream are each modeled by a 2-layer bidirectional gated recurrent unit (BGRU). The BGRU outputs of the two streams are then concatenated and fed into the classification layer, where they are fused and their temporal dynamics are modeled jointly. The final output comes from a softmax layer that labels each frame; the sequence label is the class with the highest average probability. The invention not only extracts features simultaneously and directly from raw pixels and audio waveforms, but also achieves word recognition on a large, publicly available in-the-wild dataset, and under strong noise it significantly improves classification accuracy compared with traditional audio-visual speech recognition models.
Summary of the invention
To address the problem of low recognition accuracy under strong noise, the purpose of the present invention is to provide an audio-visual speech model based on residual networks and bidirectional gated recurrent units. In the visual stream and the audio stream, temporal dynamics are modeled by a 2-layer bidirectional gated recurrent unit (BGRU); the BGRU outputs of the two streams are concatenated and fed into the classification layer for fusion and joint temporal modeling; the final output is produced by a softmax layer that labels each frame, and the sequence label is based on the highest average probability.
To solve the above problems, the present invention provides an audio-visual speech model based on residual networks and bidirectional gated recurrent units, whose main components include:
(1) a visual stream;
(2) an audio stream;
(3) a classification layer;
(4) audiovisual fusion.
Wherein, the visual stream is characterized in that it consists of a spatiotemporal convolution attached to a 34-layer residual network (ResNet-34), followed by a 2-layer bidirectional gated recurrent unit (BGRU). The identity-mapping version of the 34-layer network is used here. The main flow is that the residual network progressively reduces the spatial dimensions until the output of each time step becomes a one-dimensional tensor; finally, the output of the 34-layer residual network is fed into the 2-layer BGRU, each layer of which contains 1024 cells.
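By way of illustration, the following is a minimal PyTorch sketch of the visual stream as described above. It is a reconstruction under stated assumptions, not the patented implementation: the 3D front-end kernel and stride sizes, the reuse of torchvision's ResNet-34 residual stages as the trunk, and the 512-dimensional per-frame feature are all assumptions not given in the text.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet34

class VisualStream(nn.Module):
    """Spatiotemporal convolution + ResNet-34 trunk + 2-layer bidirectional GRU."""
    def __init__(self, gru_hidden=1024):
        super().__init__()
        # 3D front-end over the grayscale mouth-ROI clip: (B, 1, T, 98, 98).
        # Kernel/stride/padding values here are illustrative assumptions.
        self.frontend = nn.Sequential(
            nn.Conv3d(1, 64, kernel_size=(5, 7, 7), stride=(1, 2, 2),
                      padding=(2, 3, 3), bias=False),
            nn.BatchNorm3d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool3d((1, 3, 3), stride=(1, 2, 2), padding=(0, 1, 1)),
        )
        # ResNet-34 residual stages applied frame by frame; the stem is
        # skipped because the 3D front-end already produces 64 channels.
        r = resnet34()
        self.trunk = nn.Sequential(r.layer1, r.layer2, r.layer3, r.layer4,
                                   nn.AdaptiveAvgPool2d(1))  # 512-dim per frame

        self.gru = nn.GRU(512, gru_hidden, num_layers=2,
                          bidirectional=True, batch_first=True)

    def forward(self, clip):                    # clip: (B, 1, T, 98, 98)
        x = self.frontend(clip)                 # (B, 64, T, 25, 25)
        b, c, t, h, w = x.shape
        x = x.transpose(1, 2).reshape(b * t, c, h, w)
        x = self.trunk(x).reshape(b, t, 512)    # one 512-dim vector per frame
        out, _ = self.gru(x)                    # (B, T, 2048)
        return out
```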
Wherein, the audio stream is characterized in that it consists of an 18-layer residual network (ResNet-18) connected to a 2-layer BGRU. The 18-layer residual network uses the standard architecture, the main difference being that it uses 1D kernels rather than the 2D kernels used for image data. To extract fine spectral information, a 5 ms temporal kernel with a stride of 0.25 ms is used for the first spatiotemporal convolutional layer. To keep the frame rate identical to that of the video, the output of the residual network is average-pooled into 29 frames (windows); these audio frames are passed through the subsequent residual blocks, which are built from the default kernels of size 3 × 1 so that the deeper levels can extract longer-term speech features. The output of the 18-layer residual network is fed into the 2-layer BGRU, each layer of which contains 1024 cells.
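The audio stream can be sketched similarly. The following minimal reconstruction assumes a 16 kHz sampling rate (so the 5 ms kernel with a 0.25 ms stride becomes 80 samples with stride 4) and places the 29-window average pooling after the residual trunk; the text leaves the exact position of that pooling ambiguous.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BasicBlock1D(nn.Module):
    """A standard two-convolution residual block with 1D (size-3) kernels."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv1d(in_ch, out_ch, 3, stride, 1, bias=False)
        self.bn1 = nn.BatchNorm1d(out_ch)
        self.conv2 = nn.Conv1d(out_ch, out_ch, 3, 1, 1, bias=False)
        self.bn2 = nn.BatchNorm1d(out_ch)
        self.down = None
        if stride != 1 or in_ch != out_ch:   # projection shortcut when shapes change
            self.down = nn.Sequential(
                nn.Conv1d(in_ch, out_ch, 1, stride, bias=False),
                nn.BatchNorm1d(out_ch))

    def forward(self, x):
        identity = x if self.down is None else self.down(x)
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + identity)

class AudioStream(nn.Module):
    def __init__(self, n_frames=29, gru_hidden=1024):
        super().__init__()
        self.n_frames = n_frames
        # First layer: 5 ms kernel, 0.25 ms stride; at the assumed 16 kHz
        # sampling rate this is 80 samples with stride 4.
        self.frontend = nn.Sequential(
            nn.Conv1d(1, 64, kernel_size=80, stride=4, padding=38, bias=False),
            nn.BatchNorm1d(64), nn.ReLU(inplace=True))
        # ResNet-18-style trunk: 8 basic blocks with size-3 kernels.
        cfg = [(64, 64, 1), (64, 64, 1), (64, 128, 2), (128, 128, 1),
               (128, 256, 2), (256, 256, 1), (256, 512, 2), (512, 512, 1)]
        self.trunk = nn.Sequential(*[BasicBlock1D(i, o, s) for i, o, s in cfg])
        self.gru = nn.GRU(512, gru_hidden, num_layers=2,
                          bidirectional=True, batch_first=True)

    def forward(self, wav):                          # wav: (B, 1, samples)
        x = self.trunk(self.frontend(wav))           # (B, 512, T')
        x = F.adaptive_avg_pool1d(x, self.n_frames)  # average into 29 windows
        out, _ = self.gru(x.transpose(1, 2))         # (B, 29, 2048)
        return out
```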
Wherein, the classification layer is characterized in that it consists of a 2-layer BGRU. The BGRU outputs of the two streams are concatenated and fed into the classification layer to be fused, after which their temporal dynamics are modeled jointly. The output layer is a softmax layer that labels each frame; the sequence label is based on the highest average probability.
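A sketch of the classification layer consistent with the two stream sketches above follows; the 500-class output is an assumption, inferred from the evaluation figures given later (25,000 test samples at 50 sequences per word implies a 500-word vocabulary).

```python
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    """2-layer bidirectional GRU over the concatenated stream outputs."""
    def __init__(self, stream_dim=2048, hidden=1024, num_classes=500):
        super().__init__()
        self.gru = nn.GRU(2 * stream_dim, hidden, num_layers=2,
                          bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, visual_out, audio_out):
        # visual_out, audio_out: (B, T, 2048) from the two stream BGRUs.
        fused, _ = self.gru(torch.cat([visual_out, audio_out], dim=-1))
        return self.fc(fused)                    # per-frame logits: (B, T, C)

    def predict(self, visual_out, audio_out):
        # Softmax labels every frame; the sequence label is the class
        # with the highest probability averaged over all frames.
        probs = self.forward(visual_out, audio_out).softmax(dim=-1)
        return probs.mean(dim=1).argmax(dim=-1)  # (B,)
```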
Wherein, the audiovisual fusion is characterized in that the end-to-end audiovisual model is the first audiovisual fusion model able to extract features simultaneously and directly from raw pixels and audio waveforms, while also achieving word recognition on a large, publicly available in-the-wild dataset. Its operation comprises three steps: preprocessing, evaluation, and training.
Further, the preprocessing is characterized in that it is divided into preprocessing for video and preprocessing for audio. For video, the first step is the extraction of the mouth region of interest (ROI); because the mouth ROI has been located, a fixed 98 × 98 bounding box is used for all videos. Finally, every frame is converted to grayscale and normalized according to the overall mean and variance. For audio, each segment is z-normalized, i.e., rescaled to zero mean and unit standard deviation, which accounts for the varying degrees of loudness difference between speakers.
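A short sketch of these two preprocessing steps is given below; the inputs (`frames`, `mouth_box`, and the dataset `mean`/`std`) are assumed placeholders standing in for an actual mouth-ROI detector and precomputed dataset statistics.

```python
import numpy as np

def preprocess_video(frames, mouth_box, mean, std):
    """frames: (T, H, W, 3) uint8 clip; mouth_box: (x, y) of the fixed 98x98 ROI."""
    x0, y0 = mouth_box
    roi = frames[:, y0:y0 + 98, x0:x0 + 98, :].astype(np.float32)
    gray = roi.mean(axis=-1)       # simple channel average as a grayscale stand-in
    return (gray - mean) / std     # normalize by the overall mean and variance

def preprocess_audio(wav):
    """z-normalize a segment to zero mean and unit standard deviation."""
    wav = np.asarray(wav, dtype=np.float32)
    return (wav - wav.mean()) / (wav.std() + 1e-8)  # epsilon guards near-silence
```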
Further, the evaluation is characterized in that the video clips are divided into a training set, a validation set, and a test set. Each word has 800 to 1000 sequences in the training set and 50 sequences each in the validation and test sets; in total, the training, validation, and test sets contain 488,766, 25,000, and 25,000 samples respectively.
Further, the training is characterized in that it has two main stages: first, the audio stream or video stream is trained independently; then the combined audiovisual network is trained.
Further, the independent training of the audio stream or video stream is characterized in that it is divided into two parts, initialization and end-to-end training. Initialization has three main steps: first, a temporal convolutional back-end is used in place of the 2-layer BGRU; then the combination of residual network and convolutional back-end (with a softmax layer) is trained until the classification accuracy on the validation set has not improved for 5 epochs; finally, the convolutional back-end is removed and replaced with the BGRU back-end. For end-to-end training, once the residual network and 2-layer BGRU of each stream have been pre-trained, they are combined into one complete stream and trained end to end (with a softmax output layer). End-to-end training uses the Adam algorithm with mini-batches of 36 sequences and an initial learning rate of 0.0003, and stops after 5 epochs without improvement.
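The end-to-end stage of this single-stream schedule might look as follows. This is a sketch, not the patented procedure: `train_loader` and `validate` are assumed placeholders, `model` is a stream with a temporary softmax head attached (as the text describes), averaging per-frame logits is one simple way to obtain a sequence-level loss, and reading the text's "5 time points" as 5 epochs is an interpretation.

```python
import torch

def train_single_stream(model, train_loader, validate, lr=3e-4, patience=5):
    """Adam, mini-batches of 36 sequences, initial lr 0.0003,
    stop after `patience` epochs without a validation-accuracy gain."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    best_acc, stale = 0.0, 0
    while stale < patience:
        model.train()
        for inputs, labels in train_loader:     # loader yields 36-sequence batches
            opt.zero_grad()
            logits = model(inputs).mean(dim=1)  # average per-frame logits over time
            loss_fn(logits, labels).backward()
            opt.step()
        acc = validate(model)                   # classification accuracy on val set
        best_acc, stale = (acc, 0) if acc > best_acc else (best_acc, stale + 1)
    return model
```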
Further, the combined training of the audiovisual network is characterized in that the fusion training is divided into two parts, initialization and end-to-end training. For initialization, once each single stream has finished training, it is used to initialize the corresponding stream in the multi-stream architecture; an additional 2-layer BGRU is then placed on top of the streams to fuse the single-stream outputs. This 2-layer BGRU is first trained for 5 epochs (with a softmax output layer) while the weights of the audio and video streams are kept fixed. For end-to-end training, the entire audiovisual network is then trained jointly, using the Adam algorithm with mini-batches of 18 sequences and an initial learning rate of 0.0001, and stops after 5 epochs without improvement.
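The two-stage fusion schedule could be sketched as below, reusing the stream and fusion modules from the earlier sketches. The checkpoint file names, the data loader, and the `validate` callback are assumed placeholders.

```python
import torch

def train_fusion(visual, audio, fusion, train_loader, validate):
    loss_fn = torch.nn.CrossEntropyLoss()
    # Stage 1: initialize each stream from its single-stream checkpoint
    # (paths assumed) and train only the new fusion BGRU for 5 epochs,
    # keeping the audio- and video-stream weights fixed.
    visual.load_state_dict(torch.load("visual_stream.pt"))
    audio.load_state_dict(torch.load("audio_stream.pt"))
    for p in (*visual.parameters(), *audio.parameters()):
        p.requires_grad = False
    opt = torch.optim.Adam(fusion.parameters(), lr=1e-4)
    for _ in range(5):
        for clips, wavs, labels in train_loader:   # batches of 18 sequences
            opt.zero_grad()
            logits = fusion(visual(clips), audio(wavs)).mean(dim=1)
            loss_fn(logits, labels).backward()
            opt.step()
    # Stage 2: unfreeze the streams and train the whole network end to end
    # with Adam at lr 0.0001 until validation accuracy stalls for 5 epochs.
    for p in (*visual.parameters(), *audio.parameters()):
        p.requires_grad = True
    opt = torch.optim.Adam([*visual.parameters(), *audio.parameters(),
                            *fusion.parameters()], lr=1e-4)
    best, stale = 0.0, 0
    while stale < 5:
        for clips, wavs, labels in train_loader:
            opt.zero_grad()
            logits = fusion(visual(clips), audio(wavs)).mean(dim=1)
            loss_fn(logits, labels).backward()
            opt.step()
        acc = validate()
        best, stale = (acc, 0) if acc > best else (best, stale + 1)
```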
Description of the drawings
Fig. 1 is a system framework diagram of the audio-visual speech model based on residual networks and bidirectional gated recurrent units of the present invention.
Fig. 2 is a flow chart of the audio-visual speech model based on residual networks and bidirectional gated recurrent units of the present invention.
Fig. 3 illustrates the ROI extraction of the audio-visual speech model based on residual networks and bidirectional gated recurrent units of the present invention.
Detailed description of the embodiments
It should be noted that, where no conflict arises, the embodiments of the present application and the features in those embodiments may be combined with one another. The invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is a system framework diagram of the audio-visual speech model based on residual networks and bidirectional gated recurrent units of the present invention. The model mainly comprises the visual stream, the audio stream, the classification layer, and audiovisual fusion.
The visual stream consists of a spatiotemporal convolution attached to a 34-layer residual network (ResNet-34), followed by a 2-layer bidirectional gated recurrent unit (BGRU). The identity-mapping version of the 34-layer network is used here; its main flow is that the residual network progressively reduces the spatial dimensions until the output of each time step becomes a one-dimensional tensor. Finally, the output of the 34-layer residual network is fed into the 2-layer BGRU, each layer of which contains 1024 cells.
The audio stream consists of an 18-layer residual network (ResNet-18) connected to a 2-layer BGRU. The 18-layer residual network uses the standard architecture, the main difference being that it uses 1D kernels rather than the 2D kernels used for image data. To extract fine spectral information, a 5 ms temporal kernel with a stride of 0.25 ms is used for the first spatiotemporal convolutional layer. To keep the frame rate identical to that of the video, the output of the residual network is average-pooled into 29 frames (windows); these audio frames are passed through the subsequent residual blocks, which are built from the default kernels of size 3 × 1 so that the deeper levels can extract longer-term speech features. The output of the 18-layer residual network is fed into the 2-layer BGRU, each layer of which contains 1024 cells.
The classification layer consists of a 2-layer BGRU. The BGRU outputs of the two streams are concatenated and fed into the classification layer to be fused, after which their temporal dynamics are modeled jointly. The output layer is a softmax layer that labels each frame; the sequence label is based on the highest average probability.
For audiovisual fusion, the end-to-end audio-visual speech model is the first audiovisual fusion model able to extract features simultaneously and directly from raw pixels and audio waveforms, while also achieving word recognition on a large, publicly available in-the-wild dataset. Its operation comprises three steps: preprocessing, evaluation, and training.
Fig. 2 is a flow chart of the audio-visual speech model based on residual networks and bidirectional gated recurrent units of the present invention. The figure shows the workflow of the model: in the visual stream and the audio stream, temporal dynamics are modeled by a 2-layer bidirectional gated recurrent unit (BGRU); the BGRU outputs of the two streams are concatenated and fed into the classification layer, where they are fused and their temporal dynamics are modeled jointly; the final output comes from a softmax layer that labels each frame, and the sequence label is based on the highest average probability.
Fig. 3 illustrates the ROI extraction of the audio-visual speech model based on residual networks and bidirectional gated recurrent units of the present invention. The figure shows how the model extracts the ROI: a fixed 98 × 98 bounding box is used for all videos; finally, every frame is converted to grayscale and normalized according to the overall mean and variance.
For those skilled in the art, the present invention is not limited to the details of the above embodiments, and it can be realized in other specific forms without departing from the spirit or scope of the invention. Moreover, those skilled in the art may make various modifications and variations to the invention without departing from its spirit and scope, and such improvements and modifications should likewise be regarded as falling within the protection scope of the invention. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the invention.

Claims (10)

1. An audio-visual speech model based on residual networks and bidirectional gated recurrent units, characterized by mainly comprising a visual stream (1); an audio stream (2); a classification layer (3); and audiovisual fusion (4).
2. The visual stream (1) according to claim 1, characterized in that the visual stream consists of a spatiotemporal convolution attached to a 34-layer residual network (ResNet-34), followed by a 2-layer bidirectional gated recurrent unit (BGRU); the identity-mapping version of the 34-layer network is used here, and its main flow is that the residual network progressively reduces the spatial dimensions until the output of each time step becomes a one-dimensional tensor; finally, the output of the 34-layer residual network is fed into the 2-layer BGRU, each layer of which contains 1024 cells.
3. The audio stream (2) according to claim 1, characterized in that the audio stream consists of an 18-layer residual network (ResNet-18) connected to a 2-layer BGRU; the 18-layer residual network uses the standard architecture, the main difference being that it uses 1D kernels rather than the 2D kernels used for image data; to extract fine spectral information, a 5 ms temporal kernel with a stride of 0.25 ms is used for the first spatiotemporal convolutional layer; to keep the frame rate identical to that of the video, the output of the residual network is average-pooled into 29 frames (windows); these audio frames are passed through the subsequent residual blocks, which are built from the default kernels of size 3 × 1 so that the deeper levels can extract longer-term speech features; the output of the 18-layer residual network is fed into the 2-layer BGRU, each layer of which contains 1024 cells.
4. The classification layer (3) according to claim 1, characterized in that the classification layer consists of a 2-layer BGRU; the BGRU outputs of the two streams are concatenated and fed into the classification layer to be fused, after which their temporal dynamics are modeled jointly; the output layer is a softmax layer that labels each frame; the sequence label is based on the highest average probability.
5. The audiovisual fusion (4) according to claim 1, characterized in that the end-to-end audio-visual speech model is the first audiovisual fusion model able to extract features simultaneously and directly from raw pixels and audio waveforms, while also achieving word recognition on a large, publicly available in-the-wild dataset; its operation comprises three steps: preprocessing, evaluation, and training.
6. The preprocessing according to claim 5, characterized in that it is divided into preprocessing for video and preprocessing for audio; for video, the first step is the extraction of the mouth region of interest (ROI); because the mouth ROI has been located, a fixed 98 × 98 bounding box is used for all videos; finally, every frame is converted to grayscale and normalized according to the overall mean and variance; for audio, each segment is z-normalized, i.e., rescaled to zero mean and unit standard deviation, which accounts for the varying degrees of loudness difference between speakers.
7. The evaluation according to claim 5, characterized in that the video clips are divided into a training set, a validation set, and a test set; each word has 800 to 1000 sequences in the training set and 50 sequences each in the validation and test sets; in total, the training, validation, and test sets contain 488,766, 25,000, and 25,000 samples respectively.
8. The training according to claim 5, characterized in that it has two main stages: first, the audio stream or video stream is trained independently; then the combined audiovisual network is trained.
9. The independent training of the audio stream or video stream according to claim 8, characterized in that the training is divided into two parts, initialization and end-to-end training; initialization has three main steps: first, a temporal convolutional back-end is used in place of the 2-layer BGRU; then the combination of residual network and convolutional back-end (with a softmax layer) is trained until the classification accuracy on the validation set has not improved for 5 epochs; finally, the convolutional back-end is removed and replaced with the BGRU back-end; for end-to-end training, once the residual network and 2-layer BGRU of each stream have been pre-trained, they are combined into one complete stream and trained end to end (with a softmax output layer); end-to-end training uses the Adam algorithm with mini-batches of 36 sequences and an initial learning rate of 0.0003, and stops after 5 epochs without improvement.
10. The combined training of the audiovisual network according to claim 8, characterized in that the fusion training is divided into two parts, initialization and end-to-end training; for initialization, once each single stream has finished training, it is used to initialize the corresponding stream in the multi-stream architecture; an additional 2-layer BGRU is then placed on top of the streams to fuse the single-stream outputs; this 2-layer BGRU is first trained for 5 epochs (with a softmax output layer) while the weights of the audio and video streams are kept fixed; for end-to-end training, the entire audiovisual network is then trained jointly, using the Adam algorithm with mini-batches of 18 sequences and an initial learning rate of 0.0001, and stops after 5 epochs without improvement.
CN201810383059.1A 2018-04-26 2018-04-26 An audio-visual speech model based on residual networks and bidirectional gated recurrent units Withdrawn CN108597501A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810383059.1A CN108597501A (en) 2018-04-26 2018-04-26 An audio-visual speech model based on residual networks and bidirectional gated recurrent units

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810383059.1A CN108597501A (en) 2018-04-26 2018-04-26 An audio-visual speech model based on residual networks and bidirectional gated recurrent units

Publications (1)

Publication Number Publication Date
CN108597501A true CN108597501A (en) 2018-09-28

Family

ID=63609339

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810383059.1A Withdrawn CN108597501A (en) 2018-04-26 2018-04-26 A kind of audio-visual speech model based on residual error network and bidirectional valve controlled cycling element

Country Status (1)

Country Link
CN (1) CN108597501A (en)


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
STAVROS PETRIDIS et al.: "END-TO-END AUDIOVISUAL SPEECH RECOGNITION", published online at https://arxiv.org/abs/1802.06424v2 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109801621A (en) * 2019-03-15 2019-05-24 三峡大学 A kind of audio recognition method based on residual error gating cycle unit
CN110097541A (en) * 2019-04-22 2019-08-06 电子科技大学 A kind of image of no reference removes rain QA system
CN110097541B (en) * 2019-04-22 2023-03-28 电子科技大学 No-reference image rain removal quality evaluation system
CN110600053A (en) * 2019-07-30 2019-12-20 广东工业大学 Cerebral stroke dysarthria risk prediction method based on ResNet and LSTM network
CN110865705A (en) * 2019-10-24 2020-03-06 中国人民解放军军事科学院国防科技创新研究院 Multi-mode converged communication method and device, head-mounted equipment and storage medium
CN110865705B (en) * 2019-10-24 2023-09-19 中国人民解放军军事科学院国防科技创新研究院 Multi-mode fusion communication method and device, head-mounted equipment and storage medium
CN111128122A (en) * 2019-12-31 2020-05-08 苏州思必驰信息科技有限公司 Method and system for optimizing rhythm prediction model

Similar Documents

Publication Publication Date Title
Chen et al. Lip movements generation at a glance
CN108597501A (en) An audio-visual speech model based on residual networks and bidirectional gated recurrent units
US10621991B2 (en) Joint neural network for speaker recognition
JP6993353B2 (en) Neural network-based voiceprint information extraction method and device
US11862145B2 (en) Deep hierarchical fusion for machine intelligence applications
WO2020119630A1 (en) Multi-mode comprehensive evaluation system and method for customer satisfaction
CN112069484A (en) Multi-mode interactive information acquisition method and system
CN109344781A (en) An expression recognition method for video based on audio-visual joint features
CN108269133A (en) An intelligent advertisement push method and terminal combining person recognition and speech recognition
CN111292765B (en) Bimodal emotion recognition method integrating multiple deep learning models
CN102930297B (en) Emotion recognition method for enhanced coupled hidden Markov model (HMM) audio-visual fusion
KR102167760B1 (en) Sign language analysis algorithm system using sign-language motion recognition and a motion-tracking pre-trained model
Tao et al. End-to-end audiovisual speech activity detection with bimodal recurrent neural models
CN115329779A (en) Multi-person conversation emotion recognition method
CN112151030A (en) Multi-mode-based complex scene voice recognition method and device
CN109829499A (en) Sentiment classification method and device fusing image, text, and data based on a shared feature space
CN107358947A (en) Speaker re-identification method and system
Argones Rua et al. Audio-visual speech asynchrony detection using co-inertia analysis and coupled hidden Markov models
CN112101096A (en) Suicide emotion perception method based on multi-mode fusion of voice and micro-expression
CN114724224A (en) Multi-mode emotion recognition method for medical care robot
Saiful et al. Real-time sign language detection using cnn
Ivanko et al. An experimental analysis of different approaches to audio–visual speech recognition and lip-reading
CN113326868A (en) Decision layer fusion method for multi-modal emotion classification
CN116758451A (en) Audio-visual emotion recognition method and system based on multi-scale and global cross attention
CN116434786A (en) Text-semantic-assisted teacher voice emotion recognition method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20180928