- IEEE Member: US $11.00
- Society Member: US $0.00
- IEEE Student Member: US $11.00
- Non-IEEE Member: US $15.00
A Multi-Phase Gammatone Filterbank For Speech Separation Via TasNet
In this work, we investigate if the learned encoder of the end-to-end convolutional time-domain audio separation network (Conv-TasNet) is the key to its recent success, or if the encoder can just as well be replaced by a deterministic hand-crafted filterb
A Hybrid Text Normalization System Using Multi-Head Self-Attention For Mandarin
In this paper, we propose a hybrid text normalization system using multi-head self-attention. The system combines the advantages of a rule-based model and a neural model for text preprocessing tasks. Previous studies in Mandarin text normalization usually
On The Frequency Domain Detection Of High Dimensional Time Series
In this paper, we address the problem of detection, in the frequency domain, of an M-dimensional time series modeled as the output of an M × K MIMO filter driven by a K-dimensional Gaussian white noise, and disturbed by an additive M-dimensional Gaussian co
Perceptual Loss Function For Neural Modelling Of Audio Systems
This work investigates alternate pre-emphasis filters used as part of the loss function during neural network training for nonlinear audio processing. In our previous work, the error-to-signal ratio loss function was used during network training, with a f
A Minimal Personalization Of Dynamic Binaural Synthesis With Mixed Structural Modeling And Scattering Delay Networks
This paper provides a small set of essential parameters for a personalized and effective real-time auralization with headphones. An image-guided procedure with two 2D images of the user's head guides the mixed structural modeling of head-related transfer
Filterbank Design For End-To-End Speech Separation
Single-channel speech separation has recently made great progress thanks to learned filterbanks as used in ConvTasNet. In parallel, parameterized filterbanks have been proposed for speaker recognition where only center frequencies and bandwidths are learn
UNet 3+: A Full-Scale Connected UNet For Medical Image Segmentation
Recently, a growing interest has been seen in deep learning-based semantic segmentation. UNet, which is one of the deep learning networks with an encoder-decoder architecture, is widely used in medical image segmentation. Combining multi-scale features is one
Accelerating Distributed Deep Learning By Adaptive Gradient Quantization
To accelerate distributed deep learning, gradient quantization is widely used to reduce the communication cost. However, the existing quantization schemes suffer from either model accuracy degradation or low compression ratio (arising from a redu
Realizability Of Planar Point Embeddings From Angle Measurements
Localization of a set of nodes is an important and thoroughly researched problem in robotics and sensor networks. This paper is concerned with the theory of localization from inner-angle measurements. We focus on the challenging case where no anchor loc
End-To-End Voice Conversion Via Cross-Modal Knowledge Distillation For Dysarthric Speech Reconstruction
Dysarthric speech reconstruction (DSR) is a challenging task due to difficulties in repairing unstable prosody and correcting imprecise articulation. Inspired by the success of sequence-to-sequence (seq2seq) based text-to-speech (TTS) synthesis and knowle
On Design Of Optimal Smart Meter Privacy Control Strategy Against Adversarial MAP Detection
We study the optimal control problem of the maximum a posteriori (MAP) state sequence detection of an adversary using smart meter data. The privacy leakage is measured using the Bayesian risk and the privacy-enhancing control is achieved in real-time usin
A FIFO Based Accelerator For Convolutional Neural Networks
In recent years, Deep Neural Networks (DNNs) have achieved state-of-the-art results in various fields like Computer Vision, Natural Language Processing and Speech Recognition. Of all the DNN architectures, Convolutional Neural Networks (CNNs) have been mo
Unsupervised Style And Content Separation By Minimizing Mutual Information For Speech Synthesis
We present a method to generate speech from input text and a style vector that is extracted from a reference speech signal in an unsupervised manner, i.e., no style annotation, such as speaker information, is required. Existing unsupervised methods, durin
On The Effect Of Reflectance On Phasor Field Non-Line-Of-Sight Imaging
Non-line-of-sight (NLOS) imaging aims to visualize an occluded scene by exploiting its indirect reflections on visible surfaces. Previous methods approach this problem by inverting the light transport on the hidden scene, but are limited to isolated, diffuse
Learning Domain Invariant Representations For Child-Adult Classification From Speech
Diagnostic procedures for ASD (autism spectrum disorder) involve semi-naturalistic interactions between the child and a clinician. Computational methods to analyze these sessions require an end-to-end speech and language processing pipeline that goes from r
Multimodal Violence Detection In Videos
Effective tools for detection of violence are in high demand, especially when dealing with video streams. Such tools have a wide range of applications, from forensics and law enforcement to parental control over the ever-increasing amount of videos availa
The SWAX Benchmark: Attacking Biometric Systems With Wax Figures
A face spoofing attack occurs when an intruder attempts to impersonate someone who carries a gainful authentication clearance. It is a trending topic due to the increasing demand for biometric authentication on mobile devices, high-security areas, among o
NASIL: Neural Architecture Search With Imitation Learning
Automated machine learning (AML) refers to a class of techniques that, given a problem, can find an optimal set of model architectures, properties, and parameters. In recent years, AML has shown great success in finding neural network structures that are
An Enhanced Decoding Algorithm For Coded Compressed Sensing
Coded compressed sensing is an algorithmic framework tailored to sparse recovery in very large dimensional spaces. This framework is originally envisioned for the unsourced multiple access channel, a wireless paradigm attuned to machine-type communication
Semi-Regular Geometric Kernel Encoding & Reconstruction For Video Compression
Conventional video coding schemes employ a hybrid motion prediction / residual transform coding paradigm, which only exploits redundancy in individual pairs of video frames for compression gain. However, rigid geometric structures in 3D space, e.g., a bu
Feedback Turbo Autoencoder
Designing channel codes is one of the core research areas for modern communication systems. Canonical channel codes asymptotically achieve near-capacity performance under a large block length regime for additive white Gaussian noise channels. However, thi
Neural Oracle Search On N-Best Hypotheses
In this paper, we propose a neural search algorithm to select the most likely hypothesis using a sequence of acoustic representations and multiple hypotheses as input. The algorithm provides a sequence level score for each audio-hypothesis pair that is ob
Learn-By-Calibrating: Using Calibration As A Training Objective
Calibration error is commonly adopted for evaluating the quality of uncertainty estimators in deep neural networks. In this paper, we argue that such a metric is highly beneficial for training predictive models, even when we do not explicitly measure the
Privacy-Preserving Image Sharing Via Sparsifying Layers On Convolutional Groups
We propose a practical framework to address the problem of privacy-aware image sharing in large-scale setups. We argue that, while compactness is always desired at scale, this need is more severe when trying to furthermore protect the privacy-sensitive co
Simple Caching Schemes For Non-Homogeneous MISO Cache-Aided Communication Via Convexity
We present a novel scheme for cache-aided communication over multiple-input single-output (MISO) cellular networks. The presented scheme achieves the same number of degrees of freedom as known coded caching schemes, but at much lower complexity. The
Feedback Recurrent Autoencoder
In this work, we propose a new recurrent autoencoder architecture, termed Feedback Recurrent AutoEncoder (FRAE), for online compression of sequential data with temporal dependency. The recurrent structure of FRAE is designed to efficiently extract the red
Accelerating Linear Algebra Kernels On A Massively Parallel Reconfigurable Architecture
Much of the recent work on domain-specific architectures has focused on bridging the gap between performance/efficiency and programmability. We consider one such example architecture, Transformer, consisting of light-weight cores interconnected by caches
ARSM Gradient Estimator For Supervised Learning To Rank
We propose a new model for supervised learning to rank. In our model, the relevance labels are assumed to follow a categorical distribution whose probabilities are constructed based on a scoring function. We optimize the training objective with respect to
Dynamically Modulated Deep Metric Learning For Visual Search
This paper proposes dynamically modulated metric learning (DMML) for learning a tiered similarity space to perform visual search. Existing methods often treat the training samples having different degrees of information with equal importance which hinders i
Accurate Semidefinite Relaxation Method For 3-D Rigid Body Localization Using AOA
This paper addresses the rigid body localization problem using angle-of-arrival measurements. We formulate the problem as a constrained weighted least squares (CWLS) minimization problem with the rotation matrix and position vector as variables, which is
Joint Optimization Of Sampling Patterns And Deep Priors For Improved Parallel MRI
Multichannel imaging techniques are widely used in MRI to reduce the scan time. These schemes typically perform undersampled acquisition and utilize compressed-sensing based regularized reconstruction algorithms. Model-based deep learning (MoDL) framework
Overlap Local-SGD: An Algorithmic Approach To Hide Communication Delays In Distributed SGD
Distributed stochastic gradient descent (SGD) is essential for scaling machine learning algorithms to a large number of computing nodes. However, infrastructure variability such as high communication delay or random node slowdown greatly impedes
Frame-Level Phoneme-Invariant Speaker Embedding For Text-Independent Speaker Recognition On Extremely Short Utterances
This paper investigates a phoneme-invariant speaker embedding approach for speaker recognition on extremely short utterances. Intuitively, phonemes are nuisance information for the text-independent speaker recognition task since the contents of the speech are
Normalized Least-Mean-Square Algorithms With Minimax Concave Penalty
We propose a novel problem formulation for sparsity-aware adaptive filtering based on the nonconvex minimax concave (MC) penalty, aiming to obtain a sparse solution with small estimation bias. We present two algorithms: the first algorithm uses a single f
Improving Efficiency In Large-Scale Decentralized Distributed Training
Decentralized Parallel SGD (D-PSGD) and its asynchronous variant Asynchronous Parallel SGD (AD-PSGD) are a family of distributed learning algorithms that have been demonstrated to perform well for large-scale deep learning tasks. One drawback of (A)D-PSGD
Supervised Deep Hashing For Efficient Audio Event Retrieval
Efficient retrieval of audio events can facilitate real-time implementation of numerous query and search-based systems. This work investigates the potency of different hashing techniques for efficient audio event retrieval. Multiple state-of-the-art weak
Multimodal Speaker Diarization Of Real-World Meetings Using D-Vectors With Spatial Features
Deep neural network based audio embeddings (d-vectors) have demonstrated superior performance in audio-only speaker diarization compared to traditional acoustic features such as mel-frequency cepstral coefficients (MFCCs) and i-vectors. However, there has
Language Independent Gender Identification From Raw Waveform Using Multi-Scale Convolutional Neural Networks
In this work, we propose a raw waveform based multi-scale convolutional neural network approach for language-independent gender identification. Our approach uses raw audio waveform as input to the 1-dimensional multi-scale convolutional neural network ins
End-To-End Multi-Speaker Speech Recognition With Transformer
Recently, fully recurrent neural network (RNN) based end-to-end models have been proven to be effective for multi-speaker speech recognition in both the single-channel and multi-channel scenarios. In this work, we explore the use of Transformer models for
A Model Of Double Descent For High-Dimensional Logistic Regression
We consider a model for logistic regression where only a subset of features of size $p$ is used for training a linear classifier over $n$ training samples. The classifier is obtained by running gradient-descent (GD) on logistic-loss. For this model, we in
Low-Complexity Levenberg-Marquardt Algorithm For Tensor Canonical Polyadic Decomposition
In this paper, we propose CPD-fLM++, a fast implementation of the Levenberg-Marquardt (LM) algorithm for the tensor canonical polyadic decomposition. The overall algorithmic framework follows exactly the LM approach, which enjoys locally a super-linear co
The Discrete Stockwell Transforms For Infinite-Length Signals And Their Real-Time Implementations
The various forms of the Stockwell transforms (ST) introduced in the literature have been developed for off-line signal processing on finite-length signals. However, in many applications such as audio, medical or radar signal processing, signals to be ana
Graph Vertex Sampling With Arbitrary Graph Signal Hilbert Spaces
Graph vertex sampling set selection aims at selecting a set of vertices of a graph such that the space of graph signals that can be reconstructed exactly from those samples alone is maximal. In this context, we propose to extend sampling set selection bas
Rethinking Retinal Landmark Localization As Pose Estimation: Naive Single Stacked Network For Optic Disk And Fovea Detection
Automatic detection of optic disk and fovea, the two fundamental biological landmarks of the retinal system, is crucial to track the disease progression in a diabetic patient. Recent advances in this direction were mostly limited to applying CNN based net
DNN-Based Speech Recognition For GlobalPhone Languages
This paper describes new reference benchmark results based on hybrid Hidden Markov Model and Deep Neural Networks (HMM-DNN) for the GlobalPhone (GP) multilingual text and speech database. GP is a multilingual database of high-quality read speech with corr
Efficient Bird Sound Detection On The Bela Embedded System
Monitoring wildlife is an important aspect of conservation initiatives. Deep learning detectors can help with this, although it is not yet clear whether they can run efficiently on an embedded system in the wild. This paper proposes an automatic detection
Deep Rainrate Estimation From Highly Attenuated Downlink Signals Of Ground-Based Communications Satellite Terminals
While the use of weather radars to continuously monitor the spatio-temporal dynamics of precipitation has grown in recent years, these systems are expensive and sparsely deployed across the world. In this context, densely located ground-based terminals fo
Estimation Of Information In Parallel Gaussian Channels Via Model Order Selection
We study the problem of estimating the overall mutual information in M independent parallel discrete-time memoryless Gaussian channels from N independent data sample pairs per channel (inputs and outputs). We focus on the case where the number of active
Distributed Tracking And Circumnavigation Using Bearing Measurements
This paper is concerned with the problem of bearings based multi-agent circumnavigation of a maneuvering target. Agents are assumed to have access to their own individual bearing measurements as well as the ones from their immediate neighbors. The aim is