Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs

Li Jing, Yichen Shen, T. Dubček, J. Peurifoy, S. Skirlo, Yann LeCun, Max Tegmark, M. Soljačić

Published 2016 in International Conference on Machine Learning

ABSTRACT

Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data. This approach appears particularly promising for Recurrent Neural Networks (RNNs). In this work, we present a new architecture for implementing Efficient Unitary Neural Networks (EUNNs); its main advantages can be summarized as follows. Firstly, the representation capacity of the unitary space in an EUNN is fully tunable, ranging from a subspace of SU(N) to the entire unitary space. Secondly, the computational complexity for training an EUNN is merely O(1) per parameter. Finally, we test the performance of EUNNs on the standard copying task, the pixel-permuted MNIST digit recognition benchmark, and the Speech Prediction Test (TIMIT). We find that our architecture significantly outperforms both other state-of-the-art unitary RNNs and the LSTM architecture in terms of final performance and/or wall-clock training speed. EUNNs are thus promising alternatives to RNNs and LSTMs for a wide variety of applications.
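The norm-preservation argument behind the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a unitary map built from layers of 2x2 complex rotations acting on alternating neighbouring pairs. The function names (`apply_rotation`, `apply_layers`, `random_params`) and the sizes used are hypothetical. Each rotation carries two parameters and touches only two entries of the state, so the work per parameter is constant, and the number of layers acts as a capacity knob.

```python
import cmath
import math
import random


def apply_rotation(h, i, j, theta, phi):
    """Apply a 2x2 unitary rotation to entries i and j of h, in place.

    Assumed form: [[e^{i phi} cos(theta), -sin(theta)],
                   [e^{i phi} sin(theta),  cos(theta)]]
    """
    a, b = h[i], h[j]
    e = cmath.exp(1j * phi)
    h[i] = e * math.cos(theta) * a - math.sin(theta) * b
    h[j] = e * math.sin(theta) * a + math.cos(theta) * b


def apply_layers(h, params):
    """Apply every rotation layer once and return the new state."""
    h = list(h)
    n = len(h)
    for k, layer in enumerate(params):
        # Even layers pair (0,1),(2,3),...; odd layers pair (1,2),(3,4),...
        pairs = range(k % 2, n - 1, 2)
        for (theta, phi), i in zip(layer, pairs):
            apply_rotation(h, i, i + 1, theta, phi)
    return h


def random_params(n, num_layers, rng):
    """Draw one (theta, phi) pair per rotation; more layers, more capacity."""
    params = []
    for k in range(num_layers):
        count = len(range(k % 2, n - 1, 2))
        params.append([(rng.uniform(0.0, 2.0 * math.pi),
                        rng.uniform(0.0, 2.0 * math.pi))
                       for _ in range(count)])
    return params


def norm(h):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in h))


rng = random.Random(0)
n, num_layers, steps = 8, 4, 1000  # illustrative sizes, not from the paper
params = random_params(n, num_layers, rng)
h0 = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]

# Repeatedly applying the same unitary map mimics a linear recurrence.
h = h0
for _ in range(steps):
    h = apply_layers(h, params)

print("initial norm:", norm(h0))
print("norm after", steps, "steps:", norm(h))
```

The two printed norms should agree up to floating-point rounding, whereas repeated multiplication by a general matrix would typically make the norm grow or shrink geometrically. Raising `num_layers` adds rotations and therefore parameters, which is the sense in which capacity is tunable in this sketch.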

PUBLICATION RECORD

Publication year
2016
Venue
International Conference on Machine Learning
Publication date
2016-12-15
Fields of study
Mathematics, Computer Science, Engineering
Identifiers
arXiv 1612.05231
Source metadata
Semantic Scholar

REFERENCES

Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNN
2016, cited by this paper
Deep Learning
2016, cited by this paper
Deep learning with coherent nanophotonic circuits
2016, cited by this paper
Full-Capacity Unitary Recurrent Neural Networks
2016, influential reference
An Optimal Design for Universal Multiport Interferometers
2016, influential reference
Orthogonal RNNs and Long-Memory Tasks
2016, influential reference
Unitary Evolution Recurrent Neural Networks
2015, influential reference
A Simple Way to Initialize Recurrent Networks of Rectified Linear Units
2015, cited by this paper
Bidirectional Recurrent Neural Networks as Generative Models
2015, cited by this paper
Neural Machine Translation by Jointly Learning to Align and Translate
2014, cited by this paper
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches
2014, cited by this paper
Fast Approximation of Rotations and Hessians matrices
2014, cited by this paper
Long-term recurrent convolutional networks for visual recognition and description
2014, cited by this paper
Sequence to Sequence Learning with Neural Networks
2014, cited by this paper
Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
2013, cited by this paper
ImageNet classification with deep convolutional neural networks
2012, cited by this paper
Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups
2012, cited by this paper
Natural Language Processing (Almost) from Scratch
2011, cited by this paper
Time-frequency feature representation using energy concentration: An overview of recent advances
2009, cited by this paper
Long Short-Term Memory
1997, cited by this paper
Learning long-term dependencies with gradient descent is difficult
1994, cited by this paper
Experimental realization of any discrete unitary operator.
1994, cited by this paper
DARPA TIMIT: acoustic-phonetic continuous speech corpus CD-ROM, NIST speech disc 1-1.1
1993, cited by this paper
Untersuchungen zu dynamischen neuronalen Netzen
1991, cited by this paper

CITED BY

Deep Delta Learning
2026, cites this paper
Laser interferometry as a robust neuromorphic platform for machine learning
2026, cites this paper
Complex-Valued Unitary Representations as Classification Heads for Improved Uncertainty Quantification in Deep Neural Networks
2026, cites this paper
QFTD: An efficient quantum federated learning for transformer fault diagnosis with minimal gated unit in smart grid
2026, cites this paper
Quantum-PEFT: Ultra parameter-efficient fine-tuning
2025, cites this paper
Bridging Expressivity and Scalability with Adaptive Unitary SSMs
2025, cites this paper
Emergence of the Primacy Effect in Structured State-Space Models
2025, cites this paper
HadamRNN: Binary and Sparse Ternary Orthogonal RNNs
2025, influential citation
Programmable Photonic Unitary Processor Enables Parametrized Differentiable Long-Haul Spatial Division Multiplexed Transmission
2025, cites this paper
A Multi-Objective Particle Swarm Optimization Pruning on Photonic Neural Networks
2025, cites this paper
Complex-Domain Decomposition with Bidirectional Convolutions Model for Speech Emotion Recognition
2025, cites this paper
The role of entrepreneurial leadership in advancing economic sustainability: Evidence from Bahraini SMEs
2025, cites this paper
Spectral Alignment as Predictor of Loss Explosion in Neural Network Training
2025, cites this paper
Hyper Hawkes Processes: Interpretable Models of Marked Temporal Point Processes
2025, influential citation
Learnability Window in Gated Recurrent Neural Networks
2025, cites this paper
DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products
2025, cites this paper
Symmetry-adapted Markov state models of closing, opening, and desensitizing in α7 nicotinic acetylcholine receptors
2024, cites this paper
Positional Encoding Helps Recurrent Neural Networks Handle a Large Vocabulary
2024, cites this paper
Nonlinear Unitary Circuits for Photonic Neural Networks
2024, cites this paper
Programmable photonic unitary circuits for light computing
2024, cites this paper
IGNN-Solver: A Graph Neural Solver for Implicit Graph Neural Networks
2024, cites this paper
Unitary convolutions for learning on graphs and groups
2024, cites this paper
Unveiling the secrets of new physics through top quark tagging
2024, cites this paper
Unlocking the Power of LSTM for Long Term Time Series Forecasting
2024, cites this paper
Wideband Adaptive Beamforming for a Partially-Calibrated Distributed Array
2024, cites this paper
RotRNN: Modelling Long Sequences with Rotations
2024, cites this paper
Neural Gaussian Scale-Space Fields
2024, cites this paper
Approximately-symmetric neural networks for quantum spin liquids
2024, cites this paper
Barren plateaus in variational quantum computing
2024, cites this paper
Graph Unitary Message Passing
2024, influential citation
Quantized Approximately Orthogonal Recurrent Neural Networks
2024, cites this paper
Temporal convolutional network for a Fast DNA mutation detection in breast cancer data
2023, cites this paper
Resurrecting Recurrent Neural Networks for Long Sequences
2023, cites this paper
Physics-informed dynamic mode decomposition
2023, cites this paper
Symmetry-adapted Markov state models of closing, opening, and desensitizing in α7 nicotinic acetylcholine receptors
2023, cites this paper
Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization
2023, cites this paper
Implicit Graph Neural Networks: A Monotone Operator Viewpoint
2023, cites this paper
Adaptive-saturated RNN: Remember more with less instability
2023, cites this paper
Single-chip photonic deep neural network with forward-only training
2022, cites this paper
Lyapunov-guided representation of recurrent neural network performance
2022, cites this paper
Scaling-up Diverse Orthogonal Convolutional Networks by a Paraunitary Framework
2022, cites this paper
DOSnet as a Non-Black-Box PDE Solver: When Deep Learning Meets Operator Splitting
2022, cites this paper
Estimating the randomness of quantum circuit ensembles up to 50 qubits
2022, cites this paper
Scalable and self-correcting photonic computation using balanced photonic binary tree cascades
2022, cites this paper
Improving Neural Ordinary Differential Equations with Nesterov's Accelerated Gradient Method
2022, cites this paper
Learning to Optimize Quasi-Newton Methods
2022, influential citation
Random Orthogonal Additive Filters: A Solution to the Vanishing/Exploding Gradient of Deep Neural Networks
2022, cites this paper
Omnigrok: Grokking Beyond Algorithmic Data
2022, cites this paper
Quantum Vision Transformers
2022, cites this paper
Image classification by combining quantum kernel learning and tensor networks
2022, cites this paper
Orthogonal Gated Recurrent Unit With Neumann-Cayley Transformation
2022, influential citation
Assessing the Unitary RNN as an End-to-End Compositional Model of Syntax
2022, cites this paper
Complex-Valued Neural Networks: A Comprehensive Survey
2022, cites this paper
Arithmetic Circuits, Structured Matrices and (not so) Deep Learning
2022, cites this paper
RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks
2022, cites this paper
Entangled Residual Mappings
2022, cites this paper
Feedback Gradient Descent: Efficient and Stable Optimization with Orthogonality for DNNs
2022, cites this paper
projUNN: efficient method for training deep networks with unitary matrices
2022, influential citation
Evaluation of gene expression programming and artificial neural networks in PyTorch for the prediction of local scour depth around a bridge pier
2022, cites this paper
Efficient Identification of Butterfly Sparse Matrix Factorizations
2021, cites this paper
Toward Hardware-Efficient Optical Neural Networks: Beyond FFT Architecture via Joint Learnability
2021, cites this paper
Recurrent Neural Network from Adder's Perspective: Carry-lookahead RNN
2021, cites this paper
Scalable and compact photonic neural chip with low learning-capability-loss
2021, cites this paper
Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models
2021, cites this paper
SBO-RNN: Reformulating Recurrent Neural Networks via Stochastic Bilevel Optimization
2021, cites this paper
Deep Incremental RNN for Learning Sequential Data: A Lyapunov Stable Dynamical System
2021, cites this paper
Heavy Ball Neural Ordinary Differential Equations
2021, cites this paper
Resonance for analog recurrent neural network
2021, cites this paper
Acceleration Method for Learning Fine-Layered Optical Neural Networks
2021, cites this paper
Deep Unitary Convolutional Neural Networks
2021, cites this paper
Coordinate descent on the orthogonal group for recurrent neural network training
2021, cites this paper
Utilizing Graph Structure for Machine Learning
2021, cites this paper
Training Recurrent Neural Networks via Forward Propagation Through Time
2021, cites this paper
Recurrent Neural Networks for Edge Intelligence
2021, influential citation
Time Adaptive Recurrent Neural Network
2021, cites this paper
Parallelized Computation and Backpropagation Under Angle-Parametrized Orthogonal Matrices
2021, cites this paper
Slower is Better: Revisiting the Forgetting Mechanism in LSTM for Slower Information Decay
2021, cites this paper
Improving Molecular Graph Neural Network Explainability with Orthonormalization and Induced Sparsity
2021, cites this paper
RotLSTM: Rotating Memories in Recurrent Neural Networks
2021, cites this paper
Learning with Hyperspherical Uniformity
2021, cites this paper
Scaling-up Diverse Orthogonal Convolutional Networks with a Paraunitary Framework
2021, cites this paper
A Differential Geometry Perspective on Orthogonal Recurrent Models
2021, cites this paper
Implicit Bias of Linear RNNs
2021, cites this paper
MC-LSTM: Mass-Conserving LSTM
2021, cites this paper
Quantum Earth Mover's Distance: A New Approach to Learning Quantum Data
2021, cites this paper
Learning quantum data with the quantum earth mover’s distance
2021, cites this paper
Short-Term Memory Optimization in Recurrent Neural Networks by Autoencoder-based Initialization
2020, influential citation
Affinity guided Geometric Semi-Supervised Metric Learning
2020, cites this paper
Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps
2020, cites this paper
A Deep Learning Algorithm for the Max-Cut Problem Based on Pointer Network Structure with Supervised Learning and Reinforcement Learning Strategies
2020, cites this paper
Depth Enables Long-Term Memory for Recurrent Neural Networks
2020, cites this paper
Stochastic Flows and Geometric Optimization on the Orthogonal Group
2020, cites this paper
Orthogonal Over-Parameterized Training
2020, cites this paper
Multi-Decoder RNN Autoencoder Based on Variational Bayes Method
2020, cites this paper
Physical reservoir computing—an introductory perspective
2020, cites this paper
Orthogonal Recurrent Neural Networks and Batch Normalization in Deep Neural Networks
2020, influential citation
MomentumRNN: Integrating Momentum into Recurrent Neural Networks
2020, cites this paper
The Recurrent Neural Tangent Kernel
2020, cites this paper
Lipschitz Recurrent Neural Networks
2020, cites this paper
An Ode to an ODE
2020, cites this paper