Five papers are selected to represent our research in recent years. A complete list of publications is available at my Google Scholar page.
Acknowledgement. The research results presented on this page are supported by grants NSF IIS 1546482-BIGDATA, NIH R01MH102339, NSF IIS 1408910, NSF IIS 1332109, NIH R01GM083084, and NIH R01HG06841.
DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genome
Zhihan Zhou, Yanrong Ji, Weijian Li, Pratik Dutta, Ramana Davuluri, Han Liu
International Conference on Learning Representations (ICLR), 2024.
Blurb. This paper introduces DNABERT-2, a refined genome foundation model that adopts an efficient tokenizer and employs multiple strategies to overcome input length constraints, reduce time and memory expenditure, and enhance model capability. It also proposes the Genome Understanding Evaluation (GUE), a comprehensive multi-species genome classification dataset that amalgamates 28 distinct datasets across 7 tasks, with input lengths ranging from 70 to 1000.
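The efficient tokenizer in the paper replaces overlapping k-mer tokenization with Byte Pair Encoding. As a rough, self-contained illustration of why a learned subword vocabulary shortens inputs relative to overlapping k-mers, here is a toy Python sketch; the vocabulary below is made up for illustration and is not DNABERT-2's actual merge table.

# Toy contrast between overlapping k-mer tokenization (as in the original DNABERT)
# and a greedy subword segmentation in the spirit of a BPE vocabulary.

def kmer_tokenize(seq, k=3):
    """Overlapping k-mers: every position starts a token, so tokens are redundant."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def greedy_subword_tokenize(seq, vocab):
    """Greedy longest-match segmentation against a fixed subword vocabulary."""
    tokens, i = [], 0
    while i < len(seq):
        for j in range(len(seq), i, -1):           # try the longest piece first
            if seq[i:j] in vocab:
                tokens.append(seq[i:j])
                i = j
                break
        else:                                      # fall back to a single base
            tokens.append(seq[i])
            i += 1
    return tokens

seq = "ATGCGATTACA"
toy_vocab = {"ATGC", "GATT", "ACA", "AT", "GC"}    # hypothetical merges
print(kmer_tokenize(seq))                          # 9 overlapping tokens
print(greedy_subword_tokenize(seq, toy_vocab))     # 3 non-overlapping tokens

With the learned vocabulary, the same sequence is covered by far fewer tokens, which is what relieves the input length constraint in practice.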
STanHop: Sparse Tandem Hopfield Model for Memory-Enhanced Time Series Prediction
Dennis Wu*, Jerry Yao-Chieh Hu*, Weijian Li*, Bo-Yu Chen, Han Liu
International Conference on Learning Representations (ICLR), 2024.
Blurb. This paper introduces STanHop-Net (Sparse Tandem Hopfield Network) for multivariate time series prediction with memory-enhanced capabilities. At the heart of our approach is STanHop, a novel Hopfield-based neural network block, which sparsely learns and stores both temporal and cross-series representations in a data-dependent fashion. In essence, STanHop sequentially learns temporal and cross-series representations using two tandem sparse Hopfield layers.
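To make the two tandem sparse Hopfield layers concrete, below is a heavily simplified numerical sketch: a sparsemax-weighted one-step retrieval applied first along the time axis and then across series. It omits the patching, learnable projections, multi-head structure, and external-memory plug-ins of the actual STanHop-Net, so treat it only as a schematic of the tandem idea.

import numpy as np

def sparsemax(z):
    """Sparsemax of a 1-D array (Martins & Astudillo, 2016): a sparse softmax."""
    z_sorted = np.sort(z)[::-1]
    k = np.arange(1, z.size + 1)
    cssv = np.cumsum(z_sorted)
    support = 1 + k * z_sorted > cssv
    k_star = k[support][-1]
    tau = (cssv[support][-1] - 1.0) / k_star
    return np.maximum(z - tau, 0.0)

def sparse_hopfield_retrieve(query, memory, beta=1.0):
    """One-step sparse Hopfield retrieval; rows of `memory` are stored patterns."""
    weights = sparsemax(beta * memory @ query)     # sparse association weights
    return weights @ memory                        # retrieved pattern

def tandem_block(X, beta=1.0):
    """Toy tandem pass over a (T, D) series: retrieve along time, then across series."""
    X_time = np.stack([sparse_hopfield_retrieve(x_t, X, beta) for x_t in X])   # (T, D)
    Xc = X_time.T                                                              # (D, T)
    X_cross = np.stack([sparse_hopfield_retrieve(x_d, Xc, beta) for x_d in Xc])
    return X_cross.T                                                           # (T, D)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
print(tandem_block(X, beta=2.0).shape)             # (8, 3)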
On Sparse Modern Hopfield Model
Jerry Yao-Chieh Hu, Donglin Yang, Dennis Wu, Chenwei Xu, Bo-Yu Chen, Han Liu
Advances in Neural Information Processing Systems (NeurIPS), 2023.
Blurb. This paper introduces the sparse modern Hopfield model as a sparse extension of the modern Hopfield model. Like its dense counterpart, the sparse modern Hopfield model is equipped with memory-retrieval dynamics whose one-step approximation corresponds to the sparse attention mechanism. Theoretically, our key contribution is a principled derivation of a closed-form sparse Hopfield energy using the convex conjugate of the sparse entropic regularizer. Building upon this, we derive the sparse memory-retrieval dynamics from the sparse energy function and show that its one-step approximation is equivalent to sparse-structured attention.
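Schematically, in our notation (the precise constants and the exact form of the regularizer are spelled out in the paper), with memory patterns stored as the columns of \(\Xi = [\xi_1, \dots, \xi_M]\) and query state \(x\), the construction reads

\[
  E(x) \;=\; \tfrac{1}{2}\,\langle x, x\rangle \;-\; \Psi^{*}\!\bigl(\beta\,\Xi^{\top} x\bigr),
  \qquad
  x^{\mathrm{new}} \;=\; \Xi\,\operatorname{Sparsemax}\!\bigl(\beta\,\Xi^{\top} x\bigr),
\]

where \(\Psi^{*}\) is the convex conjugate of the sparse entropic regularizer, whose gradient is the Sparsemax map; substituting the log-sum-exp for \(\Psi^{*}\) recovers the dense modern Hopfield model and, after one retrieval step, softmax attention.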
Feature Programming for Multivariate Time Series Prediction
Alex Reneau*, Jerry Yao-Chieh Hu*, Chenwei Xu, Weijian Li, Ammar Gilani, Han Liu
International Conference on Machine Learning (ICML), 2023.
Blurb. This paper introduces the concept of programmable feature engineering for time series modeling and proposes a feature programming framework. This framework generates large amounts of predictive features for noisy multivariate time series while allowing users to incorporate their inductive bias with minimal effort. The key motivation of our framework is to view any multivariate time series as a cumulative sum of fine-grained trajectory increments, with each increment governed by a novel spin-gas dynamical Ising model. This fine-grained perspective motivates the development of a parsimonious set of operators that summarize multivariate time series in an abstract fashion, serving as the foundation for large-scale automated feature engineering.
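The cumulative-sum viewpoint is easy to state concretely. The Python sketch below recovers a series from its increments and builds two toy window summaries of those increments; the window operators here are made-up examples, and the paper's actual operator set and the spin-gas Ising dynamics that motivate it are not reproduced.

import numpy as np

x = np.array([1.0, 1.3, 1.1, 1.6, 2.0, 1.8])
dx = np.diff(x)                                   # fine-grained trajectory increments
# The series is its initial value plus the cumulative sum of its increments.
assert np.allclose(x, x[0] + np.concatenate([[0.0], np.cumsum(dx)]))

window = 3
ups   = np.convolve(np.maximum(dx, 0), np.ones(window), mode="valid")   # summed up-moves
downs = np.convolve(np.maximum(-dx, 0), np.ones(window), mode="valid")  # summed down-moves
features = np.stack([ups, downs], axis=1)         # one toy feature vector per window
print(features)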
Bregman Proximal Langevin Monte Carlo via Bregman-Moreau Envelopes
Tim Tsz-Kit Lau, Han Liu
International Conference on Machine Learning (ICML), 2022.
Blurb. This paper proposes efficient Langevin Monte Carlo algorithms for sampling from distributions with nonsmooth convex composite potentials, i.e., potentials that are the sum of a continuously differentiable function and a possibly nonsmooth function. We devise such algorithms by leveraging recent advances in convex analysis and optimization methods involving Bregman divergences, namely the Bregman-Moreau envelopes and the Bregman proximity operators, as well as Langevin Monte Carlo algorithms reminiscent of mirror descent.
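For orientation, here is a minimal Python sketch of the Euclidean special case, in which the Bregman-Moreau envelope and Bregman proximity operator reduce to the classical Moreau envelope and proximity operator; the paper's algorithms replace this Euclidean geometry with a general Bregman divergence (mirror map), which the sketch does not capture. The target potential and step sizes below are illustrative choices.

import numpy as np

# Target potential U(x) = f(x) + g(x) with f(x) = 0.5 * ||x||^2 (smooth) and
# g(x) = ||x||_1 (nonsmooth). The nonsmooth part enters through the gradient of
# its Moreau envelope, (x - prox_{lam*g}(x)) / lam.

rng = np.random.default_rng(0)

def grad_f(x):
    return x                                                  # gradient of 0.5 * ||x||^2

def prox_g(x, lam):
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)      # soft-thresholding

def proximal_langevin(x0, step=1e-2, lam=1e-1, n_iters=5000):
    x, samples = x0.copy(), []
    for _ in range(n_iters):
        drift = grad_f(x) + (x - prox_g(x, lam)) / lam        # smoothed composite drift
        x = x - step * drift + np.sqrt(2.0 * step) * rng.normal(size=x.shape)
        samples.append(x.copy())
    return np.array(samples)

samples = proximal_langevin(np.zeros(2))
print(samples.mean(axis=0), samples.std(axis=0))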