Distance-based clustering of sparsely observed stochastic processes, with applications to online auctions

Published 2008 in The Annals of Applied Statistics

ABSTRACT

We propose a distance between two realizations of a random process where for each realization only sparse and irregularly spaced measurements with additional measurement errors are available. Such data occur commonly in longitudinal studies and online trading data. A distance measure then makes it possible to apply distance-based analysis such as classification, clustering and multidimensional scaling for irregularly sampled longitudinal data. Once a suitable distance measure for sparsely sampled longitudinal trajectories has been found, we apply distance-based clustering methods to eBay online auction data. We identify six distinct clusters of bidding patterns. Each of these bidding patterns is found to be associated with a specific chance to obtain the auctioned item at a reasonable price.

1. Introduction

The goal of cluster analysis is to group a collection of subjects into clusters, such that those falling into the same cluster are more similar to each other than those in different clusters. Therefore, a measure of similarity or dissimilarity between subjects is a necessary ingredient for clustering. A metric defined on the subject space is one way to obtain dissimilarities, simply using the distance between two subjects as a measure of dissimilarity. While one can readily choose from a variety of well-known metrics for the case of classical multivariate data, or for functional data that are in the form of continuously observed trajectories, finding a suitable distance measure for irregularly observed data can be a challenge. One such situation which we study here occurs in the commonly encountered case of irregularly and sparsely observed longitudinal data, with online auction data a prominent example [Shmueli and Jank (2005), Jank and Shmueli (2006), Shmueli, Russo and Jank (2007), Liu and Müller (2008)]. As an example, a snapshot of an eBay auction history for a Palm Personal Digital Assistant is shown in Figure 1.
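To make the role of the dissimilarity concrete, the following is a minimal sketch of distance-based clustering on a precomputed matrix of pairwise distances. It is not the distance proposed in this paper: the function name `agglomerative_clusters`, the choice of average linkage, and the toy matrix `toy_dist` are all assumptions made for illustration only.

```python
from itertools import combinations


def agglomerative_clusters(dist, n_clusters):
    """Average-linkage agglomerative clustering on a precomputed distance matrix.

    dist is a symmetric list of lists; returns a sorted list of sorted index lists.
    (Hypothetical helper for illustration, not the paper's procedure.)
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Average pairwise dissimilarity between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        p, q = min(
            combinations(range(len(clusters)), 2),
            key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]),
        )
        merged = sorted(clusters[p] + clusters[q])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical dissimilarities among five subjects (toy values, not from the paper).
toy_dist = [
    [0.0, 0.2, 0.3, 2.0, 2.1],
    [0.2, 0.0, 0.25, 1.9, 2.2],
    [0.3, 0.25, 0.0, 2.3, 2.0],
    [2.0, 1.9, 2.3, 0.0, 0.4],
    [2.1, 2.2, 2.0, 0.4, 0.0],
]

groups = agglomerative_clusters(toy_dist, n_clusters=2)
print(groups)  # [[0, 1, 2], [3, 4]]
```

The clustering step only ever reads entries of the distance matrix, which is why any suitable distance between sparsely observed trajectories could in principle be plugged in without changing the rest of the procedure.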
In this paper the focus is on a traditional clustering framework, where it is assumed that each subject belongs to exactly one cluster. There are alternative clustering ideas such as soft clustering [Erosheva and Fienberg (2005)] or mixed membership clustering [Erosheva, Fienberg and Lafferty (2004)]. For example, in Erosheva, Fienberg and Joutard (2007), functional disability data are

PUBLICATION RECORD

Publication year: 2008
Venue: The Annals of Applied Statistics
Publication date: 2008-05-05
Fields of study: Mathematics, Business, Economics, Computer Science
Identifiers: DOI 10.1214/08-AOAS172; arXiv 0805.0463

REFERENCES

Functional Data Analysis for Sparse Auction Data
2008cited by this paper
DESCRIBING DISABILITY THROUGH INDIVIDUAL-LEVEL MIXTURE MODELS FOR MULTIVARIATE BINARY DATA.
2007cited by this paper
The BARISTA: A model for bid arrivals in online auctions
2007cited by this paper
Studying Heterogeneity of Price Evolution in eBay Auctions via Functional Clustering
2006cited by this paper
Functional Data Analysis in Electronic Commerce Research
2006cited by this paper
Smoothing sparse and unevenly sampled curves using semiparametric mixed models: An application to online auctions
2006cited by this paper
Functional Modelling and Classification of Longitudinal Data
2005influential reference
Functional Data Analysis for Sparse Longitudinal Data
2005cited by this paper
Visualizing Online Auctions
2005cited by this paper
Bayesian Mixed Membership Models for Soft Clustering and Classification
2004cited by this paper
FUNCTIONAL AND LONGITUDINAL DATA ANALYSIS: PERSPECTIVES ON SMOOTHING
2004cited by this paper
Mixed-membership models of scientific publications
2004cited by this paper
The functional data analysis view of longitudinal data
2004cited by this paper
User heterogeneity and its impact on electronic auction market design: an empirical exploration
2004cited by this paper
Clustering for Sparsely Sampled Functional Data
2003cited by this paper
Shrinkage Estimation for Functional Principal Component Scores with Application to the Population Kinetics of Plasma Folate
2003cited by this paper
Nonparametric Mixed Effects Models for Unequally Sampled Noisy Curves
2001cited by this paper
Principal component models for sparse functional data
1999cited by this paper
An Analysis of Paediatric Cd4 Counts for Acquired Immune Deficiency Syndrome Using Flexible Random Curves
1996cited by this paper
The Solution of the Metric STRESS and SSTRESS Problems in Multidimensional Scaling Using Newton's Method
1995cited by this paper
Local polynomial modelling and its applications
1994cited by this paper
Estimating the mean and covariance structure nonparametrically when the data are curves
1991cited by this paper
Nonmetric individual differences multidimensional scaling: An alternating least squares method with optimal scaling features
1977cited by this paper
Real analysis and probability
1975cited by this paper
A Nonlinear Mapping for Data Structure Analysis
1969cited by this paper
Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis
1964cited by this paper

CITED BY

A Study of the Colombian Stock Market with Multivariate Functional Data Analysis (FDA)
2025cites this paper
Learning under commission and omission event outliers
2025cites this paper
RSFclust: Robust sparse clustering of functional data using quantile curves
2025cites this paper
Curve Clustering via Pairwise Directions Estimation
2025cites this paper
funOCLUST: Clustering Functional Data with Outliers
2025cites this paper
Functional Data Clustering Based on Weighted Functional Spatial Ranks With Clinical Applications
2024cites this paper
Functional Principal Component Analysis for Multiple Variables on Different Riemannian Manifolds
2024cites this paper
Functional clustering for longitudinal associations between social determinants of health and stroke mortality in the U.S.
2024cites this paper
Tests for equality of several mean vector functions for multivariate functional data with applications
2024cites this paper
On Robust Clustering of Temporal Point Process
2024cites this paper
Addressing class imbalance in functional data clustering
2024cites this paper
Functional Mixed-type Clustering of Investors' Daily Returns During a Market Shock Change-point and Recovery
2024cites this paper
Functional Projection K-means
2024cites this paper
Clustering and forecasting of day-ahead electricity supply curves using a market-based distance
2024cites this paper
Functional Data Clustering Method Based on Shape Information and Functional Mahalanobis Distance
2024cites this paper
Functional autoencoder for smoothing and representation learning
2024cites this paper
Multivariate Functional Clustering with Variable Selection and Application to Sensor Data from Engineering Systems
2024cites this paper
Clustering Longitudinal Data: A Review of Methods and Software Packages
2024cites this paper
Detection and estimation of structural breaks in high-dimensional functional time series
2023cites this paper
Bayesian Semiparametric Local Clustering of Multiple Time Series Data
2023cites this paper
Similarity‐based clustering for patterns of extreme values
2023cites this paper
Evaluating Prognostic Value of Dynamics of Circulating Lactate Dehydrogenase in Colorectal Cancer Using Modeling and Machine Learning
2023cites this paper
Multivariate functional data clustering using adaptive density peak detection
2023cites this paper
Mimetic Muscle Rehabilitation Analysis Using Clustering of Low Dimensional 3D Kinect Data
2023cites this paper
Clustering multivariate functional data based on new definitions of the epigraph and hypograph indexes.
2023cites this paper
Penalized model-based clustering of complex functional data
2023cites this paper
A Novel Curve Clustering Method for Functional Data: Applications to COVID-19 and Financial Data
2023cites this paper
Robust Two-Layer Partition Clustering of Sparse Multivariate Functional Data
2022cites this paper
Functional data clustering via information maximization
2022cites this paper
Machine learning approach for study on subway passenger flow
2022cites this paper
Functional Mixed Effects Clustering with Application to Longitudinal Urologic Chronic Pelvic Pain Syndrome Symptom Data
2022cites this paper
Large-scale generalized linear longitudinal data models with grouped patterns of unobserved heterogeneity
2022cites this paper
Functional Nonlinear Learning
2022cites this paper
Large-Scale Generalized Linear Models for Longitudinal Data with Grouped Patterns of Unobserved Heterogeneity
2022cites this paper
Extreme quantile estimation for partial functional linear regression models with heavy‐tailed distributions
2021cites this paper
Two-sample inference for sparse functional data
2021cites this paper
A fast epigraph and hypograph-based approach for clustering functional data
2021cites this paper
ECONOMIC PAPERS
2021cites this paper
Cluster analysis with regression of non‐Gaussian functional data on covariates
2021cites this paper
Functional regression clustering with multiple functional gene expressions
2021cites this paper
Biclustering analysis of functionals via penalized fusion
2021cites this paper
Clustering-based simultaneous forecasting of life expectancy time series through Long-Short Term Memory Neural Networks
2021cites this paper
Row-clustering of a Point Process-valued Matrix
2021cites this paper
Distributional Representation of Longitudinal Data: Visualization, Regression and Prediction
2021cites this paper
Clustering of Pain Dynamics in Sickle Cell Disease from Sparse, Uneven Samples
2021cites this paper
Outlier detection in multivariate functional data through a contaminated mixture model
2021cites this paper
A Partition Dirichlet Process Model for Functional Data Analysis
2020cites this paper
Functional data analysis: An application to COVID-19 data in the United States
2020cites this paper
Sparse functional principal component analysis in a new regression framework
2020cites this paper
The Importance of Rural Social Productive Space to Increase the Social Capital of Agribusiness Community in Agropolitan Area
2020cites this paper
Pseudo-quantile functional data clustering
2020cites this paper
Modeling and Regionalization of China’s PM2.5 Using Spatial-Functional Mixture Models
2020cites this paper
A generalization of functional clustering for discrete multivariate longitudinal data
2020cites this paper
The approximation algorithm based on seeding method for functional k-means problem
2020cites this paper
Cluster non‐Gaussian functional data
2020cites this paper
Clustering and modelling of phase variation for functional data
2019cites this paper
Clustering of longitudinal curves via a penalized method and EM algorithm
2019cites this paper
Supervised classification of geometrical objects by integrating currents and functional data analysis
2019cites this paper
Clustering Functional Data with Application to Electronic Medication Adherence Monitoring in HIV Prevention Trials
2019cites this paper
The Seeding Algorithm for Functional k-Means Problem
2019cites this paper
Profile clustering in clinical trials with longitudinal and functional data methods
2019cites this paper
A UNIFIED FRAMEWORK FOR ANALYZING AGGREGATE AND ISSUE-SPECIFIC PREFERENCE FROM NON-VOTING DATASETS: COALITIONAL ITEM RESPONSE THEORY MODEL
2018cites this paper
Clustering Analysis on Locally Asymptotically Self-similar Processes with Known Number of Clusters
2018cites this paper
A new distance with derivative information for functional k-means clustering algorithm
2018cites this paper
Functional Data Analysis and Knowledge-Based Systems
2018cites this paper
Generalization, Combination and Extension of Functional Clustering Algorithms: The R Package funcy
2018influential citation
Recovering the underlying trajectory from sparse and irregular longitudinal data
2018cites this paper
A similarity measure for second order properties of non-stationary functional time series with applications to clustering and testing
2018cites this paper
Covariance-based dissimilarity measures applied to clustering wide-sense stationary ergodic processes
2018cites this paper
Pan-disease clustering analysis of the trend of period prevalence
2018cites this paper
Similarity-based clustering of extreme losses from the London Stock Exchange
2018cites this paper
Functional Data Analysis in Sport Science: Example of Swimmers’ Progression Curves Clustering
2018cites this paper
Selected statistical methods of data analysis for multivariate functional data
2018cites this paper
On Clustering Non-smooth Functional Observations
2018cites this paper
Simultaneous registration and modelling for multi-dimensional functional data
2018cites this paper
Highly irregular functional generalized linear regression with electronic health records
2018cites this paper
Development of functional principal components analysis and estimating the time-varying gene regulation network
2018cites this paper
Clustering Analysis on Locally Asymptotically Self-similar Processes
2018cites this paper
A concentration inequality based statistical methodology for inference on covariance matrices and operators
2017cites this paper
Gaussian Process and Functional Data Methods for Mortality Modelling
2017cites this paper
Time and frequency domain statistical methods for high-frequency time series
2017cites this paper
Characterizing early child growth patterns of height-for-age in an urban slum cohort of Bangladesh with functional principal component analysis
2017cites this paper
Clustering multiply imputed multivariate high‐dimensional longitudinal profiles
2017cites this paper
Non-Parametric Priors for Functional Data and Partition Labelling Models
2017cites this paper
Modeling and clustering water demand patterns from real-world smart meter data
2017cites this paper
Asymptotic properties of principal component projections with repeated eigenvalues
2017cites this paper
Functional Regression on Manifold with Contamination
2017cites this paper
Clinical and Psychosocial Predictors of Urologic Chronic Pelvic Pain Symptom Change Over One Year: A Prospective Study from the MAPP Research Network
2017cites this paper
Contribution of Functional Approach to the Classification and the Identification of Acoustic Emission Source Mechanisms
2017cites this paper
Selected statistical methods of data analysis for multivariate functional data
2016cites this paper
Clustering functional data on convex function spaces
2016cites this paper
Clustering multivariate and functional data using spatial rank functions
2016influential citation
Functional data analysis by matrix completion
2016cites this paper
Benchmarking different clustering algorithms on functional data
2016cites this paper
Functional Data Analysis
2016cites this paper
Comparing Dissimilarity Measures: A Case of Banking Ratios
2016cites this paper
Inference on Covariance Operators via Concentration Inequalities: k-sample Tests, Classification, and Clustering via Rademacher Complexities
2016cites this paper
Whole‐volume clustering of time series data from zebrafish brain calcium images via mixture modeling
2016cites this paper
Recent advances in functional data stream classification
2015cites this paper
Review of Functional Data Analysis
2015cites this paper