We apply sparse Bayesian learning methods, automatic relevance determination (ARD) and predictive ARD (PARD), to Alzheimer's disease (AD) classification to make accurate prediction and identify critical imaging markers relevant to AD at the same time. ARD is one of the most successful Bayesian feature selection methods. PARD is a powerful Bayesian feature selection method, and provides sparse models that is easy to interpret. PARD selects the model with the best estimate of the predictive performance instead of choosing the one with the largest marginal model likelihood. Comparative study with support vector machine (SVM) shows that ARD/PARD in general outperform SVM in terms of prediction accuracy. Additional comparison with surface-based general linear model (GLM) analysis shows that regions with strongest signals are identified by both GLM and ARD/PARD. While GLM P-map returns significant regions all over the cortex, ARD/PARD provide a small number of relevant and meaningful imaging markers with predictive power, including both cortical and subcortical measures.