I am currently a statistician manager at Daiichi Sankyo. I received my PhD in applied statistics from Shanghai University of Finance and Economics (SUFE). My research interests include parallel and distributed algorithm design, quantile regression, and machine learning. I worked at the Data Clinic (a data analysis and consulting platform supported by the School of Statistics and Management, SUFE) for two years, contributing to two industrial deep learning projects focused mainly on computer vision and one asset management project for a securities firm. In my spare time, I like to play computer games.
PhD in Applied Statistics, 2022
Shanghai University of Finance and Economics
MS in Statistics, 2017
Shanghai University of Finance and Economics
BS in Applied Mathematics, 2013
Shanghai Jiao Tong University
This package performs robust subgroup analysis and variable selection simultaneously.
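Automatic recognition of vehicle identification numbers (VINs) during annual vehicle inspections, using machine vision
Detection of cold-solder defects in photovoltaic modules from electroluminescence (EL) images, using machine vision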
Cancer heterogeneity plays an important role in the understanding of tumor etiology, progression, and response to treatment. To accommodate heterogeneity, cancer subgroup analysis has been extensively conducted. However, most existing studies share the limitation that they cannot accommodate heavy-tailed or contaminated outcomes together with high-dimensional covariates, both of which are common in biomedical research. In this study, we propose a robust subgroup identification approach based on M-estimators together with concave and pairwise fusion penalties, which advances beyond existing studies by effectively accommodating high-dimensional data containing some outliers. The penalties are applied to both latent heterogeneity factors and covariates, so that the estimation is expected to achieve subgroup identification and variable selection simultaneously, with the number of subgroups unknown a priori. We develop an algorithm based on a parallel computing strategy, whose main advantage is that it can process large-scale data. The convergence property of the proposed algorithm, the oracle property of the penalized M-estimators, and the selection consistency of the proposed BIC criterion are established. Simulation and analysis of TCGA breast cancer data demonstrate that the proposed approach is promising for efficiently identifying underlying subgroups in high-dimensional data.
Identifying subgroup structures is an interesting problem in data analysis, as populations are likely to be heterogeneous in practice. In this paper, we consider M-estimators together with both concave and pairwise fusion penalties, which can deal with high-dimensional data containing some outliers. The penalties are applied to both covariates and treatment effects, so that the estimation is expected to achieve variable selection and data clustering simultaneously. An algorithm based on parallel computing is proposed to process relatively large datasets. We establish the convergence of the proposed algorithm, the oracle property of the penalized M-estimators, and the selection consistency of the proposed criterion. Our numerical study demonstrates that the proposed method is promising for efficiently identifying subgroups hidden in high-dimensional data.
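To make the idea in the abstracts above more concrete, here is a minimal sketch of the kind of objective they describe: a robust M-estimation loss plus a concave sparsity penalty on the covariate effects and a concave pairwise fusion penalty on subject-specific effects. Everything specific here is an assumption for illustration and is not taken from the papers or the package: the Huber loss as the M-estimator, MCP as the concave penalty, subject-specific intercepts `mu` as the heterogeneity terms, the averaging of the loss over `n`, and all function and parameter names (`huber`, `mcp`, `objective`, `lam1`, `lam2`, `gamma`, `delta`).

```python
from itertools import combinations


def huber(r, delta=1.345):
    """Huber loss: quadratic near zero, linear in the tails (assumed M-estimator loss)."""
    a = abs(r)
    return 0.5 * r * r if a <= delta else delta * (a - 0.5 * delta)


def mcp(t, lam, gamma=3.0):
    """Minimax concave penalty (MCP) at |t| (assumed choice of concave penalty)."""
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def objective(y, X, mu, beta, lam1, lam2):
    """Robust loss + sparsity penalty on beta + pairwise fusion penalty on mu."""
    n = len(y)
    # Residual for subject i: y_i - mu_i - x_i' beta
    loss = sum(
        huber(y[i] - mu[i] - sum(x * b for x, b in zip(X[i], beta)))
        for i in range(n)
    ) / n
    # Penalizing each coefficient pushes small effects to zero (variable selection).
    sparsity = sum(mcp(b, lam1) for b in beta)
    # Penalizing every pairwise difference pulls similar mu_i together (subgroups).
    fusion = sum(mcp(mu[i] - mu[j], lam2) for i, j in combinations(range(n), 2))
    return loss + sparsity + fusion


# Toy data: two apparent groups of subjects, one relevant covariate out of two.
y = [0.1, -0.2, 5.0, 5.3]
X = [[1.0, 0.0], [0.5, 0.0], [1.0, 0.0], [0.5, 0.0]]
print(objective(y, X, mu=[0.0, 0.0, 5.0, 5.0], beta=[0.0, 0.0], lam1=0.5, lam2=0.5))
```

Subjects whose fitted `mu` values coincide form one subgroup, so the number of subgroups does not need to be fixed in advance. The sketch only evaluates the objective; minimizing it at scale is where a parallel algorithm such as the one mentioned in the abstracts would come in.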