邵俊明

Personal Profile

Junming Shao Prof. Dr.

Head of Data Mining Lab
Big Data Research Center
School of Computer Science & Engineering
University of Electronic Science and Technology of China

No.2006, Xiyuan Ave, West Hi-Tech Zone,
Chengdu 611731, Sichuan, P.R.China

E-Mail: junmshao@uestc.edu.cn, junming.shao@gmail.com

Research Interests:

Subspace Clustering, Community Detection, Data Stream Clustering and Classification, Brain Network Mining, Machine Learning

Education Background:

2008/09 – 2011/11, University of Munich (Germany), PhD
2011/11 – 2012/07, Technical University of Munich (Germany), PostDoc.
2011/08 – 2012/12, University of Mainz (Germany), Alexander von Humboldt Fellow
2013/12 – Present, University of Electronic Science and Technology of China (China), Prof.

Awards:

Best Paper Award in ICDM 2010, Workshop on Biological Data Mining and its Applications in Healthcare. (Junming Shao, et al., Combining Time Series Similarity with Density-based Clustering to Identify Fiber Bundles.)
Best Article for IGI Global’s “Fourth Annual Excellence in Research Journal Awards” 2010. (Junming Shao, et al., Hierarchical Density-based Clustering of White Matter Tracts in the Human Brain.)

Selected Talks and Presentations

Talk at the 2nd Big Data Seminar, UESTC, Chengdu, Nov. 2013. Slides
Talk at the 1st Big Data Seminar, UESTC, Chengdu, 2013. Slides
Invited talk at Northwest A&F University, Yangling, 2013.
Talk at the IEEE International Conference on Data Mining (ICDM), Brussels, 2012.
Poster presentation at the Organization for Human Brain Mapping (OHBM) conference, Beijing, 2012.
Talk at the IEEE International Conference on Data Mining (ICDM), Vancouver, 2011.
Presentation at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), Barcelona, 2010.
Talk at the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), Washington, 2010.
Talk at the European Workshop on Mining Massive Data Sets (EMMDS), Denmark, 2009.
Talk at the International Conference on Computational Intelligence for Modeling, Control and Automation (CIMACA), Vienna, 2008.

Activities

Reviewer for data mining related journals, such as IEEE Transactions on Knowledge and Data Engineering (TKDE), Artificial Intelligence, Chaos, World Scientific, Database Technology for Life Sciences and Medicine, International Journal of Computer Mathematics, etc. Also a PC member for major data mining conferences, such as ECML/PKDD.

Research Topics:

― Clustering (scalable/subspace/hierarchical/parameter-free clustering)
― Data stream mining (Concept drift detection/clustering/classification)
― Brain network mining and applications (Mining on fMRI/DTI/EEG brain data)
― Multi-source heterogeneous data mining


Recent Projects

1. Prototype-based Learning on Concept-drifting Data Streams

 

Learning on Concept-drifting Data Streams

Related references:

●    Shao, J., Ahmadi, Z. and Kramer, S.:Prototype-based Learning on Concept-drifting Data Streams, Proceedings of the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining , pp. 412-421. 2014. [Code]    

●    Shao, J., He, X., Boehm, C., Yang, Q. and Plant, C.:Synchronization-inspired Partitioning and Hierarchical Clustering, IEEE Transactions on Knowledge and Data Engineering, 25(4): 893-905. 2013.    

2. Synchronization-inspired Data Mining [KDD 2010, PKDD 2010, ICDM 2011,  IEEE TKDE 2012, PAKDD 2013]

Overview

Synchronization is a powerful concept that regulates a large variety of complex processes, ranging from the metabolism in a cell to opinion formation in a group of individuals. Synchronization phenomena in nature have been widely investigated (e.g., flashing fireflies, crickets, yeast), and models that concisely describe the dynamical synchronization process have been proposed. Inspired by these phenomena, we introduce synchronization into the data mining domain and have proposed several data mining algorithms. These algorithms show several desirable properties compared to state-of-the-art algorithms.

Basic Idea: Uncover the structure of the data by investigating the dynamics of objects as they move towards synchronization. Specifically, (a) each data object/node is regarded as a phase oscillator; (b) it interacts with its neighbors locally through an interaction model; (c) the dynamic behavior of the objects is simulated over time, where regular objects synchronize and form distinct clusters, while outliers/noisy objects tend to remain stable throughout.
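The three steps above can be illustrated with a minimal sketch. It assumes a Kuramoto-style sine coupling inside an eps-neighbourhood; the function names, the `eps` radius and the tolerances are hypothetical choices made for this illustration and are not taken from the published code.

```python
import math

def synchronize(points, eps, max_iter=50, tol=1e-6):
    """Let every object interact with its eps-neighbours until positions settle.

    Assumption: each coordinate is treated as an oscillator phase and is pulled
    towards its neighbours by the average of sin(neighbour - self).
    """
    pts = [list(p) for p in points]
    for _ in range(max_iter):
        new_pts = []
        for p in pts:
            nbrs = [q for q in pts if q is not p and math.dist(p, q) <= eps]
            if not nbrs:
                # no interaction partner: the object stays where it is
                new_pts.append(list(p))
                continue
            new_pts.append([
                pk + sum(math.sin(q[k] - pk) for q in nbrs) / len(nbrs)
                for k, pk in enumerate(p)
            ])
        shift = max(math.dist(a, b) for a, b in zip(pts, new_pts))
        pts = new_pts
        if shift < tol:
            break
    return pts

def group_synchronized(pts, tol=1e-3):
    """Give the same label to objects that ended up at (almost) the same position."""
    labels, centers = [], []
    for p in pts:
        for idx, c in enumerate(centers):
            if math.dist(p, c) <= tol:
                labels.append(idx)
                break
        else:
            centers.append(p)
            labels.append(len(centers) - 1)
    return labels

# Two small groups and one isolated object
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
        (1.0, 1.0), (1.1, 1.0), (1.0, 1.1),
        (3.0, 3.0)]
final_positions = synchronize(data, eps=0.3)
labels = group_synchronized(final_positions)
```

In this toy data the two groups collapse to one position each, while the isolated object has no neighbour within `eps` and never moves, which is how it can be flagged as an outlier candidate.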

 

Clustering by Synchronization


 


Outlier/Noisy object Detection

Related references:

●    Shao, J., He, X., Boehm, C., Yang, Q. and Plant, C.:Synchronization-inspired Partitioning and Hierarchical Clustering, IEEE Transactions on Knowledge and Data Engineering, 25(4): 893-905. 2013. [Code]    

●    Shao, J., He, X., Yang, Q., Plant, C. and Boehm, C.:Robust Synchronization-Based Graph Clustering, 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining, 249-260, 2013. [Code]    

●    Shao, J:Synchronization on Data Mining, LAP LAMBERT Academic Publishing, 2012.    

●    Shao, J., Yang, Q., Boehm, C. and Plant, C.:Detection of Arbitrarily Oriented Synchronized Clusters in High-dimensional Data, IEEE International Conference on Data Mining (ICDM), pp. 607-616, 2011. [Code]    

●    Boehm, C., Plant, C., Shao, J.* and Yang, Q.:Clustering by synchronization, Proceedings of the 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2010), 583-592, 2010. [Code]    

●    Shao, J., Boehm, C., Yang, Q. and Plant, C.:Synchronization Based Outlier Detection, Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD 2010), 245-260, 2010. [Code]    

Data Mining on Brain Data

1. Fiber Clustering

Challenges: effective fiber similarity measure (Dynamic Time Warping); clustering on multiple scales (hierarchical clustering); noisy fiber handling (outlier robustness)
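As an illustration of the similarity measure named above, here is a minimal sketch of the textbook Dynamic Time Warping recurrence applied to two fibers. It assumes each fiber is an ordered sequence of 3-D points and uses Euclidean point distance; the function name `dtw_distance` is hypothetical, and the published method may use a different local cost or normalization.

```python
import math

def dtw_distance(fiber_a, fiber_b):
    """Dynamic Time Warping cost between two fibers (sequences of 3-D points)."""
    n, m = len(fiber_a), len(fiber_b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning the first i points
    # of fiber_a with the first j points of fiber_b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(fiber_a[i - 1], fiber_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat a point of fiber_b
                                 cost[i][j - 1],      # repeat a point of fiber_a
                                 cost[i - 1][j - 1])  # advance along both fibers
    return cost[n][m]

# Fibers sampled with different numbers of points along the same path
fiber_a = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
fiber_b = [(0, 0, 0), (1, 0, 0), (1, 0, 0), (2, 0, 0)]
# A parallel fiber shifted by one unit
fiber_c = [(0, 1, 0), (1, 1, 0), (2, 1, 0)]
```

Because the warping path may repeat points, `fiber_a` and `fiber_b` are treated as identical even though they differ in length, which is the property that makes DTW attractive for fibers of varying sampling density.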


      

Hierarchical Fiber Clustering [IJDB 2010] [Best Paper Award]

Related Papers:

●    Shao, J., Hahn, K., Yang, Q., Wohlschlaeger, A., Boehm, C., Myers, N. and Plant, C.:Hierarchical Density-based Clustering of White Matter Tracts in the Human Brain, International Journal of Knowledge Discovery in Bioinformatics 1(4), 1-26, 2010. [Code]    

●    Shao, J., Hahn, K., Yang, Q., Boehm, C., Wohlschlaeger, A., Myers, N. and Plant, C.:Combining Time Series Similarity with Density-Based Clustering to Identify Fiber Bundles in the Human Brain, Proceedings of International Conference on Data Mining (ICDM), Workshop on Biological Data Mining and its Applications in Healthcare, 747-754, 2010.    

●    Shao, J, Wohlschläger, A, Hahn, C, Boehm, C, and Plant, C.:Density-based Clustering of White Matter Tracts in the Human Brain with Dynamic Time Warping, European Workshop on Mining Massive Data Sets (EMMDS), pp. 1101-1108, 2009.    

2. Disease Diagnosis

 

ISCN-based AD prediction. [Shao et al., Neurobiology of Aging, 2012]

Related Papers:

●    Shao, J., Myers, N., Yang, Q., Feng, J., Plant, C., Böhm, C., Förstl, H., Kurz, A., Zimmer, C., Meng, C., Riedl, V., Wohlschläger, A. and Sorg, C.:Prediction of Alzheimer’s disease using individual structural connectivity networks, Neurobiology of Aging, 33(12):2756-2765, 2012.    

●    Meng, C., Brandl, F., Tahmasian, M., Shao, J., Manoliu, A., Scherr, M., … & Sorg, C.:Aberrant topology of striatum’s connectivity is associated with the number of episodes in depression, Brain 2014: 137; 598–609.    

●    Shao, J., Hahn, K., Yang, Q., Wohlschlaeger, A., Boehm, C., Myers, N. and Plant, C.:Hierarchical Density-based Clustering of White Matter Tracts in the Human Brain, International Journal of Knowledge Discovery in Bioinformatics 1(4), 1-26, 2010.    

●    Shao, J, Yang, Q, Wohlschlaeger, A, and Sorg, C.:Insight into Disrupted Spatial Patterns of Human Connectome in Alzheimer’s Disease via Subgraph Mining, International Journal of Knowledge Discovery in Bioinformatics, 3(1):14-29, 2013.    

●    Tahmasian, M., Knight, D. C., Manoliu, A., Schwerthöffer, D., Scherr, M., Meng, C., … & Sorg, C.:Aberrant intrinsic connectivity of hippocampus and amygdala overlap in the fronto-insular and dorsomedial-prefrontal cortex in major depressive disorder, Frontiers in human neuroscience, 7, 2013.    

●    Shao J., Yang Q., Wohlschlaeger A. and Sorg C.:Discovering Aberrant Patterns of Human Connectome in Alzheimer’s Disease via Subgraph Mining, IEEE International Conference on Data Mining (ICDM), Workshop on Biological Data Mining and its Applications in Healthcare (BioDM), pp. 86-93, 2012.    

Current Projects

Community Detection: Instead of optimizing some user-defined criterion, we consider community detection from a new point of view: local distance dynamics. The basic idea is to envision a network as a dynamical system in which each agent interacts with its local partners. Instead of investigating the node dynamics, we examine the change of “distances” among linked nodes. As time evolves, these distances gradually shrink or stretch depending on the local topological structure. Finally, all distances among linked nodes converge to a stable pattern, and communities can be intuitively identified. [see http://arxiv.org/abs/1409.7978]
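A minimal sketch of how such distance dynamics might look is given below. The concrete update rule (Jaccard initialization, sine coupling, influence from direct links, common neighbours and exclusive neighbours, and the cohesion threshold `lam`) is an assumption made for this illustration; all function and parameter names are hypothetical, and the linked paper should be consulted for the actual model.

```python
import math
from itertools import combinations

def jaccard_distance(adj, u, v):
    """1 - Jaccard similarity of the closed neighbourhoods of u and v."""
    a, b = adj[u] | {u}, adj[v] | {v}
    return 1.0 - len(a & b) / len(a | b)

def edge_key(u, v):
    return (u, v) if u < v else (v, u)

def distance_dynamics(edges, lam=0.5, max_iter=100):
    """Iterate edge distances in [0, 1] until they stop changing (assumed rule)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    dist = {edge_key(u, v): jaccard_distance(adj, u, v) for u, v in edges}
    for _ in range(max_iter):
        new = {}
        for (u, v), d in dist.items():
            # direct link: the two end points pull each other closer
            delta = -math.sin(1 - d) * (1 / len(adj[u]) + 1 / len(adj[v]))
            # common neighbours pull both end points closer
            for x in adj[u] & adj[v]:
                dxu, dxv = dist[edge_key(x, u)], dist[edge_key(x, v)]
                delta -= math.sin(1 - dxu) * (1 - dxv) / len(adj[u])
                delta -= math.sin(1 - dxv) * (1 - dxu) / len(adj[v])
            # exclusive neighbours pull closer or push apart, depending on
            # how similar they are to the node at the other end of the edge
            for a, b in ((u, v), (v, u)):
                for x in adj[a] - adj[b] - {b}:
                    sim = 1 - jaccard_distance(adj, x, b)
                    rho = sim if sim >= lam else sim - lam
                    delta -= math.sin(1 - dist[edge_key(x, a)]) * rho / len(adj[a])
            new[(u, v)] = min(1.0, max(0.0, d + delta))
        converged = all(abs(new[e] - dist[e]) < 1e-9 for e in dist)
        dist = new
        if converged:
            break
    return dist

def communities(dist, threshold=0.5):
    """Connected components after dropping edges whose distance was stretched."""
    kept, nodes = {}, set()
    for (u, v), d in dist.items():
        nodes.update((u, v))
        if d < threshold:
            kept.setdefault(u, set()).add(v)
            kept.setdefault(v, set()).add(u)
    seen, result = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            n = stack.pop()
            comp.append(n)
            for m in kept.get(n, ()):
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        result.append(sorted(comp))
    return result

# Two 4-cliques joined by a single bridge edge (3, 4)
edges = list(combinations(range(4), 2)) + list(combinations(range(4, 8), 2)) + [(3, 4)]
final_dist = distance_dynamics(edges)
parts = communities(final_dist)
```

On this toy graph the distances inside each clique shrink towards 0 while the bridge distance is stretched to 1, so removing the stretched edge leaves the two cliques as separate communities.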

LocalSVM: This project aims to improve SVM. We partition the whole data set into global data, which is classified by libSVM, and local data, which is classified by KNN or another classification method with good local generalization ability. In this way, we hope to improve the local generalization ability of the SVM while retaining its global generalization ability.

Distributed Data Stream Classification: This project focuses on learning the associations among concept-drifting data streams, aiming to combine information from local streams to obtain better predictions. The project is divided into a local learning part and a global learning part. In the local part, we build a P-Tree to maintain important data for each stream and use Error-driven Representativeness Learning to update the P-Tree based on the weight vector of the data. We also use PCA and statistical analysis to capture abrupt concept drift. In the global part, we use Weighted Majority to obtain the final prediction. This provides a new learning framework based on typical data that reflect the associations among streams.
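The global combination step can be illustrated with a minimal sketch of the classic Weighted Majority scheme. It assumes that each local stream contributes one vote per instance and that a voter's weight is multiplied by a factor `beta` after every mistake; the class name, `beta=0.5` and the toy votes are hypothetical and only serve the illustration.

```python
from collections import defaultdict

class WeightedMajority:
    """Combine the votes of several local learners by weighted voting."""

    def __init__(self, n_experts, beta=0.5):
        self.weights = [1.0] * n_experts  # every local learner starts equal
        self.beta = beta                  # penalty factor applied on a mistake

    def predict(self, votes):
        """Return the label with the largest total weight."""
        totals = defaultdict(float)
        for weight, label in zip(self.weights, votes):
            totals[label] += weight
        return max(totals, key=totals.get)

    def update(self, votes, truth):
        """Shrink the weight of every learner whose vote was wrong."""
        self.weights = [
            w if vote == truth else w * self.beta
            for w, vote in zip(self.weights, votes)
        ]

# Three local learners; only the third one is reliable in this toy stream
wm = WeightedMajority(n_experts=3)
predictions = []
for votes, truth in [([0, 0, 1], 1)] * 3:
    predictions.append(wm.predict(votes))
    wm.update(votes, truth)
```

After a few mistakes the unreliable learners lose most of their weight, so the combined prediction starts to follow the reliable stream.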


Personal Information

Doctoral Supervisor

Gender: Male

Education Level: Doctoral graduate

Degree: Doctor of Science

Status: Employed
