Pattern recognition techniques are used to automatically classify physical objects (handwritten characters, tissue samples, faces) or abstract multidimensional patterns (n points in d dimensions) into a known or possibly unknown number of categories. A number of commercial pattern recognition systems are available for face recognition, character recognition, document classification, fingerprint classification, speech and speaker recognition, military target recognition, robotic vision, information retrieval, data mining, computational linguistics, forensics, biometrics, medical image analysis, and other applications. Most machine (computer) vision systems employ pattern recognition techniques to identify objects for sorting, inspection, and assembly.
The design of a pattern recognition system consists of the following main modules: (i) sensing, (ii) feature extraction, (iii) decision making, and (iv) performance evaluation. The availability of low-cost, high-resolution sensors (e.g., digital cameras, microphones, and scanners) and data sharing over the Internet have resulted in huge repositories of digitized documents (text, speech, image, and video). The need for efficient archiving and retrieval of this data has fostered the development of pattern recognition algorithms in new application domains (e.g., text, image and video retrieval, bioinformatics, and face recognition).
A pattern recognition system can be designed based on a number of different approaches: (i) template matching, (ii) geometric (statistical) methods, (iii) structural (syntactic) methods, and (iv) neural (deep) networks. This course will introduce the fundamentals of statistical pattern recognition with examples from several application areas. The course will cover techniques for visualizing and analyzing multi-dimensional data along with algorithms for projection, dimensionality reduction, clustering, and classification. The course will present various approaches to classifier design so that students can make judicious choices when confronted with real pattern recognition problems. It is important to emphasize that the design of a complete pattern recognition system for a specific application domain (e.g., remote sensing) requires domain knowledge, which is beyond the scope of this course. Students will use available MATLAB tools and will be expected to implement some algorithms in a programming language of their choice.
Pattern recognition (PR) is closely related to machine learning (ML), and the two fields overlap. Most PR methods apply machine learning for classification, and most ML methods are developed for classification. However, PR focuses more on the theory of classification, while ML puts more emphasis on the theory of learning by machines. Classification is sometimes called prediction of the class to which a certain object (e.g. a name, age, category, protein, cell, …) belongs. This calls for algorithms that can assign the most likely label (a discrete output) to an object, given one or more measurements on that object. For most interesting problems, the underlying physics are too complex to design such an algorithm explicitly. In such cases, a machine learning approach is often taken: an algorithm is constructed with parameters that are tuned on an available dataset of training examples. The algorithm should predict the labels for these examples as well as possible, yet still generalize, i.e. perform well on objects not seen before.
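The idea of tuning parameters on training examples and then checking generalization on unseen objects can be sketched in a few lines of Python. This is only an illustrative sketch, not course material: it uses a minimum-distance (nearest class mean) classifier, and the toy data generator, the class names "a" and "b", and all function names are hypothetical choices made for the example.

```python
import random

def fit_class_means(samples, labels):
    """Tune the parameters: one mean vector per class, from training data."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(means, x):
    """Assign the label whose class mean is closest (squared Euclidean)."""
    def sq_dist(m):
        return sum((a - b) ** 2 for a, b in zip(x, m))
    return min(means, key=lambda y: sq_dist(means[y]))

def accuracy(means, samples, labels):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(means, x) == y for x, y in zip(samples, labels))
    return hits / len(samples)

def make_toy_data(n_per_class, rng):
    """Hypothetical data: two 2-D Gaussian blobs centred at (0,0) and (4,4)."""
    samples, labels = [], []
    for label, centre in (("a", (0.0, 0.0)), ("b", (4.0, 4.0))):
        for _ in range(n_per_class):
            samples.append([rng.gauss(c, 1.0) for c in centre])
            labels.append(label)
    return samples, labels

rng = random.Random(0)
train_x, train_y = make_toy_data(50, rng)
test_x, test_y = make_toy_data(50, rng)  # held out: never used for fitting

means = fit_class_means(train_x, train_y)
train_acc = accuracy(means, train_x, train_y)
test_acc = accuracy(means, test_x, test_y)
print(f"training accuracy: {train_acc:.2f}, held-out accuracy: {test_acc:.2f}")
```

Comparing the two printed accuracies is the simplest form of the generalization check described above: the class means are tuned only on the training set, and the held-out set stands in for objects not seen before.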
This course focuses on the underlying principles of pattern recognition and on the methods of machine intelligence used to develop and deploy pattern recognition applications in the real world. Emphasis is placed on the pattern recognition application development process, which includes problem identification, concept development, algorithm selection, system integration, and test and validation. Machine intelligence algorithms to be presented include parametric and non-parametric pattern detection and classification, the Bayes classifier, clustering, artificial neural networks, support vector machines, rule-based algorithms, feature extraction and selection, dimension reduction, hidden Markov models, fuzzy logic, genetic algorithms, and others. Many of the topics concern statistical classification methods. They include generative methods such as those based on Bayes decision theory and the related techniques of parameter estimation and density estimation. Next come discriminative methods such as nearest-neighbor classification and support vector machines. Artificial neural networks, classifier combination, and clustering are other major components of pattern recognition.
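To make the generative approach concrete, the following sketch combines parameter estimation with Bayes decision theory for a single feature. It assumes, purely for illustration, that each class-conditional density is a one-dimensional Gaussian with nonzero variance; the measurements, the labels "small" and "large", and the function names are hypothetical.

```python
import math

def fit_gaussian_bayes(values, labels):
    """Maximum-likelihood estimates of prior, mean and variance per class."""
    grouped = {}
    for v, y in zip(values, labels):
        grouped.setdefault(y, []).append(v)
    n_total = len(values)
    model = {}
    for y, vs in grouped.items():
        mean = sum(vs) / len(vs)
        var = sum((v - mean) ** 2 for v in vs) / len(vs)  # ML (biased) estimate
        model[y] = (len(vs) / n_total, mean, var)
    return model

def log_posterior_score(x, prior, mean, var):
    """log p(x | class) + log P(class); equals the log posterior up to a
    term that is the same for every class, so it can be ignored."""
    log_likelihood = (-0.5 * math.log(2 * math.pi * var)
                      - (x - mean) ** 2 / (2 * var))
    return log_likelihood + math.log(prior)

def classify(model, x):
    """Bayes decision rule: pick the class with the largest posterior."""
    return max(model, key=lambda y: log_posterior_score(x, *model[y]))

# Hypothetical one-feature training set with two classes.
values = [1.0, 1.2, 0.8, 1.1, 0.9, 3.0, 3.3, 2.7, 3.1, 2.9]
labels = ["small"] * 5 + ["large"] * 5

model = fit_gaussian_bayes(values, labels)
for x in (1.3, 2.6):
    print(x, "->", classify(model, x))
```

The model here describes how each class generates its measurements, and the decision follows from Bayes' rule. A discriminative method would instead model the decision boundary directly without estimating the class densities.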
The course is aimed at graduate and undergraduate students with a background in electrical engineering, computer science or a related field. Participants from the private sector are also welcome. A working knowledge of probability, statistics and linear algebra is assumed. A course in digital signal processing, digital image processing, or computer vision is recommended. Preparation material on statistics and linear algebra will be distributed before the course, to be studied by students missing the required background.
The goal of this course is to introduce the student to the basic concepts and methods for the recognition of patterns in data. This is accomplished by presenting the underlying theory and algorithmic approaches for the detection and characterization of patterns in multi-dimensional data. The course will also provide the student with a working knowledge of the pattern recognition application development process, which will be reinforced throughout the course via case studies using sample data from real-world applications.
- Understand the concept of a pattern and the basic approach to the development of pattern recognition and machine intelligence algorithms
- Understand the basic methods of feature extraction, feature evaluation, and performance evaluation
- Understand and apply both supervised and unsupervised classification methods to detect and characterize patterns in real-world data
- Develop prototype pattern recognition algorithms that can be used to study algorithm behavior and performance against real-world multivariate data
- Introduction to patterns and pattern recognition
- Supervised pattern recognition: k-NN, minimum distance classifier, Bayes classifier, SVM (support vector machines), AdaBoost, MLP, ...
- Clustering: K-means, ISODATA, hierarchical clustering.
- Supervised pattern recognition (advanced): rule-based classifiers, hidden Markov models.
- Unsupervised pattern recognition (advanced): Gaussian mixture model, self-organization.
- Feature extraction: PCA (principal component analysis).
- Advanced topics: deep learning, convolutional neural networks, fuzzy logic, genetic algorithms, sensor and data fusion, ...
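As an illustration of the clustering topic in the list above, K-means can be sketched with Lloyd's algorithm, which alternates between assigning points to their nearest centre and moving each centre to the mean of its points. This is a minimal sketch under stated assumptions: the initial centres are passed in by the caller rather than chosen at random, and the toy points and function names are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, initial_centres, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centre-update steps."""
    centres = [list(c) for c in initial_centres]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centre.
        new_assignment = [
            min(range(len(centres)), key=lambda j: sq_dist(p, centres[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centre to the mean of its assigned points.
        for j in range(len(centres)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centre
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return centres, assignment

# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
centres, assignment = k_means(points, initial_centres=[points[0], points[3]])
print("centres:", centres)
print("assignment:", assignment)
```

The result of K-means depends on the initial centres, so in practice the algorithm is usually run from several starting points and the solution with the smallest total within-cluster squared distance is kept.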
Student Assessment Criteria
Student performance is evaluated on several criteria, including reading reports, oral presentations, programming results, group collaboration, and attendance. Programming skill in MATLAB, C/C++, or Python is necessary to implement and practice pattern recognition methods. A final project will be assigned, involving paper reading, coding, an oral presentation, and a written report. The project may be done individually or in a team. Some presentations and reports are evaluated by peer review.
Computer and Technical Requirements
- Language: Chinese/English
- Skill: MATLAB, Python, or C/C++. Students will use these programming languages to build and test their own prototype solutions. Students may need access to a full version of MATLAB (with toolboxes) for homework assignments and case studies. Note that MATLAB academic licenses are available to students through FJU.
- Equipment for homework: desktop or notebook computer
- Student performance is evaluated mostly on the homework and project reports. These reports must be submitted as web pages.
- Reports and programs must not be plagiarized. Reports that copy material from web pages are treated as plagiarism.
Textbooks / Reference Books
- (TK) S. Theodoridis, K. Koutroumbas, Pattern Recognition, 4th Edition, Academic Press, 2009. [PDF download from FJU Library]
- (AMA) S. Theodoridis, A. Pikrakis, K. Koutroumbas, D. Cavouras, Introduction to Pattern Recognition: A Matlab Approach, Academic Press, 2010. [PDF download from FJU Library]
- (DH) R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, Wiley, 2002.
- (BISHOP) C.M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006. [Free PDF download from author's web site]
- F. Van Der Heijden, R. Duin, D. De Ridder, D.M. Tax, Classification, Parameter Estimation and State Estimation: an engineering approach using MATLAB. John Wiley & Sons, 2005.
- A. K. Jain, R. Duin, and J. Mao, "Statistical Pattern Recognition: A Review", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, pp. 4-37, Jan. 2000.
- Y. LeCun, Y. Bengio, and G. Hinton, "Deep Learning", Nature, Vol. 521, No. 7553, pp. 436-444, 2015.
- A. K. Jain, "Data Clustering: 50 Years Beyond K-Means", Pattern Recognition Letters, Vol. 31, No. 8, pp. 651-666, June 2010.
- A. K. Jain, R. C. Dubes, "Algorithms for Clustering Data", Prentice-Hall, 1988.