Curriculum


MS in DS


Overview

All students in the MS in DS Program are required to complete a total of 30 credit hours.

Curriculum Overview


Core Courses

All students in the MS in DS Program are required to complete a total of 12 credit hours of core courses.

Curriculum Core


Elective Courses

All students in the MS in DS Program are required to complete a total of 15 credit hours of elective courses. The list of courses is given below. Please note that courses outside this list may be recommended by an Academic Advisor and approved by the Department.

Curriculum Electives 


Capstone Course

All students in the MS in DS Program are required to complete a 3-credit-hour Capstone Course.

Curriculum Capstone


Course Descriptions

DATA 5050 Mathematics for Data Science. (3) This course covers various areas of applied mathematics relevant to data science. Topics from statistics include probability distribution functions, linear regression, and probability calculus; topics from linear algebra include vector spaces, subspaces, matrix factorization concepts, singular value decomposition, and principal component analysis; topics from discrete mathematics include combinatorics; and topics from optimization include convex optimization and constrained programming. Prerequisite: MATH 2010 or MATH 2050 or MATH 3100 or MATH 3610 or equivalent.

DATA 5350 Applied Statistics for Data Science. (3) This course explores the main topics of statistical inference, point and confidence interval estimation, hypothesis tests, maximum likelihood and estimators, unbiased estimators, linear regression, logistic regression, model selection, and principal component analysis. Emphasis is placed on how to identify the correct technique for a given problem, the computer packages used for its computation, and how to interpret the results. Prerequisite: DATA 5050 or MATH 3100 or STAT 3110 or equivalent.

DATA 5100 Programming for Data Science. (3) This is a project-based programming course for data scientists. The course covers the fundamentals of Python programming and the use of Python libraries for information processing and data analysis techniques to solve real-world data science problems. This course is intended for students with prior computer programming experience. Prerequisite: None.

DATA 5200 Statistical Learning. (3) This course presents statistical learning theory, which provides the theoretical basis for many of today's machine learning algorithms. It covers the approaches to learning problems, estimation of the probability measure and the problem of learning, conditions for consistency of the empirical risk minimization principle, bounds on the risk for indicator loss functions, the structural risk minimization principle, stochastic ill-posed problems, estimating the values of a function at given points, perceptrons and their generalizations, support vector methods for estimating indicator and real-valued functions, and support vector machines for pattern recognition. Prerequisite: DATA 5050.

DATA 5300 Data Mining. (3) This course presents exploratory techniques and analytical models for discovering meaningful patterns and insightful future trends in data. It covers descriptive and predictive data mining models for information retrieval and forecasting. Best practices for data mining are discussed in topics including data preparation and cleaning, data transformation and manifold learning, data fusion, data modeling for knowledge representation (graphs and decision trees), data importance ranking and selection, and data-driven model evaluation and interpretation for decision making. Examples are drawn from web mining, spatial and time-series data mining, anomaly detection, and learning from big data sets. Prerequisite: DATA 5050 and DATA 5100.

DATA 5400 Algorithms for Data Science. (3) This course provides a survey of data structures and computer algorithms, examines fundamental techniques in algorithm design and analysis, and develops problem-solving skills required in all programs of study involving data science. The topics include data structures for data science, linear and logistic regression, clustering, dimensionality reduction, artificial neural networks, market basket analysis, classification and network analysis, and recommendation systems. Prerequisite: DATA 5100.

DATA 5500 Business Data Analytics. (3) This course presents key topics related to using business data for analysis, especially at enterprise scale. Topics include market analysis, user and customer behavior analysis, sentiment analysis, and data-driven decisions. Students will complete a practical project utilizing tools for large-scale data analytics. Prerequisite: DATA 5050 and DATA 5100.

DATA 5900 Special Topics. (3) This course addresses important emerging data science topics that are not covered in other data science courses. Prerequisite: Successful completion of at least 6 hours of data science graduate courses.

DATA 6100 Natural Language Processing. (3) This course presents algorithms and techniques for processing, numeric encoding, and mining of human language expressions. It introduces prominent text modeling and numeric encoding techniques, including bag-of-words, frequency-based features, lexicon-based models, and word embedding models. Topics of text analytics include text processing and normalization, text classification and clustering, sentiment analysis on social media, topic modeling and document summarization, graph-based mining of large documents, deep learning of text sequences, and machine translation. This course draws on concepts from linguistics, statistics, and computer science. Prerequisite: DATA 5050 and DATA 5100.

DATA 6150 Applied Deep Learning. (3) This course introduces deep convolutional neural networks with applications in computer vision, speech analysis, text understanding, medical imaging, and data mining. It analyzes various convolutional neural network architectures as effective feature extractors that are fed into fully connected neural networks. Prerequisite: DATA 5050 and DATA 5100.

DATA 6200 Data Science Capstone. (3) This is a project-based course that allows students to work with a faculty or industry mentor to address data-driven problems. Students are expected to apply their knowledge and foundation in data science to analyze and solve problems in various areas of data science. Prerequisite: This course should be taken in the last semester.

COMP 5400 Hybrid and Relational Databases. (3) This course presents relational, object-oriented, and hybrid database concepts. Topics include: definitions of objects and attributes, methods and messages, classes, object-oriented data models, architectural issues, the object-oriented database system manifesto, object-oriented database design, object-oriented database management systems, and object/relational database management systems. Prerequisite: None.

COMP 5800 Introduction to Bioinformatics. (3) Bioinformatics is an interdisciplinary field in which biology and computer science merge. This course is designed to introduce students to concepts, methods, and tools for analyzing biological problems, and to prepare students with the skills necessary to communicate across the fields of computer science and biology. Topics include (but are not limited to) biological sequence and literature databases, strategies to search these databases to solve fundamental biological problems, and the principles and algorithms used for processing and analyzing biological information.

COMP 5850 Data Visualization. (3) This course is an introduction to data visualization and the graphical representation of data. The growing data deluge from multiple sources requires skills in representing data in order to extract meaning and actionable intelligence from these data sets. Students learn how to communicate the relationships among data through systematic mapping between graphical representations and the underlying data values. The class teaches how representations of data can give insight and make data analysis easier. Prerequisite: None.

COMP 6200 Machine Learning. (3) This course provides a broad introduction to machine learning, data mining, and statistical pattern recognition. Topics include: (i) Supervised learning (parametric/non-parametric algorithms, support vector machines, kernels, neural networks). (ii) Unsupervised learning (clustering, dimensionality reduction, recommender systems, deep learning). (iii) Best practices in machine learning (bias/variance theory; innovation process in machine learning and AI). The course also draws from numerous case studies and applications, so that students learn how to apply learning algorithms to building smart robots (perception, control), text understanding (web search, anti-spam), computer vision, medical informatics, audio, database mining, and other areas. Prerequisite: ENGR 5100 or equivalent.

COMP 6400 Distributed Algorithm Design and Data Analysis. (3) This course introduces the computing models and algorithms of distributed systems. It also exposes students to an array of big data analysis theories, techniques, and practices in different fields of study using distributed models. The topics include distributed computing models, message-passing and shared memory systems, design and analysis of synchronous and asynchronous algorithms, fault tolerance, and data distribution, collection, processing, and analysis in distributed systems. This is a project-based course that provides students with hands-on experience in distributed computing with different data types. Prerequisite: COMP 5520/5200.

COMP 6800 Introduction to Computer Vision. (3) This course introduces the concepts and applications of computer vision. Topics include: cameras and projection models; low-level image processing methods such as filtering and edge detection; mid-level vision topics such as segmentation and clustering; shape reconstruction from stereo; and high-level vision tasks such as object recognition, scene recognition, face detection, and human motion categorization. Prerequisite: ENGR 5100 or equivalent.