Course Syllabus

MATH 3280 Data Mining

Division: Natural Science and Math
Department: Mathematics
Credit/Time Requirement: Credit: 3; Lecture: 3; Lab: 0
Prerequisites: Math 3080 and (either Math 2270 or Math 2250) with a C or better in each course.

Semesters Offered: Fall
Semester Approved: Spring 2020
Five-Year Review Semester: Summer 2025
End Semester: Fall 2025

Optimum Class Size: 20
Maximum Class Size: 25

Course Description

Students will learn to efficiently find structures and patterns in large data sets. Topics will include acquiring data sets and cleaning messy, noisy raw data into structured, abstract forms; applying scalable and probabilistic algorithms to these well-structured abstract data sets; and formally modeling and analyzing the error inherent in these methods. Students will consider data representations and trade-offs between accuracy and scalability.

Justification

Data collection and analysis are ubiquitous and are fast becoming a prerequisite to economic success for businesses. This course is a necessity for any data scientist. This course will support the bachelor’s degree in software engineering by providing relevant mathematics coursework.

Student Learning Outcomes

Students will understand the basic data structures in Python (or similar software).
Students will understand how to visualize, explain, and present data using Python (or similar software).
Students will understand how to use web scraping tools, APIs, and other methods to acquire data.
Students will understand how to clean, structure, and explore data using Python (or similar software).

Course Content

This course will include a survey of data acquisition and cleaning tools; similarity search, clustering, regression/dimensionality reduction, graph analysis, PageRank, and small-space summaries; and recent developments in these topics and their use in modern applications, often relating to large internet-based companies.