CS 512

Syllabus – This is the course syllabus. It contains important course information, including grading and contact policies.
SQL and Relational Databases – This module will introduce the concept of relational databases and the basics of SQL. While SQL databases are not the newest or shiniest technology around, a great deal of query syntax derives from SQL, and a huge number of applications use SQL databases as a backend.
Overview and Tools – This module is designed to get you up and running with the tools required for the class. It will also give you an idea of the topics that will be covered and what the general course layout will look like.
Data Formats and Data Wrangling – When working with datasets, you are likely to encounter many different data formats and issues with the way data is saved. This section will give you the tools you need to start converting between formats and to clean up messy data.
Data Visualization – The basic principles of data visualization and an overview of the tools used for it.
Project Week – This week you should really make sure to get some time behind the keyboard working on your project and writing code. If you are already ahead with your project, start another, harder project! Getting coding experience is critical. With only a three-course sequence, practice is really important to get these skills to stick.
Introduction to Big Data – This module kicks off the portion of the class dealing specifically with big data concepts. This module will go over the major concerns one must consider when working with big data.
Non-Relational Databases and Spark – This week goes over the concept of non-relational databases simply to make you aware of their existence. You need not worry too much about actually using them; you just need to be aware of the potential issues you can run into when working with them. We also get into the technical details of getting started with Apache Spark. This will be a multiweek process, and this week is designed to get you up and running.
Spark Specifics – This module gets more into the nuts and bolts of Spark, within reason. We are still going to stay a ways away from actually configuring your own Spark server, but we will talk about reading and writing files along with partitioning data.
Spark Wrap Up – This module goes over DataFrames in Spark, which abstract away a lot of the inner workings and let us work with Spark data as if it were a SQL table.
Odds and Ends at the End – This is the final week of the class. We will review what we covered in the class and talk about a couple of additional topics of interest.