Class Overview

Introduction

Welcome to Data Science Tools and Programming. This module gives you a general overview of the course. It also goes over topics you are expected to know coming into the course and gives guidance on setting up your computer and the cloud-based tools you will need.

Key Questions

  • Am I ready for this class?
  • Python Review
    • What are the basic control flow tools in Python?
    • How do classes work in Python?
    • How do you read and write from files in Python?
    • How do you install and use 3rd party libraries in Python?
  • Other than a large snake, what is Anaconda?
  • Where is the Google Cloud SDK used?

Assignment Overview

There are two sets of tasks this week. The first is to get your environment set up so that you can be successful in the later stages of this class. This may not be a small undertaking if you run into configuration issues, so make sure to start early. It should be as easy as installing a few programs, but when things go wrong it can be very hard to figure out why.

The other task is to demonstrate and review your basic Python programming. You may need to look up a few things to remember syntax, but in general nothing in the programming activities should be new or confusing to you. It is simply review. If you do find any portion of it difficult, that is a good indication that you will need to do a fair amount of review. Otherwise you will find the later portions of the class really difficult.
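
As a rough self-check, here is a minimal sketch that touches three of the review topics from the Key Questions: control flow, classes, and reading and writing files. It is not taken from the course's programming activities. The names WordCounter and count_words_in_file are hypothetical and chosen only for illustration, and installing third-party libraries is not covered here. If everything in it reads as familiar, you are probably in good shape.

```python
import tempfile
from pathlib import Path


class WordCounter:
    """Hypothetical example class: counts word occurrences in lines of text."""

    def __init__(self):
        self.counts = {}

    def add_line(self, line):
        # Control flow: loop over the words and branch on whether each was seen before.
        for word in line.lower().split():
            if word in self.counts:
                self.counts[word] += 1
            else:
                self.counts[word] = 1

    def most_common(self):
        # Return the word with the highest count, or None if nothing was added.
        if not self.counts:
            return None
        return max(self.counts, key=self.counts.get)


def count_words_in_file(path):
    """Read a text file line by line and return a populated WordCounter."""
    counter = WordCounter()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            counter.add_line(line)
    return counter


if __name__ == "__main__":
    # Write a small sample file to a temporary directory, then read it back.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.txt"
        sample.write_text("data science tools\ndata and more data\n", encoding="utf-8")
        counter = count_words_in_file(sample)
        print(counter.most_common())
```

If any part of this sketch (the class definition, the with block, or the dictionary handling) is unclear, that is a good place to start your review.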

Explore the Topics

Course Overview
This exploration is designed to help orient you to the proper course expectations and give you a basic road map of the course.
Python Review
This exploration is here to help you get reacquainted with Python and to help you identify any gaps in your Python knowledge. It will be the primary programming language we use in this course.
Anaconda Overview
This exploration will let you set up and use Anaconda. This is a powerful tool that can help you manage different versions of Python and keep better track of 3rd party packages for use with Python.
GCP Overview
This exploration will help you get oriented in the not unsubstantial Google Cloud universe. Google offers a wide variety of tools, but they all require an account and a basic set of tools installed. This exploration will go over some account specifics and offer some advice on installing the required tools.

Additional Resources

Course Syllabus
If you haven’t run into it yet, here is the syllabus. Make sure to read it; it has important information!
Think Python
This is a good book that has a fantastic list of all of the topics you might need in Python. It is a little dense for learning Python the first time, but it is a great place to review the various concepts.
Google Cloud Documentation
From here you can get to the documentation for all of the Google Cloud products. It is a bit overwhelming, so the individual sections will try to link to more specific documentation.
Conda User Guide
Regardless of which version of *conda you go with, you will need to use conda on the command line. This is the place to learn how to do it. Somewhat unintuitively, you should go to Tasks, not Tutorials, to learn how to do the basic stuff with conda. Tutorials is for writing software that conda can use. The Tasks section is about how to actually use the tool itself, which is what we are concerned with.

Review

This week is always a bit unpredictable. If you came in confident in your Python, and the software installation and use all went off without any technical issues, you may not have had a whole lot to do. But, as is often the case, there might be some parts of Python you need to review, or you might run into incredibly unhelpful error messages when trying to configure Anaconda or the Google Cloud SDK. These can sometimes take many hours of web searching and trial and error to debug.

Whichever camp you fall into, this week should have at least exposed you to quite a few new third-party tools. That is not going to change for the rest of the class. You might end up reusing a lot of these same tools, but expect to always be learning about new tools and libraries to collect, change, visualize, or analyze large sets of data.