Syllabus — Data Engineering — S15

Syllabus for the Spring 2015 Data Engineering class at CU Boulder by Prof. Ken Anderson

Introduction

This course will investigate the software engineering issues involved with creating data-intensive software systems. We will look at how to support the entire data life cycle including collection, storage, analysis, reporting, and visualization and what tools and techniques are available for each of these stages. Students will help to create some of the content for this class by spiking on various technologies (either individually or in teams) and reporting back what they learned to the class as a whole. Students will work in teams to develop a prototype system that can support the entire data life cycle.

Expectations

Students need to be willing to learn new frameworks quickly and to apply what they have learned to practical problems of data cleaning, manipulation, and analysis. I'm assuming that students have software engineering skills and are comfortable writing code in multiple programming languages. I expect that we'll be reading and writing code in at least Java, Ruby, Python, and JavaScript as we look at a variety of tools and frameworks in the "big data" space.

Requirements

  • Students should have a laptop with them for every class. Most class sessions will involve hands-on coding or editing of source code, Markdown files, wikis, etc.

    • If you don't have a laptop then you'll need to find someone to work with during the class period.
  • A version of the Ruby programming language should be installed on your laptop. Any version of Ruby 2.x.x should work fine. (For instance, Prof. Anderson has 2.1.2 on his machine; 2.2.0 is the latest version of Ruby as of January 2015.)

  • A version of node should also be installed on your machine. If you have a Mac, the easiest way to install node is via Homebrew. Simply install Homebrew and then enter the command brew install node. The latest version of node as of January 2015 is 0.10.35.

  • A version of curl should be installed on your machine. curl is a very useful utility for interacting with web services. If you have a Mac, curl should already be installed in /usr/bin. For other platforms, head over to the curl website (http://curl.haxx.se/download.html) to download and install the software.

  • Finally, you should be comfortable invoking the developer tools for your favorite web browser. Within Chrome, simply open a window and then invoke View->Developer->Developer Tools.

  • All of these instructions will be reviewed during the first day of lecture.
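As a quick way to confirm the setup described in the list above, here is a minimal Python sketch that looks for ruby, node, and curl on your PATH and prints the version each one reports. The function names (find_tools, parse_version, report) are hypothetical helpers made up for this illustration and are not part of the course materials. The sketch also assumes Python 3.7 or later and that each tool accepts a --version flag.

```python
import re
import shutil
import subprocess

# Tools named in the requirements list above.
REQUIRED_TOOLS = ["ruby", "node", "curl"]


def find_tools(names):
    """Map each tool name to its location on PATH, or None if it is missing."""
    return {name: shutil.which(name) for name in names}


def parse_version(text):
    """Return the first x.y.z version found in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def report(names=REQUIRED_TOOLS):
    """Print where each tool lives and the version it reports (assumes --version works)."""
    for name, path in find_tools(names).items():
        if path is None:
            print(f"{name}: not found on PATH")
            continue
        result = subprocess.run([path, "--version"], capture_output=True, text=True)
        output = result.stdout or result.stderr
        print(f"{name}: {path} (version {parse_version(output)})")
```

Calling report() prints one line per tool. For Ruby, a first version component of 2 would satisfy the "any 2.x.x" requirement above.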

Learning Git

We will be using the Git version control system heavily this semester. I will be covering Git in class, but there are plenty of excellent resources on the web to learn more. In particular, the company that creates the application Tower has an excellent set of resources for learning Git.

Learning GitHub

We will also be making heavy use of GitHub, so I'll be covering it in class as well. As with Git, there are many resources available online to get up to speed with GitHub's features. Here are some pointers to get started:
