Data Analysis at Scale in the Cloud

Course taught at Duke MIDS, Spring 2020-2022 by Noah Gift.

Guest Lecture 2022-Async

GPT-3:

  • Book: https://learning.oreilly.com/library/view/gpt-3/9781098113612/
  • Interview: https://learning.oreilly.com/videos/52-weeks-of/021822022VIDEOPAIML/
  • Shubham Saboo
  • Sandra Kublik

Prequel Material

These resources could be helpful before starting this course.

Duke/Coursera: Foundations of Data Engineering Course (Launching early 2022)

Course 1: Python and Pandas for Data Engineering

Course 2: Linux and Bash for Data Engineering

GitHub Repos for Projects in Course
Week 1: Using Linux
Week 2: Using Bash
Week 3: Building Bash Scripts
Week 4: Composing File and Data Management Solutions with Linux

Course 3: Python and SQL for Data Engineering

Course 4: Building Data Engineering Solutions with Python for Web Applications, Command-Line Tools and Notebooks

Sequel Material

These resources could be helpful after completing this course.

Duke/Coursera: Applied Data Engineering Course (Launching late 2022)

GitHub Repos Referenced in the Duke Coursera Course

Course 1: Cloud Computing Foundations

Course 2: Cloud Computing Building Blocks

Lecture Topics:

Getting Started: [Week 1]

Cloud Computing Foundations: [Week 2]

Virtualization and Containers: [Week 3 & Week 4]

Challenges and Opportunities in Distributed Computing: [Week 5 & Week 6]

Cloud Storage: [Week 7 & Week 8]

Serverless: [Week 9 & Week 10]

MLOps, Big Data and Edge Computer Vision: [Week 11 & Week 12 & Week 13]

General

Student Example Projects

A practical guide to Data Science, Machine Learning Engineering and Data Engineering

Read the Cloud Computing for Data book

Free book: Developing on AWS with C#

Next Steps: Take Coursera MLOps Course

Text and Code License

The text and code content of notebooks and documents is released under the CC-BY-NC-ND license.