Introduction to Data Science

Lectures: Friday 09:40-12:30 (Computer Lab)

Course prerequisites: IS 100, ECON 206

Course credits: 3

Course description

Data science is an interdisciplinary field that uses scientific processes and systems to extract knowledge or insights from data in various forms. With substantial amounts of data now available in various forms and from various sources, it has become essential for economists to be equipped with the skills needed to collect, process, analyze, and present data. The course will be taught as a series of workshops. The main topics and methods will be summarized and discussed in each lecture, and the students will write code to perform the task assigned to them during the lecture. The students will learn how to write basic programs in R, which is one of the most popular open-source programming languages currently in use by data scientists.

Course objectives

By the end of the course, the students will know how to use data science for economic analysis and will have learned the basic tools they need for data analysis. At the end of the course, the students will apply these tools and techniques to analyze a real-world problem, using R in all stages of the research process.

Learning outcomes

Thus the students at the end of the semester will be able to:

Grading

The course consists of lectures, quizzes, homework, and projects.

Course grades will be based on 6 quizzes (10 pts each), 1 project (40 pts), and forum participation (as a bonus, up to 10 pts).

There will be 7 quizzes in total, and you can take any 6 of them. There will be no make-up quizzes.

The project teams will consist of 3 students. Projects will be presented online on January 23, 2024, and must be submitted by midnight the same day.

DataCamp support

“This class is supported by DataCamp, the most intuitive learning platform for data science and analytics. Learn any time, anywhere and become an expert in R, Python, SQL, and more. DataCamp’s learn-by-doing methodology combines short expert videos and hands-on-the-keyboard exercises to help learners retain knowledge. DataCamp offers 350+ courses by expert instructors on topics such as importing data, data visualization, and machine learning. They’re constantly expanding their curriculum to keep up with the latest technology trends and to provide the best learning experience for all skill levels. Join over 6 million learners around the world and close your skills gap.”

Information on (free) registration to DataCamp will be provided in due course.

Textbooks

To do list for the second week

In case of trouble

If you get an error message at any time while using R, first search for the error on Google (just copy the error message into the search bar) and try to find an answer. Among the search results, check Stack Overflow first.

If you cannot solve the problem in a reasonable time, submit a question on the Forum page of ODTUClass. When you submit your question, please include the error message and provide sufficient information to reproduce the error.
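As a rough illustration of what "sufficient information to reproduce the error" could look like, the sketch below builds a tiny data set, triggers an error, and captures the exact message together with the R version. The data frame, its column names, and the mistake itself are hypothetical and chosen only for illustration.

```r
# Minimal reproducible example (hypothetical data and variable names).
# The gdp column is stored as text by mistake, so summing it fails.
df <- data.frame(year = c(2020, 2021, 2022),
                 gdp  = c("100", "105", "n/a"))

# Run the failing call and keep the error message as text,
# so that it can be pasted into the forum post as is.
msg <- tryCatch(
  sum(df$gdp),
  error = function(e) conditionMessage(e)
)
print(msg)

# The R version is useful context for whoever answers the question.
print(R.version.string)
```

Posting a short script like this, rather than a screenshot or a description, lets others run the same lines and see the same error.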

Note that you will make errors frequently when you start using R, especially when you write your own code. Most of these errors will be due to missing parentheses and commas. Check the code first.
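The two typical mistakes mentioned above can be sketched as follows. The helper `check_syntax` is a hypothetical function written for this illustration (it is not part of R); it only asks R to parse a piece of code and returns either "OK" or the parser's error message.

```r
# Hypothetical helper: try to parse a snippet of code without running it.
# Returns "OK" if the syntax is valid, otherwise the parser's error message.
check_syntax <- function(code) {
  tryCatch({
    parse(text = code)
    "OK"
  }, error = function(e) conditionMessage(e))
}

print(check_syntax("mean(c(1, 2, 3)"))   # missing closing parenthesis
print(check_syntax("c(1 2, 3)"))         # missing comma between 1 and 2
print(check_syntax("mean(c(1, 2, 3))"))  # correct code
```

The first two calls print a parser message that points to the place where R got confused, which is usually at or just after the missing character.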

Do not get frustrated when you get error messages. They are an essential part of the learning process, so try to fix these errors by yourself.

Presentations

Part 1. Basics

Part 2. Applications