3  Labs: Minimal Technical Skills Required

3.1 Lab: GitHub Codespaces

3.1.1 Objective

  • Use GitHub Codespaces, VS Code, and cloud runtimes.
  • Understand the terminal, the command line interface (CLI), and the user interface.

3.1.2 Task

Task: Get Started with GitHub Codespaces

Follow the steps in the document below to learn GitHub Codespaces basics and get comfortable with the professional development environment.

After completing the lab, answer these questions:

3.2 Lab: Basic Operating System & Terminal Literacy

  • Navigate files and folders (Linux basics)
  • Use terminal commands to start, stop, and rerun applications

3.2.1 Command Line and Terminal

Before learning a full programming language like Python, many computer science students first encounter the terminal and command line. In practice, the command line acts as a student’s first scripting language: it teaches how to run programs, navigate files, pass arguments, and control how code executes.

While modern notebook environments (e.g., Google Colab or Jupyter Notebooks) make it easy to start Python coding without setup, they often hide this foundational layer. As a result, students may write Python without ever learning how real programs are stored, executed, and managed using .py files and the terminal. These skills are essential for real-world development, servers, and production systems.

Warning: ⚠️ Importance of the Command Line and Terminal

If you skip learning command line skills or avoid the terminal, you’ll struggle to work on real-world projects, collaborate effectively with teams, or operate on servers and cloud platforms, where graphical interfaces often aren’t available. The terminal isn’t just a tool for experts; it’s the foundation for professional workflows in data science and engineering.

So, before diving deeper into advanced business data workflows, we’ll fill this gap by learning the command line and terminal to navigate files, run Python scripts, and operate in professional computing environments.

What You’ll Learn Next

  • Main operating systems: Windows, macOS, Linux/Unix
  • GUI vs terminal and why terminals matter
  • Bash shell basics for programming and data science

3.2.2 Operating Systems Overview

Most students in this course use Windows, which dominates personal computers with roughly 70–75% of the global desktop market. macOS holds about 15–20%, while Linux and others make up a small share of personal use.

In contrast, the enterprise, cloud, AI/ML, and high-performance computing (HPC) worlds are very different. Linux and other Unix-like systems are the backbone of web servers, cloud computing, and supercomputers, accounting for nearly half of cloud workloads and running on all of the top 500 supercomputers. Popular Linux distributions include:

  • Ubuntu
  • Debian
  • Fedora
  • Red Hat Enterprise Linux (RHEL)

Note: macOS is Unix-based

Although macOS looks different, it is actually Unix-based, meaning the terminal commands and Bash shell you’ll learn in this course work much the same on both macOS and Linux.

  • Servers: Linux holds a 62.7% market share for server operating systems.
    • Web servers: 77–88% of public web servers run on Linux or other Unix-like systems, making them the most used operating systems for web servers globally.
  • Cloud computing: Cloud workloads depend heavily on Linux-based operating systems. As of mid-2025, Linux powers 49.2% of all global cloud workloads.
  • Supercomputers: Since 2017, 100% of the world’s top 500 supercomputers have run on Linux.
  • AI and ML workloads: Linux is the clear leader for AI and ML projects and infrastructure. In mid-2025, 87.8% of machine learning workloads ran on Linux infrastructure. Large ML and data science deployments predominantly run on Linux-based or Unix-based servers.
    • Cloud environments: Cloud providers like AWS, Google Cloud (GCP), and Microsoft Azure primarily offer Linux-based instances for running AI and ML tasks.

Sources: Wikipedia (Usage share of operating systems); Azure Official Page; Microsoft Tech Community Update (Feb 2025)

3.2.3 What Operating System does GitHub Codespaces use?

Open the Terminal inside your GitHub Codespace (View → Terminal) and check the OS with the commands below.

# Bash
# Display info about the operating system
cat /etc/*-release

# Display the Linux kernel version and build info
cat /proc/version

You should see an Ubuntu-based Linux release because GitHub Codespaces runs inside a Linux container.

3.2.4 What Is a Terminal?

A terminal (often loosely called a command line or shell) is a text-based interface that lets you interact directly with your computer by typing commands.

Before graphical interfaces (with windows, icons, and a mouse) were invented, the terminal was the primary way users operated computers — to run programs, manage files, and control hardware.

EVERY operating system includes a terminal app:

  • Windows:
    • Command Prompt (cmd)
    • PowerShell
    • or Bash (through Windows Subsystem for Linux)
  • Linux: Bash is the default shell on most Linux systems.
  • macOS: Zsh is the default shell in the Terminal app (macOS is Unix-based).

Note: Bash vs Zsh
  • Both Bash and Zsh are shells that interpret your commands, and they work almost the same way.

The terminal can do almost everything you normally do with a mouse:

  • Navigate files and folders
  • Run programs or scripts
  • Install and manage software
  • Connect to remote servers
  • Automate repetitive tasks with shell scripts

Data scientists and developers rely on the terminal for its speed and automation, especially when working in cloud environments like GitHub Codespaces or on Linux servers.

3.2.5 Why Learn Bash commands and Terminal?

Data science projects often run on servers or cloud environments rather than personal laptops, which lack the computational power for large-scale training, data processing, or deployment.

These servers, such as AWS EC2, Azure VMs, or Google Cloud Compute instances, usually run Linux or Unix systems and don’t include a graphical user interface (GUI) by default; they are managed entirely through the command line interface (CLI). To interact with them efficiently, you use Bash, a powerful and widely used command-line shell.

Tip: What is a GUI?

A Graphical User Interface (GUI) is the visual part of your computer: the windows, buttons, and menus you click with the mouse. However, Linux servers don’t usually have this kind of visual interface.
Instead, users interact with them through commands typed into a shell such as Bash.

3.2.6 GUI, CLI, Terminal and Desktop

  • GUI (Graphical User Interface) – The visual interface you use with a mouse, icons, and windows, such as Windows desktop, macOS Finder. GUIs are user-friendly but less efficient for automation or remote access.

  • CLI (Command Line Interface) – A text-based interface where you type commands instead of clicking.

  • Terminal – The program that provides access to the CLI. It’s like a window that lets you type commands and see text output, such as Windows PowerShell, macOS Terminal, Linux bash Terminal.

  • Desktop Environment – The collection of GUI components that make up the user’s graphical workspace — including the taskbar, file explorer, and app windows; such as Windows Desktop, macOS.

Note: Summary
  • The Terminal gives you access to the CLI, while the Desktop Environment provides a GUI.
  • Both let you control the same computer — one through text, the other through graphics.

3.2.7 Learning Bash commands in GitHub Codespaces

Mastering Bash is essential. It enables you to write scripts, manage jobs, and execute commands directly on compute servers — a critical skill when working with large datasets or LLM pipelines.

3.2.8 Bash in VS Codespaces

Restart your GitHub Codespace.

Make sure you can see the Terminal panel. If you accidentally closed your terminal, you can always start one (or many) following the steps below.

You can also start a second terminal via the Terminal panel.

We are going to learn basic bash commands to:

  • Navigate and manage files
  • Run Python (.py) scripts directly from the command line
  • Work efficiently within server-based or local terminal environments

Use your GitHub Codespace terminal in VS Code to practice these commands. If you close the terminal, reopen it via View → Terminal or the Ctrl+` shortcut.

3.2.9 Lab: Linux and bash

  • Display info about the operating system.
cat /etc/*-release
  • Display the Linux kernel version and build info.
cat /proc/version

3.2.10 Lab: Paths, Folders, Directories (pwd)

  • Print your current working directory (the folder you are “in”). A directory is a folder; the two words mean the same thing.
pwd
# /workspaces/codespaces-jupyter

Please type pwd 5 times and each time say “print working directory”.

When should you use pwd? If you get lost in your folders and don’t know where you are, pwd will tell you your current location.

3.2.11 Directory Structure in GitHub Codespace Terminal

/
├── bin/
├── boot/
├── dev/
├── etc/
├── home/
│   └── codespace/          ← your user home directory if you do `cd ~`
├── lib/
├── lib64/
├── media/
├── mnt/
├── opt/
├── proc/
├── root/
├── run/
├── sbin/
├── srv/
├── sys/
├── tmp/
├── usr/
└── workspaces/
    └── codespaces-jupyter     ← your GitHub repo (default working dir)

3.2.12 Lab: List Directory (ls)

The ls command lists the files and folders in a directory.
Here are some of its most commonly used forms, with options such as -a and -l:

# List files and folders in the current directory
ls

# List **all** files, including hidden ones (those starting with .)
ls -a

# List files in a detailed (**long**) format — shows permissions, owner, size, and date
ls -l

# Combine options: show all files in detailed view
ls -la

# Sort files by modification **time** (newest first)
ls -lt

3.2.13 Lab: Change Directory (cd)

  • cd data: go to the data folder under the current directory (create it first if it doesn’t exist).
  • cd ..: go to the parent folder.
  • cd ~: go to the home folder. In Codespaces, the home folder is /home/codespace. If you are lost in a directory and want to start over from a safe place, type cd ~ and you will be taken to your home directory.
# go into a data folder under your repo root
ls
cd data
ls
# see the "altantis.csv"

# Move up one folder (to the parent directory /workspaces/codespaces-jupyter)
cd ..

# Go back to your "home" folder (/home/codespace in Codespaces)
cd ~
pwd

# To-do: find a way to go back to /workspaces/codespaces-jupyter.
# If you get stuck, simply start a new terminal.

3.2.14 Lab: Make A Directory (mkdir)

# From your repo root, create a new folder named "mydata"
cd /workspaces/codespaces-jupyter
mkdir mydata

# Make multiple folders at once
mkdir project results logs

# Check that they were created
ls

3.2.15 Lab: Clear the Screen (clear)

# Clear the terminal screen
clear

3.2.16 Lab: Remove Directory (rmdir)

# Create an empty folder named "temp_folder"
mkdir temp_folder

# Remove the empty folder
rmdir temp_folder

# Create multiple empty folders and remove them
mkdir folder1 folder2
rmdir folder1 folder2

3.2.17 Lab: Making Empty Files (touch)

# Create an empty file named "notes.txt"
touch notes.txt

# Create multiple files at once
touch a.txt b.txt c.txt

# Verify files were created
ls

3.2.18 Lab: Copy a File (cp)

# Copy a file to a new file
cp notes.txt notes_backup.txt

# Create a folder to copy into
mkdir backup

# Copy a file into a different folder
cp notes.txt backup/

# Check the results
ls backup

3.2.19 Lab: Move/Rename a File (mv)

# Move a file into a different folder
mv notes_backup.txt backup/

# Rename a file
mv notes.txt todo.txt

# Verify the changes
ls

3.2.20 Lab: Stream a File (cat)

# Display the contents of a file
cat todo.txt

# To-do: Display the README.md file in your repository:


# Display a system file (try this!)
cat /etc/*-release

3.2.21 Lab: Removing a File (rm)

# Create some temporary files first
touch old.txt temp.txt sample.txt

# Remove a single file
rm old.txt

# Remove multiple files
rm temp.txt sample.txt

# Remove an entire folder and its contents (be careful!)
rm -r backup

3.2.22 Lab: Exiting Your Terminal (exit)

# Exit the current terminal session
exit

3.2.23 Summary Table – Common Bash Commands

| Command | Purpose | Example |
|---------|---------|---------|
| pwd | Print working directory | pwd |
| ls | List files and folders | ls -la |
| cd | Change directory | cd /workspaces/codespaces-jupyter |
| mkdir | Make a new directory | mkdir data |
| rmdir | Remove an empty directory | rmdir temp_folder |
| touch | Create an empty file | touch notes.txt |
| cp | Copy a file | cp notes.txt backup/ |
| mv | Move or rename a file | mv old.txt new.txt |
| cat | View contents of a file | cat notes.txt |
| rm | Remove a file or folder | rm -r foldername |
| clear | Clear the screen | clear |
| exit | Exit the terminal | exit |

3.2.24 Lab: create the scripts folder

Use the Bash terminal to create a scripts folder under the default working directory /workspaces/codespaces-jupyter, then create an empty my_script.py file in the scripts folder.

Tip

You may find the commands pwd, mkdir, ls, cd, and touch helpful for completing this exercise.

Your task is to:

  • Verify your current working directory.
  • Create a new folder called scripts inside the project root.
  • List the directory contents to confirm that scripts was created successfully.
  • Navigate into the scripts folder.
  • Confirm it is empty.
  • Create a new Python file named my_script.py inside the scripts folder.

Write the Bash commands needed to accomplish each step.

Once you complete the task, your terminal session should look like the one below:

@tianyuan09 ➜ /workspaces/codespaces-jupyter (main) $ pwd
/workspaces/codespaces-jupyter
@tianyuan09 ➜ /workspaces/codespaces-jupyter (main) $ mkdir scripts
@tianyuan09 ➜ /workspaces/codespaces-jupyter (main) $ ls
README.md  _quarto.yml  bash.qmd  chapter0.qmd  chapter1.qmd  chapter2.qmd  chapter3.qmd  docs  images  index.qmd  references.bib  data  project  results  logs  scripts
@tianyuan09 ➜ /workspaces/codespaces-jupyter (main) $ cd scripts
@tianyuan09 ➜ /workspaces/codespaces-jupyter/scripts (main) $ ls
@tianyuan09 ➜ /workspaces/codespaces-jupyter/scripts (main) $ touch my_script.py
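
The same steps can also be scripted from Python. The block below is only a sketch of that idea, not part of the lab solution: the helper name create_scripts_folder is hypothetical, and the code runs inside a temporary directory (an assumption made so it never touches your real repository).

```python
import tempfile
from pathlib import Path


def create_scripts_folder(project_root: Path) -> Path:
    """Mirror the lab steps: mkdir scripts, then touch my_script.py."""
    scripts_dir = project_root / "scripts"
    scripts_dir.mkdir(exist_ok=True)      # like: mkdir scripts
    script_file = scripts_dir / "my_script.py"
    script_file.touch()                   # like: touch my_script.py
    return script_file


# A throwaway directory stands in for the project root (assumption).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    created = create_scripts_folder(root)
    print(sorted(p.name for p in root.iterdir()))                # like: ls
    print(sorted(p.name for p in (root / "scripts").iterdir()))  # like: ls scripts
    file_was_created = created.exists()
    file_size = created.stat().st_size    # an empty file has size 0
```

Each pathlib call maps onto one Bash command from the lab, which is why the comments name the matching command.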

3.3 Lab: Python Script Files Basics

Note: Objectives
  • Understand what a Python script file is.
  • Know the difference between .py script files and Jupyter notebooks (.ipynb).

Python is a high-level, general-purpose programming language used for data science, AI/ML, automation, and software development.

  • Two common Python file types you will use:
    • .py script files — plain-text Python source you run from the terminal (e.g., python hello.py) or import as modules.
    • .ipynb Jupyter notebooks — interactive notebooks that mix code, text, and visuals; executed cell by cell inside VS Code or Jupyter.

3.3.1 .ipynb Notebook vs .py Script Files

Figure 3.1 illustrates the difference between running Python code in an .ipynb notebook versus in a .py script file.

Figure 3.1: Comparison of Jupyter Notebook and Python Script formats

Understanding the Terminal

See the directory structure in Codespaces in Section 3.2.11.

In the terminal, every line you type starts with a prompt like this:

@tianyuan09 ➜ /workspaces/codespaces-jupyter (main) $
│           │ │                              │      │
│           │ │                              │      └─ Prompt symbol ($): shows the terminal is ready for input
│           │ │                              └─ Git branch: “main” is the branch currently checked out
│           │ └─ Current working directory: you’re inside the folder “codespaces-jupyter” located under “/workspaces”
│           └─ Arrow (➜): just a decorative separator in the prompt
└─ Username (and sometimes host): “tianyuan09” is the current user logged into this environment

Difference between .ipynb Notebook and .py Script File.

| Feature | .ipynb (Jupyter Notebook) | .py (Python Script) |
|---------|---------------------------|---------------------|
| Structure | Structured JSON format combining code, text cells in Markdown, and outputs. | Plain text file containing only Python code and comments (#). |
| Execution | Run one cell at a time, showing output immediately below each cell. | Executed all at once using a command like python my_script.py; see ④ in Figure 3.1. |
| Use Case | Ideal for data analysis, visualization, and teaching due to its interactive nature. | Better for automation and deployment of production-ready code. |
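
To make the “Structure” row concrete, the sketch below builds a tiny notebook-like JSON document by hand and pulls out only its code cells. The cell contents are invented for illustration, the function name extract_code is hypothetical, and the dictionary shows only a subset of the fields a real .ipynb file contains.

```python
import json

# A hand-written, minimal notebook-like document (illustrative only).
notebook_text = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# My analysis\n"]},
        {"cell_type": "code", "source": ["x = 1 + 1\n", "print(x)\n"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
})


def extract_code(nb_json: str) -> str:
    """Return the source of all code cells joined as plain Python text."""
    nb = json.loads(nb_json)
    chunks = ["".join(cell["source"]) for cell in nb["cells"]
              if cell["cell_type"] == "code"]
    return "\n".join(chunks)


script_text = extract_code(notebook_text)
print(script_text)
```

The extracted text is what a .py file would hold: code only, with the Markdown cell and the JSON wrapper left behind.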

3.3.2 Lab: Create and run a .py script file

First, let’s create a hello.py under the scripts folder.

# file path: scripts/hello.py
import datetime

print("Hello from Python!")
print("This script is running from the terminal.")

# Get the current date and time
now = datetime.datetime.now()
print(f"Current time: {now}")

You can create the file directly in the Explorer panel or with Bash commands, as shown in Figure 3.2.

Figure 3.2: Create and run your 1st script file

Expected output (your timestamp will differ):

Hello from Python!
This script is running from the terminal.
Current time: 2026-01-13 10:30:45.123456

3.3.3 Lab: Checking Your Python Version

# Check Python version
python --version
python3 --version

# Check which Python executable you're using
which python
which python3

3.3.4 Lab: Different Python Commands

Depending on your system setup, you might need to use different commands:

# On systems with Python 3 as default
python hello.py

# OR
python3 hello.py

# OR Using specific Python version
python3.9 hello.py
python3.12 hello.py
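
If you are unsure which interpreter a given command actually launches, a script can report it itself. The sketch below assumes a hypothetical file name such as scripts/which_python.py; run it once with each command (python, python3, and so on) and compare the lines it prints.

```python
import sys


def describe_interpreter() -> str:
    """Summarize which Python interpreter is executing this script."""
    major, minor, micro = sys.version_info[:3]
    return f"Python {major}.{minor}.{micro} at {sys.executable}"


print(describe_interpreter())
```

If two commands print the same line, they point to the same interpreter on your system.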

3.4 Running Scripts in Different Directories

3.4.1 Absolute Paths

# Run script from anywhere using absolute path
python /Users/username/projects/my_script.py

/Users/username/projects/my_script.py is an absolute file path.

Absolute paths:

  • On macOS/Linux, an absolute path starts with /.
  • On Windows, it starts with a drive letter like C:\.
  • In short, it always starts from the root of the filesystem (the top level).

You can copy either the relative or the absolute path of a file in Codespaces by right-clicking the file in the Explorer and choosing Copy Relative Path or Copy Path.

3.4.2 Relative Paths

# Run script in current directory
python ./script.py

# Run script in subdirectory
python scripts/data_analysis.py

# Run script in parent directory
python ../utilities/helper.py

The paths above (e.g. ../utilities/helper.py) are relative paths.

  • They don’t start with / or a drive letter (e.g. C:\).
  • They may include . (current folder) or .. (parent folder).
  • They start from your current working directory (pwd).
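
The short sketch below shows how each relative path above would be interpreted from a given working directory. It is purely textual: the helper name resolve is hypothetical, the working directory is assumed to be the Codespaces default, and the code neither checks that the files exist nor follows symbolic links.

```python
import posixpath


def resolve(cwd: str, relative: str) -> str:
    """Join a relative path onto a working directory and simplify . and .. parts."""
    return posixpath.normpath(posixpath.join(cwd, relative))


cwd = "/workspaces/codespaces-jupyter"  # assumed output of pwd
for rel in ["./script.py", "scripts/data_analysis.py", "../utilities/helper.py"]:
    print(f"{rel:28} -> {resolve(cwd, rel)}")
```

Changing cwd changes every result, which is exactly why a relative path only makes sense together with the current working directory.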

3.4.3 Difference in absolute vs relative paths

| Starts With | Type | Meaning |
|-------------|------|---------|
| / (Linux/Mac) | Absolute | Starts at root of file system |
| C:\ (Windows) | Absolute | Starts at root of drive |
| . or .. | Relative | Based on current working directory |
| No / or C:\ | Relative | Implied to start from current folder |
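
The rules in the table can be checked in code. The sketch below uses Python’s pathlib to classify a few path strings; the helper name path_type and the example Windows path are hypothetical and chosen only for illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath


def path_type(path: str, windows: bool = False) -> str:
    """Classify a path string as 'absolute' or 'relative'."""
    flavour = PureWindowsPath if windows else PurePosixPath
    return "absolute" if flavour(path).is_absolute() else "relative"


print(path_type("/workspaces/codespaces-jupyter"))                 # absolute
print(path_type("../utilities/helper.py"))                         # relative
print(path_type("scripts/data_analysis.py"))                       # relative
print(path_type(r"C:\Users\username\my_script.py", windows=True))  # absolute
```

The pure path classes only inspect the text of the path, so the check works the same on any operating system and does not require the files to exist.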