- 0. Why SQL?
- 1. Database Fundamentals
- 2. Core SQL Commands
- 3. Additional Commands
- 4. NULL Handling & Conditional Logic
- 5. Aggregation Concepts
- 6. Intermediate Query Structures
- 7. SQL JOINS
- 8. Date Time Feature Extraction
- Used to query large datasets efficiently (Excel breaks at scale)
- Used across teams: analytics, engineering, product, web
- Works with databases managed by a DBMS (Database Management System)
- Databases live on servers (persistent + accessible)

A DBMS is a system that:
- Stores data
- Organizes data
- Executes queries efficiently
- Manages permissions & performance

Relational databases:
- Structured as tables of rows and columns
- Tables linked via Primary Keys and Foreign Keys
- Example systems: PostgreSQL, MySQL, SQL Server

Non-relational (NoSQL) databases:
- Key-Value -- dictionary-like structure
- Column-based -- data grouped by columns
- Graph-based -- focuses on relationships between entities
- Document-based -- stores flexible JSON-like documents (e.g., MongoDB)

```sql
SELECT column1, column2
FROM table_name
WHERE condition
GROUP BY column
HAVING condition
ORDER BY column ASC;
```

- `SELECT` -- choose columns
- `FROM` -- specify table
- `WHERE` -- filter rows
- `GROUP BY` -- aggregate rows
- `HAVING` -- filter aggregated results
- `ORDER BY` -- sort results

Order of execution:
1. `FROM`
2. `WHERE`
3. `GROUP BY`
4. `HAVING`
5. `SELECT`
6. `DISTINCT`
7. `ORDER BY`
8. `TOP` / `LIMIT`
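
Note: Because `GROUP BY` is evaluated before `SELECT`, an alias created in the `SELECT` list (e.g. `SELECT SUM(col) AS col1`) is generally not recognized by `GROUP BY`. Some databases accept simple (non-aggregate) aliases there as an extension.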
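
To make the clause order concrete, here is a minimal sketch that runs one query through every clause. It assumes Python's built-in `sqlite3` module as a stand-in DBMS, and the `orders` table and its rows are hypothetical sample data. The comments mark the logical evaluation order, which differs from the order in which the clauses are written.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ana", 50.0, "paid"),
        ("ana", 70.0, "paid"),
        ("ben", 20.0, "paid"),
        ("ben", 500.0, "cancelled"),
        ("cy", 90.0, "paid"),
        ("cy", 40.0, "paid"),
    ],
)

query = """
SELECT customer, SUM(amount) AS total   -- evaluated 5th: build output columns
FROM orders                             -- evaluated 1st: pick the table
WHERE status = 'paid'                   -- evaluated 2nd: filter rows
GROUP BY customer                       -- evaluated 3rd: form groups
HAVING SUM(amount) > 100                -- evaluated 4th: filter groups
ORDER BY total DESC                     -- evaluated 6th: sort (alias visible here)
LIMIT 2                                 -- evaluated last: cap the row count
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The cancelled order is dropped by `WHERE` before any grouping happens, so it never reaches `HAVING`. The alias `total` can be used in `ORDER BY` because sorting runs after `SELECT`.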
- `SELECT DISTINCT`
- `TOP` / `LIMIT`
- `INSERT INTO`
- `VALUES`
- `CREATE TABLE`
- `ALTER TABLE`
- `DROP TABLE`
- `UPDATE`
- `DELETE`
- `TRUNCATE`
- Logical operators: `AND`, `OR`, `NOT`
- Pattern matching: `LIKE`

Note: When inserting into specific columns, ensure the `VALUES` match the listed columns in number and order.
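
The note above can be sketched as follows, assuming Python's built-in `sqlite3` module as the DBMS. The `users` table and its rows are hypothetical. The second `INSERT` deliberately supplies too few values to show the mismatch being rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# Two columns are listed, so each VALUES tuple supplies exactly two values,
# in the same order: (name, city). user_id is filled in automatically.
conn.execute(
    "INSERT INTO users (name, city) VALUES ('Ana', 'Lisbon'), ('Ben', 'Oslo')"
)

# A tuple with the wrong number of values is rejected by the database.
try:
    conn.execute("INSERT INTO users (name, city) VALUES ('Cy')")
    mismatch_error = None
except sqlite3.OperationalError as exc:
    mismatch_error = str(exc)

rows = conn.execute(
    "SELECT user_id, name, city FROM users ORDER BY user_id"
).fetchall()
print(rows)
print(mismatch_error)
```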

- `IFNULL(value, replacement)` -- replaces `NULL` (2 arguments only)
- `COALESCE(val1, val2, val3, ...)` -- returns first non-null value
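
Dividing two integers performs integer division in many databases (e.g. PostgreSQL, SQL Server, SQLite). Multiply one operand by 1.0 to force float division if needed.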
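
A minimal sketch of both points, assuming Python's built-in `sqlite3` module (SQLite supports `IFNULL` and `COALESCE` and truncates integer division). The `contacts` table is hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, mobile TEXT, landline TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [("ana", "111", None), ("ben", None, "222"), ("cy", None, None)],
)

rows = conn.execute(
    """
    SELECT name,
           IFNULL(mobile, 'n/a') AS mobile_or_na,              -- one fallback
           COALESCE(mobile, landline, 'no phone') AS best_phone -- first non-NULL
    FROM contacts
    ORDER BY name
    """
).fetchall()
print(rows)

# Integer / integer truncates; multiplying by 1.0 first forces float division.
int_div, float_div = conn.execute("SELECT 7 / 2, 7 * 1.0 / 2").fetchone()
print(int_div, float_div)
```

`COALESCE` is the more portable choice when more than one fallback is needed, since `IFNULL` takes exactly two arguments and is not available in every dialect.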

`CASE` is used for conditional logic inside queries.

```sql
CASE
    WHEN condition THEN result
    ELSE result
END
```

- Used to classify data
- Often used inside `SELECT`
- Can be used inside aggregates (conditional aggregation)
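
As an illustration of classifying data with `CASE`, the sketch below buckets rows by amount. It assumes Python's built-in `sqlite3` module; the `trades` table, its values, and the bucket thresholds are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (order_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?)",
    [(1, 5.0), (2, 50.0), (3, 500.0)],
)

rows = conn.execute(
    """
    SELECT order_id,
           CASE
               WHEN amount < 10  THEN 'small'
               WHEN amount < 100 THEN 'medium'  -- reached only if amount >= 10
               ELSE 'large'
           END AS size_bucket
    FROM trades
    ORDER BY order_id
    """
).fetchall()
print(rows)
```

The `WHEN` branches are checked top to bottom and the first match wins, so the second branch does not need to repeat the lower bound.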
Common aggregate functions:
- `COUNT()`
- `SUM()`
- `AVG()`
- `MIN()`
- `MAX()`

```sql
SUM(CASE WHEN condition THEN column ELSE 0 END)
```

Used to segment metrics in a single query.
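
The pattern above can be sketched end to end as follows, assuming Python's built-in `sqlite3` module. The `trades` table and its rows are hypothetical. One pass over the table yields a plain count plus two metrics restricted to different statuses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (user_id INTEGER, status TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?)",
    [
        (1, "Completed", 100),
        (1, "Cancelled", 40),
        (2, "Completed", 30),
        (2, "Completed", 20),
        (3, "Cancelled", 10),
    ],
)

rows = conn.execute(
    """
    SELECT user_id,
           COUNT(*) AS total_trades,
           -- sum amounts only for completed trades
           SUM(CASE WHEN status = 'Completed' THEN amount ELSE 0 END) AS completed_amount,
           -- count cancelled trades by summing 1s
           SUM(CASE WHEN status = 'Cancelled' THEN 1 ELSE 0 END) AS cancelled_trades
    FROM trades
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()
print(rows)
```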

A CTE (Common Table Expression, written with `WITH`) is used to create intermediate result sets.

```sql
WITH intermediate_table AS (
    SELECT ...
    FROM ...
)
SELECT *
FROM intermediate_table;
```

- Improves readability
- Breaks complex problems into steps
- Exists only for that query
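
A minimal sketch of these points, assuming Python's built-in `sqlite3` module. The `trades` table and the CTE name `user_totals` are hypothetical. The first step totals trades per user, and the second step keeps users above the average total. The final lookup shows that the CTE is gone once its query finishes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (user_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?)",
    [(1, 100), (1, 40), (2, 30), (3, 200), (3, 10)],
)

rows = conn.execute(
    """
    WITH user_totals AS (              -- step 1: one row per user
        SELECT user_id, SUM(amount) AS total
        FROM trades
        GROUP BY user_id
    )
    SELECT user_id, total              -- step 2: compare each user to the average
    FROM user_totals
    WHERE total > (SELECT AVG(total) FROM user_totals)
    ORDER BY user_id
    """
).fetchall()
print(rows)

# The CTE existed only inside the query above, so this lookup fails.
try:
    conn.execute("SELECT * FROM user_totals")
    cte_still_exists = True
except sqlite3.OperationalError:
    cte_still_exists = False
print(cte_still_exists)
```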
- INNER JOIN → Returns only rows where there is a match in both tables (intersection of two tables).
- LEFT JOIN (LEFT OUTER JOIN) → Returns all rows from the left table, and matching rows from the right table. Non-matches on the right become `NULL`.
- RIGHT JOIN (RIGHT OUTER JOIN) → Returns all rows from the right table, and matching rows from the left table. Non-matches on the left become `NULL`.
- FULL OUTER JOIN → Returns all rows from both tables. Non-matching rows from either side contain `NULL` values.
- Join behavior depends on where filtering conditions are placed (`ON` vs `WHERE`), especially with outer joins.
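
The `ON` vs `WHERE` point is easiest to see side by side. The sketch below assumes Python's built-in `sqlite3` module and hypothetical `users` and `trades` sample rows. Both queries use a `LEFT JOIN`, and the only difference is where the `status` filter sits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, city TEXT)")
conn.execute("CREATE TABLE trades (order_id INTEGER, user_id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "Lisbon"), (2, "Oslo"), (3, "Rome")],
)
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?)",
    [(10, 1, "Completed"), (11, 1, "Cancelled"), (12, 2, "Cancelled")],
)

# Filter inside ON: non-matching trades become NULL, every user is kept.
filter_in_on = conn.execute(
    """
    SELECT users.user_id, COUNT(trades.order_id) AS completed
    FROM users
    LEFT JOIN trades
      ON trades.user_id = users.user_id
     AND trades.status = 'Completed'
    GROUP BY users.user_id
    ORDER BY users.user_id
    """
).fetchall()

# Filter inside WHERE: rows with NULL status are discarded after the join,
# so the LEFT JOIN effectively behaves like an INNER JOIN.
filter_in_where = conn.execute(
    """
    SELECT users.user_id, COUNT(trades.order_id) AS completed
    FROM users
    LEFT JOIN trades
      ON trades.user_id = users.user_id
    WHERE trades.status = 'Completed'
    GROUP BY users.user_id
    ORDER BY users.user_id
    """
).fetchall()

print(filter_in_on)
print(filter_in_where)
```

With the filter in `ON`, users without a completed trade still appear with a count of 0. With the filter in `WHERE`, they disappear from the result entirely.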

```sql
-- Basic inner join (JOIN on its own means INNER JOIN)
SELECT *
FROM trades
JOIN users
  ON trades.user_id = users.user_id;

-- Top 3 cities by number of completed orders
SELECT users.city,
       COUNT(trades.order_id) AS total_orders
FROM trades
INNER JOIN users
  ON trades.user_id = users.user_id
 AND trades.status = 'Completed'
GROUP BY users.city
ORDER BY total_orders DESC
LIMIT 3;

-- Pages with no likes: keep left rows that found no match on the right
SELECT pages.page_id
FROM pages
LEFT JOIN page_likes
  ON pages.page_id = page_likes.page_id
WHERE page_likes.liked_date IS NULL
ORDER BY pages.page_id;
```

- `EXTRACT(YEAR FROM date_column)` returns a specific part of a date. You can replace `YEAR` with `DAY` or `MONTH` as well.
- `DATE_TRUNC` rounds a date down to a specific unit, e.g. `DATE_TRUNC('month', sent_date) AS truncated_to_month`.
- `INTERVAL` lets you shift dates and timestamps. For example, `sent_date + INTERVAL '2 days'` adds 2 days to the given timestamp.
- `TO_CHAR` reformats dates into specific formats. For instance, `TO_CHAR(sent_date, 'YYYY-MM-DD HH24:MI:SS') AS formatted_iso8601` (`HH24` gives the 24-hour clock; plain `HH` is 12-hour).
- `::DATE` or `TO_DATE()` converts strings into dates.
- `::TIMESTAMP` or `TO_TIMESTAMP()` converts strings into timestamps.
- You can also combine `MIN()` and `MAX()` on dates with the functions above to solve certain problems.
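
For example:

```sql
-- Days between each user's first and last post in 2021,
-- for users with at least two posts that year.
-- Assumes post_date is a TIMESTAMP, so MAX - MIN yields an interval.
SELECT user_id,
       EXTRACT(DAY FROM (MAX(post_date) - MIN(post_date))) AS days_between
FROM posts
WHERE EXTRACT(YEAR FROM post_date) = 2021
GROUP BY user_id
HAVING COUNT(user_id) >= 2;
```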