-
Notifications
You must be signed in to change notification settings - Fork 1
Using SQL within R
For SQL within R, we usually need two packages:
DBI: R Database Interface
- Specific package for the individual SQL implementation (MySQL, SQLite, Oracle, etc.).
- Package
RSQLitefor SQLite
DBI contains various virtual classes and functions in connecting and querying a database:
dbDriver Driver specifying the operations for creating connections to SQL Servers
For SQLite:
m = dbDriver("RSQLite") # equivalent to SQLite()For MySQL:
m = dbDriver("MySQL") # equivalent to MySQL()dbConnect Connect to a DBMS.
- For SQLite:
conn = dbConnect(m, "path_to_file")- For MySQL
conn = dbConnect(m, user="user_name", db="database_name", host="SQL_server", password="password")dbDisconnect Disconnect from a DBMS. You should always disconnect after you no longer need it
dbDisconnect(conn)dbSendQuery Submits and executes SQL statement (information retrieved using fetch)
query = dbSendQuery(conn, "SELECT statement;")fetch Get records from a dbSendQuery
res = fetch(query)dbClearResult Free all resources (local and remote) from a query.
dbClearResult(query)dbGetQuery Submits, executes SQL statement and retrieves information
res = dbGetQuery(conn, "SELECT statement;")dbListTables List tables in database connection
tables = dbListTables(conn)dbGetInfo Get meta-data for DBIObjects
meta.data = dbGetInfo(query)dbReadTable Fetch the data from a table.
res = dbReadTable(conn, "table_name")dbListFields Return the column names for a given table
columns = dbListFields(conn, "table_name")library(RSQLite)## Loading required package: DBIm = dbDriver("SQLite")
con = dbConnect(m, "/Users/darrellaucoin/GitHub/IntroSQL/stats_club.db")
query = dbSendQuery(con,
"SELECT *
FROM exec;")
result = fetch(query)
result## stud_num name questid position
## 1 20434557 Darrell Aucoin darrell President
## 2 20334532 Judith B. Gibbons judith Events
## 3 20345524 Lamar Roy lamar Finance
## 4 20565526 Gilberto Cross gilberto Events
## 5 20344527 Melba Lane melba President
## 6 20554329 Ruben Lamb ruben Technology
## 7 20893434 Hannah Mason hannah SeniorAdvisor
## 8 20343532 Patrick Robertson patrick Events
## 9 20334321 Dominick Byrd dominick Events
## email phone key_fob
## 1 darrell.aucoin@gmail.com 519-555-1424 Y
## 2 judith@uwaterloo.ca 519-555-2343 N
## 3 lamar@uwaterloo.ca 519-555-3432 Y
## 4 gilberto@uwaterloo.ca 519-555-3453 Y
## 5 melba@uwaterloo.ca 519-555-2322 Y
## 6 ruben@uwaterloo.ca 519-555-5232 Y
## 7 hannah@uwaterloo.ca 519-555-2342 N
## 8 patrick@uwaterloo.ca 519-555-2312 N
## 9 dominick@uwaterloo.ca 519-555-2325 NdbClearResult(query)## [1] TRUEdbDisconnect(con)## [1] TRUElibrary(RSQLite)
m = dbDriver("SQLite")
con = dbConnect(m, "/Users/darrellaucoin/GitHub/IntroSQL/stats_club.db")
result = dbGetQuery(con,
"SELECT name, start_time, location
FROM event;")
result## name start_time location
## 1 BOT 2015-01-28 19:00:00 C & D
## 2 EOT <NA> <NA>
## 3 Intro to Hadoop <NA> <NA>
## 4 Intro to SQL 2015-02-05 18:00:00 MC-3003
## 5 Prof Talk <NA> <NA>
## 6 Prof Talk 2 <NA> <NA>
## 7 Prof Talk 3 <NA> <NA>
## 8 Intro to SQL: Basic Queries 2015-03-09 18:00:00 MC-3003
## 9 Intro to SQL: Advanced Queries 2015-03-12 18:00:00 MC-3003dbDisconnect(con)## [1] TRUEI transferred the data in stats_club.db to MySQL on my laptop.
library("RMySQL") ##
## Attaching package: 'RMySQL'
##
## The following objects are masked from 'package:RSQLite':
##
## dbBuildTableDefinition, isIdCurrentm = dbDriver("MySQL")
conn = dbConnect(m, user="darrell", db="stats_club",
host="localhost",
password="pass")
query = dbSendQuery(conn,
"SELECT name, start_time, location
FROM event")
result = fetch(query)
dbClearResult(query)## [1] TRUEprint(result)## name start_time location
## 1 BOT 2015-01-28 19:00:00 C & D
## 2 EOT <NA> <NA>
## 3 Intro to Hadoop <NA> <NA>
## 4 Intro to SQL 2015-02-05 18:00:00 MC-3003
## 5 Intro to SQL 2 <NA> <NA>
## 6 Prof Talk <NA> <NA>
## 7 Prof Talk 2 <NA> <NA>
## 8 Prof Talk 3 <NA> <NA>dbDisconnect(conn)## [1] TRUECommon SQL Commands
Using SQL within R
Quiz Solutions
Motivation
Different Implementations of SQL
##Theory
- Relational Databases
- The E-R Model
- E-R Relationships
- Normalization
- 1st Normalization
- 2nd Normalization
- 3rd Normalization
- Primary and Foreign Keys
- Relational Algebra Operations
- Constraints
##How to Query in SQL
- Data Types
- Projection (SELECT Clause)
- SQL Functions
- DISTINCT Prefix
- Aliases
- Selection (WHERE Clause)
- Predicate Operators
- LIKE Predicate
- ALL, ANY / SOME Operator Modifiers
- AND, OR Operators
- GROUP BY Clause (Aggregation)
- HAVING Clause (Filtering using Aggregation)
- Aggregate Functions
- GROUP BY with ROLLUP/CUBE
- Joining Tables
- CROSS JOIN
-
INNER JOIN
1. ON Clause - OUTER JOIN
- NATURAL JOIN
- Subqueries
- Non-Correlated Subqueries
- Correlated Subqueries
- Set Operations (Vertically Joining Tables)
- UNION
- INTERSECT
- Difference (EXCEPT)
- WITH Clause
- CASE Expressions
- ORDER BY Clause
- LIMIT Clause