CS 346 (W23)
Toggle Dark/Light/Auto mode Toggle Dark/Light/Auto mode Toggle Dark/Light/Auto mode Back to homepage

Databases

A more scalable solution is to store data in a database: a formal system for organizing and managing data.

Databases range in size and complexity from simple file-based databases that run on your local computer, to large scalable databases that run on a server cluster. They are optimized for fetching text and numeric data, and performing set operations on this data.

They also have the advantage of scaling really well. They’re optimized not just for efficient storage and retrieval of large amounts of data, but also for concurrent access by hundreds or thousands of users simulaneously [ed. This is a huge topic and we’re not even scratching the surface. I’d strongly encourage you to take a more comprehensive database course e.g. CS 348.]

The same data can be stored in a relational database

A relational database is a particular database design that structures data as tables:

  • A table represents some logical entity e.g. Customer, Transactions. It consists of columns and rows i.e. like a grid.
  • A column is a field or data element in that table. e.g. a “Customer“ table might have “name“, “city“, “birthdate“ fields.
  • A row is a record, containing values for each field. e.g. ”Jeff Avery”, ”Waterloo”, ”June 23, 1985”. Each row in each table needs to be unique identified by a key. This can be a part of the data (e.g. timestamp) but is often a unique, generated numeric identifier (e.g. “customerid“, “transactionid“).

There are many other types of databases! We’ll use relational databases in this course b/c (a) they’re very common, and (b) they are suitable for any type of data/relations that you will encounter in this course.

Here’s our earlier example shown as a class, a set of records in a CSV file and as a set of relational database tables.

The same data in multiple representations

Why is this approach valuable? Relational databases can quickly oprerate on these tables, and support operations on sets of records. For example:

  • Return a list of all purchases greater than $100.
  • Return a list of customers from ”Paris”.
  • Delete customers from ”Ottawa”.

Our example is pretty trivial, but imagine useful queries like:

  • ”Find all transactions between 2:00 and 2:30”, or
  • ”Find our which salesperson sold the greatest amount during last Saturday’s sale”.

What is SQL?

SQL (pronounced ”Ess-que-ell”) is a Domain-Specific Language (DSL) for describing your queries. Using SQL, you write statements describing the operation to perform and which tables to use, and the database performs the operations for you.

SQL is a standard, so SQL commands work across different databases [ed. SQL was adopted as a standard by ANSI in 1986 as SQL-86, and by ISO in 1987].

Using SQL, you can:

  • C reate new records
  • R etrieve sets of existing records
  • U pdate the fields in one or more records
  • D elete one or more records1

SQL has a particular syntax for managing sets of records:

<operation> FROM [table] [WHERE [condition]]
operations: SELECT, UPDATE, INSERT, DELETE, ...
conditions: WHERE [col] <operator> <value>

For example:

/* SELECT returns data from a table, or a set of tables. 
 * an asterix (*) means "all"
 */

SELECT * FROM customers 
SELECT * FROM Customers WHERE city = "Ottawa" 
SELECT name FROM Customers WHERE custid = 1001
/* UPDATE modifies one or more column values based on some criteria.*/

UPDATE Customer SET city = "Kitchener" WHERE cust_id = 1001 
UPDATE Customer SET city = "Kitchener" // uh oh, what is wrong with this?
/* INSERT adds new records to your database.  */

INSERT INTO Customer(cust_id, name, city) 
VALUES ("1005", "Sandra Avery", "Kitchener") 

INSERT INTO Customer(cust_id, name, city)
VALUES ("1005", "Austin Avery", "Kitchener") // uh oh, what have I done?

It’s common to have a record spread across multiple tables. A join describes how to relate data across tables.

Database Joins

SELECT c.customer_id, c.first_name +   + c.last_name, t.date, p.name, p.cost 
	FROM Customer c, Transactions t, Products p
 	WHERE c.customer_id = t.customer_id
 	AND t.product_id = p.product_id

Returns data that spans multiple tables:

$ 1001, Jeff Avery, 12-Aug-2020, T-shirt, 29.95

Different types of joins that describe table and key relations

https://imgur.com/t/logic/v23nUwQ

SQLite

Relational databases are often large and complex systems. However, the relational approach can also be scaled-down.

SQLite (pronounced ESS-QUE-ELL-ITE) is a small-scale relational DBMS that supports SQL. It is small enough for local, standalone use.

SQLite is a C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine. SQLite is the most used database engine in the world. SQLite is built into all mobile phones and most computers…"

https://www.sqlite.org/index.html

You can install the SQLite database under Mac, Windows or Linux.

  1. Visit the SQLite Download Page.
  2. Dowload the binary for your platform.
  3. To test it, launch it from a shell.
$ sqlite3

SQLite version 3.28.0 2019-04-15 14:49:49
 Enter ".help" for usage hints.
 Connected to a transient in-memory database.
 Use ".open FILENAME" to reopen on a persistent database. 

sqlite> .exit

We will create a database from the command line. Optionally, you can install SQLite Studio, a GUI for managing databases.

You can download the SQLite Sample Database and confirm that SQLite is working. We can open and read from this database using the command-line tools:

$ sqlite3 
SQLite version 3.32.3 2020-06-18 14:16:19
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.

sqlite> .help
.auth ON|OFF             Show authorizer callbacks
.backup ?DB? FILE        Backup DB (default "main") to FILE
.bail on|off             Stop after hitting an error.  Default OFF
.binary on|off           Turn binary output on or off.  Default OFF
.cd DIRECTORY            Change the working directory to DIRECTORY
.changes on|off          Show number of rows changed by SQL
.check GLOB              Fail if output since .testcase does not match
.clone NEWDB             Clone data into NEWDB from the existing database
.databases               List names and files of attached databases
.dbconfig ?op? ?val?     List or change sqlite3_db_config() options
.dbinfo ?DB?             Show status information about the database

Commands

These commands are meta-commands that act on the database itself, and not the data that it contains (i.e. these are different from SQL and specific to SQLite). Some particularly useful commands:

Command Purpose
.open filename Open database filename.
.database Show all connected databases.
.log filename Write console to log filename.
.read filename Read input from filename.
.tables Show a list of tables in the open database.
.schema tablename SQL to create a particular tablename.
.fullschema SQL to create the entire database structure.
.quit Quit and close connections.

We often use these meta-commands to change settings, and find information that we’ll need to structure queries e.g. table names).

$ sqlite3
 SQLite version 3.28.0 2019-04-15 14:49:49
 Enter ".help" for usage hints.
 Connected to a transient in-memory database.
 Use ".open FILENAME" to reopen on a persistent database. 

sqlite> .open chinook.db // name of the file
sqlite> .mode column // lines up data in columns 
sqlite> .headers on // shows column names at the top 

// determine which tables to query
sqlite> .tables
albums   employees    invoices     playlists
artists   genres     media_types   tracks
customers  invoice_items  playlist_track

Queries

Once we’ve identified what we want to do, we can just execute our queries. Examples of selecting from a single table at a time:

  sqlite> SELECT * FROM albums WHERE albumid < 4;

  AlbumId     Title                     ArtistId
  ----------  ------------------------- ----------
  1           For Those About To Rock   1
  2           Restless and Wild         2
  3           Let There Be Rock         1     

  sqlite> SELECT * FROM artists WHERE ArtistId = 1;

  ArtistId    Name
  ----------  ----------
  1           AC/DC

We can also JOIN across two tables (based on a primary key, “ArtistId“’). You often will have multiple WHERE clauses to join between multiple tables.

  sqlite> SELECT albums.AlbumId, artists.Name, albums.Title
   FROM albums, artists
   WHERE albums.ArtistId = artists.ArtistId
   AND albums.AlbumId < 4;

  AlbumId     Name        Title
  ----------  ----------  ------------------------
  1           AC/DC       For Those About To Rock
  2           Accept      Restless and Wild
  3           AC/DC       Let There Be Rock

Accessing databases

How do we use a database from our applications? Kotlin doesn’t have built-in database support, but the framework includes a library for accessing external databases.

Specifically, Kotlin leverages the Java JDBC API (”Java DataBase Connectivity”). This provides a standard mechanism for connecting to databases, issuing queries, and managing results.

Creating a database project in IntelliJ:

  1. Create a Gradle/Kotlin project.
  2. Modify the build.gradle to include a dependency on JDBC.

implementation 'org.xerial:sqlite-jdbc:3.30.1'

  1. Use the Java SQL package classes to connect and fetch data.
import java.sql.Connection
import java.sql.DriverManager
import java.sql.SQLException

….

This example uses a sample database from the official SQLite tutorial.

var conn: Connection? = null 
fun connect() { 
  try {
    val url = "jdbc:sqlite:chinook.db" 
    conn = DriverManager.getConnection(url) 
    println("Connection established.") 
  } catch (e: SQLException) { 
    println(e.message) 
  } 
} 

fun query() { 
  try { 
    if (conn != null) { 
    // do something

JetBrains/Exposed

JDBC is a useful mechansism for connecting to remote databases, but making raw SQL calls is error-prone: there’s no type checking, or other safety mechanisms in-place. It also requires us to explicitly convert between string data and class objects that are holding our data.

There are a number of libraries that can leverage JDBC and provide a cleaner set of abstractions. We’ll discuss one of them: Exposed, which is a JetBrains Kotlin library for working with databases.

Dependencies

repositories {
  mavenCentral()
}
dependencies {
  implementation("org.jetbrains.exposed", "exposed-core", "0.40.1")
  implementation("org.jetbrains.exposed", "exposed-dao", "0.40.1")
  implementation("org.jetbrains.exposed", "exposed-jdbc", "0.40.1")
}

Getting Started

Every database access using Exposed is started by obtaining a connection and creating a transaction. We’ll use SQLite as an example. We can connect using the same database connection string we used in the previous section:

To get a connection:

Database.connect("jdbc:sqlite:chinook.db")

After obtaining a connection all SQL statements should be placed inside a transaction2:

transaction {
  // statements go here
}

Exposed comes in two flavors: DSL (Domain Specific Language) and DAO (Data Access Objects). DSL means a type-safe syntax that is similar to SQL; DAO means doing CRUD operations on entities like tables. They are both viable, but represent different types of abstraction for our database.

DSL Example

Here’s an example of a DSL query from the Exposed Wiki:

fun main(args: Array<String>) {
  Database.connect("jdbc:sqlite:cities.db")

  transaction {
    // print sql to std-out
    addLogger(StdOutSqlLogger)

    // create a table that reflects the Cities class structure
    SchemaUtils.create (Cities)

    // insert new city. SQL: INSERT INTO Cities (name) VALUES ('St. Petersburg')
    val stPeteId = Cities.insert {
      it[name] = "St. Petersburg"
    } get Cities.id

    // 'select *' SQL: SELECT Cities.id, Cities.name FROM Cities
    println("Cities: ${Cities.selectAll()}")
  }
}

object Cities: IntIdTable() {
    val name = varchar("name", 50)
}

DAO Example

Here’s the same example using Exposed DAO. This is similar to DSL, but the objects are more explicitly tied to the database structure.

fun main(args: Array<String>) {
  Database.connect("sqlite:cities.db")

  transaction {
    // print sql to std-out
    addLogger(StdOutSqlLogger)

    SchemaUtils.create (Cities)

    // insert new city. SQL: INSERT INTO Cities (name) VALUES ('St. Petersburg')
    val stPete = City.new {
            name = "St. Petersburg"
    }

    // 'select *' SQL: SELECT Cities.id, Cities.name FROM Cities
    println("Cities: ${City.all()}")
  }
}

object Cities: IntIdTable() {
    val name = varchar("name", 50)
}

class City(id: EntityID<Int>) : IntEntity(id) {
    companion object : IntEntityClass<City>(Cities)
    var name by Cities.name
}

For details on how to use the Exposed DAO API, see the DAO Example page.


  1. These are considered the set of basic functions for working with persistant storage. Yes, we use the acronym CRUD. You’ll occasionally trip over the term “CRUD applications” which just means an application whose primary purpose is to interact with an underlying database. An example would be something like a simple app for tracking a list of customers or sales data. ↩︎

  2. A transaction represent a scope of operations that should be done together, in the order presented. If any one of the operations in the transaction fails, none of the operations will be applied. ↩︎