Notes taken during Lecture

Below you find the notes taken in class. I am also using hand written notes that you can find here: https://www.evernote.com/l/AIOM1BQDeRBPzJzmVaJsngnE4-8CYA48p7g

<2022-01-12 Wed>

design a person and address schema

SQL

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37


-- option 1
CREATE TABLE person (
  fname,
  lname,
  adress1,
  adress2,
  city,
  state,
  zip
  );

-- option 2
CREATE TABLE person (
  name,
  ssn,
  PRIMARY KEY(ssn),
  );

CREATE TABLE address (
  city,
  zip,
  street,
  PRIMARY KEY(city,zip,street),
);

CREATE TABLE livesat (
  ssn,
  city,
  zip,
  street
  );
-- option 3
CREATE TABLE person (
  fname,
  lname,
  zip
  );

Person

Name	SSN	zip	city
Peter	111	60616	Chicago
Peter	111	60657	Chicago
Alice	222	60615	Chicago
Bob	333	60657	Chicago

JSON

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66


[
{
    "Person":
    {
        "name": "Peter",
        "SSN": 111
        "Zip": 60614
    }
},
{
    "Person":
    {
        "name": "Peter",
        "SSN": 111
        "Zip": 60657
    }
}
]

// option 2
{
    "Name": "Peter",
	"SSN" : "111"
}

// option 3
[
    {
    "name" : "Peter" ,
    "SSN" : "111",
    "ZIP" : ["60616","60657"],
    "City" : "Chicago"
}

    {
    "name" : "Peter" ,
    "SSN" : "111",
    "ZIP" : ["60616","60657", "60645"],
    "City" : ["Chicago", "Chicago", "New York"]
    }


]

// option 4
{
    "name": "Peter",
    "ssn": "111",
    "addresses": [
        {"zip": "60616", "city": "Chicago"},
        {"zip": "60657", "city": "Chicago"}
    ]
}

// option 4
{
    "persons": [
        {
        "name": "Peter",
        "ssn": "111"
    }],
    "addresses": [
        {"zip": "60616", "city": "Chicago", "ssn": "111"},
        {"zip": "60657", "city": "Chicago", "ssn": "111"}
    ]
}

FO logic

signature

animal(X) and plan(X) and bird(X) and canfly(X) and canswim(X)

model

$\mathbb{D} = \{ penguin, swallow, cow, human, tree \}$
$bird =\{ (swallow), (penguin) \}$
$animal = \{ (cow), (swallow), (penguin), (human) \}$
$plant = \{ (tree) \}$
$canfly = \{ (swallow) \}$
$canswin = \{ (penguin) \}$
$equals = \{ (swallow, swallow), (penguin, penguin), \ldots$

formulas

Can Swallows fly?

$$canfly(swallow)$$ $$\forall s: swallow(s) \rightarrow canfly(s)$$ $$\forall x: equals(x,swallow) \rightarrow canfly(x)$$

All birds can fly

$$\forall x: canfly(x)$$ - this does not work, it requires that everything can fly not just birds

$$\forall x: bird(x) \rightarrow canfly(x)$$

No plant can fly

$$\forall x: plant(x) \rightarrow \neg canfly(x)$$

<2022-01-19 Wed>

<2022-01-26 Wed>

1
2
3


SELECT A
FROM R,S
WHERE A = B;

1
2


SELECT A
FROM R JOIN S ON (A=B);

a -> b -> c

a - b - c

1
2


Q(X) :- R(X,Y).
Q'(A) :- R(A,SOMETHING).

Variable mapping Q to Q': $\Psi_1$ = { X -> SOMETHING, Y -> A }

mapping head to head: { X -> A, Y -> A }
body contained in body: $\Psi_1(R(X,Y))$ = R(A,A)
mapping head to head: { X -> A, Y -> SOMETHING }
body cantained in body: $\Psi_2(R(X,Y))$ = R(A,SOMETHING) Q1() :- R(a,b), R(c,b). Q2() :- R(x,y).
Containment mapping: x-> a, y -> b
- database D { R(1,1) }
a -> 1, b -> 1, c -> 1 x -> 1, y -> 1

Q1() :- R(1,1), R(1,1).

Q1() :- R(a,b), R(c,b).

Q2() :- R(x,y).

<2022-01-31 Mon>

<2022-02-02 Wed>

Does DB fulfill constraint

R: A -> B

1
2


Q() :- R(X,Y), R(Z,A), X=Z, Y != A.
QFULL() :- not Q().

this does not work:

1

Q() :- R(X,Y), R(Z,A), X=Z, Y=A.

extend the example

1

Q(X,Y,Z,A,B,C) :- R(X,Y,Z), R(A,B,C), X=A, Y<B.

<2022-02-07 Mon>

<2022-02-09 Wed>

<2022-02-14 Mon>

1
2
3
4
5
6
7


SELECT p.Name AS Name,
       a.City AS Address,
       a.Office-contact AS Office-phone
--         ? AS Office-address,
--         ? AS Home-phone
FROM Person p, Address a
WHERE p.Adress = a.Id

1
2
3
4
5
6


CREATE VIEW personTarget AS (
SELECT p.Name AS Name,
       a.City AS Address,
       a.Office-contact AS Office-phone
FROM Person p, Address a
WHERE p.Address = a.Id);

1

SELECT Name FROM personTarget;

1
2
3
4
5


SELECT Name FROM (SELECT p.Name AS Name,
       a.City AS Address,
       a.Office-contact AS Office-phone
FROM Person p, Address a
WHERE p.Address = a.Id) AS persontTarget;

<2022-02-16 Wed>

<2022-02-23 Wed>

<2022-02-28 Mon>

1
2
3
4
5
6


WITH numst AS (SELECT count(*) as nums, dept FROM student GROUP BY dept),
     topd AS (SELECT nums, dept
                FROM numst
               ORDER BY nums DESC
               LIMIT 1)
SELECT * FROM topd, topd;

1
2
3
4
5
6
7
8
9


WITH numst AS (SELECT count(*) as nums, dept FROM student GROUP BY dept),
SELECT * FROM (SELECT nums, dept
                FROM numst
               ORDER BY nums DESC
               LIMIT 1),
              (SELECT nums, dept
                FROM numst
               ORDER BY nums DESC
               LIMIT 1);

<2022-03-02 Wed>

<2022-03-21 Mon>

Question 1.1

1
2


Q(Lastname,GPA) :- student(_,_,Lastname,Major,GPA), Major = cs.
Q(Lastname,GPA) :- student(_,_,Lastname,cs,GPA).

Question 1.2

1
2


Q(F,L) :- student(S,F,L,_,_), interest(S,hacking).
Q(F,L) :- student(S,F,L,_,_), interest(S,surfing).

Question 1.3

1
2


QnotHack(S) :- interest(S,A), A <> hacking.
Q(S) :- interest(S,hacking), not QnotHack(S).

Question 1.4

1
2


Qso(S,L,O) :- student(S,_,L,_,_), interest(S,A), event(_,O,A,_).
Q(S1,L1,S2,L2) :- Qso(S1,L1,O), Qso(S2,L2,O), S1 < S2.

Question 1.5

student 1 interest is $I_1$ and student 2 interest is $I_2$, we want $I_1 = I_2$,
- which is $(I_1 - I_2) \cup (I_2 - I_1) = \emptyset$.
- $\forall e: e \in I_1 \Leftrightarrow e \in I_2$

set in DL

table $R(A,B)$ and table $S(c,d)$ and we want $R - S$

1

Q(X,Y) :- R(X,Y), not S(X,Y).

set difference approach

1
2
3
4
5


Qsid(S) :- student(S,_,_,_,_).
QDiff(S1,S2) :- interest(S1,A1), Qsid(S2), not interest(S2,A1).
Qsymdiff(S1,S2) :- QDiff(S1,S2).
Qsymdiff(S1,S2) :- QDiff(S2,S1).
Q(S1,S2) :- Qsid(S1), Qsid(S2), not Qsymdiff(S1,S2), S1 < S2.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10


WITH sids AS
    (SELECT sid FROM student),
  qdiff (SELECT i.sid AS sid1, s.sid AS sid2
         FROM interest i, sids s
         WHERE NOT EXISTS (SELECT * FROM interest i2
                            WHERE i2.activity = i.activity),
  qsymdiff AS (SELECT * FROM qdiff UNION SELECT sid2, sid1 FROM qdiff)
SELECT s1.sid AS sid1, s2.sid AS sid2
FROM sids s1, sids s2
WHERE NOT EXISTS (SELECT * FROM qsymdiff q WHERE s1.sid = q.sid1 AND s2.sid = q.sid2) AND s1.sid < s2.sid;

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12


WITH sids AS
    (SELECT sid FROM student),
  qdiff (SELECT i.sid AS sid1, s.sid AS sid2
         FROM interest i, sids s
         WHERE NOT EXISTS (SELECT * FROM interest i2
                            WHERE i2.activity = i.activity)),
  qsymdiff AS (SELECT * FROM qdiff UNION SELECT sid2, sid1 FROM qdiff)
(SELECT s1.sid AS sid1, s2.sid AS sid2
FROM sids s1, sids s2
WHERE s1.sid < s2.sid)
EXCEPT
(SELECT * FROM qsymdiff q));

1
2
3
4


Q1(I1,I2,A) :- interest(I1,A), interest(I2,A).
Qextra(I1,I2) :- Q1(I1,I2,A), Q1(I1,I2,A2), interest(I1,A3), A <> A3, A <> A2.
Qextra(I1,I2) :- Q1(I1,I2,A), Q1(I1,I2,A2), interest(I2,A3), A2 <> A3, A <> A2.
Q(I1,I2) :- Q1(I1,I2,A), not Qextra(I1,I2).

students	activity
1	c1
1	c2
1	c3
2	c1
2	c2
2	c3

universal quantification

$\forall x: \phi(x) \Leftrightarrow \neg \exists x \neg \phi(x)$

<2022-03-28 Mon>

<2022-04-06 Wed>

time: (TID, year, month)
location: (LID, state, city)
fact: (TID, LID, num_call)

breakdown per time

total calls
calls per year
calls per month

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12


-- total
SELECT sum(num_calls) AS total_calls FROM fact
-- total per year
SELECT sum(num_calls) AS total_calls, year
FROM time t, fact f
WHERE t.TID = f.TID
GROUP BY year
-- total per month
SELECT sum(num_calls) AS total_calls, year, month
FROM time t, fact f
WHERE t.TID = f.TID
GROUP BY year, month

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18


WITH total_calls AS(
SELECT sum(num_calls) AS total_calls, NULL AS year, NULL AS month, 0 AS month_present, 0 AS year_present
  FROM fact),
year_calls AS (
SELECT sum(num_calls) AS total_calls, year, NULL AS month, 0 AS month_present, 1 AS year_present
FROM time t, fact f
WHERE t.TID = f.TID
GROUP BY year),
month_calls AS (
SELECT sum(num_calls) AS total_calls, year, month, 1 AS month_present, 1 AS year_present
FROM time t, fact f
WHERE t.TID = f.TID
GROUP BY year, month)
SELECT * FROM total_calls
UNION ALL
SELECT * FROM year_calls
UNION ALL
SELECT * FROM month_calls

reuse computation

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21


WITH
month_calls AS (
SELECT sum(num_calls) AS total_calls, year, month, 1 AS month_present, 1 AS year_present
FROM time t, fact f
WHERE t.TID = f.TID
GROUP BY year, month),

year_calls AS (
SELECT sum(total_calls) AS total_calls | year | NULL AS month | 0 AS month_present | 1 AS year_present |
FROM month_calls
GROUP BY year),

total_calls AS(
SELECT sum(total_calls) AS total_calls, NULL AS year, NULL AS month, 0 AS month_present, 0 AS year_present
  FROM year_calls)

SELECT * FROM total_calls
UNION ALL
SELECT * FROM year_calls
UNION ALL
SELECT * FROM month_calls

using special grouping constructs

1
2
3
4


SELECT sum(num_calls) AS total_calls, year, month, GROUPING(year) AS grp_year, GROUPING(month) AS grp_month
FROM time t, fact f
WHERE t.TID = f.TID
GROUP BY ROLLUP(year, month)

cumulative sums

for 2016, give me per month the cummulative calls per month

1
2
3
4
5
6


SELECT month, sum(total_calls) OVER (ORDER BY month) as cumul_calls
FROM
  (SELECT sum(num_calls) AS total_calls, month
     FROM fact f, time t
    WHERE f.TID = t.tid AND t.year = 2016
    GROUP BY month) monthly

write the query without window functions

1
2
3
4
5
6
7
8
9


WITH monthly AS (SELECT ROWNUM AS rid, sum(num_calls) AS total_calls, month
     FROM fact f, time t
    WHERE f.TID = t.tid AND t.year = 2016
    GROUP BY month
    ORDER BY month)
SELECT f.month, sum(s.total_calls)
FROM monthly f, monthly s
WHERE f.rid >= s.rid
GROUP BY f.month

for every year, give me per month the cummulative calls within that year

1
2
3
4
5


SELECT year, month, sum(total_calls) OVER (PARTITION BY year ORDER BY month) AS cumul_calls
FROM
  (SELECT sum(num_calls) AS total_calls, year, month
     FROM fact f, time t
    GROUP BY year, month) monthly

name of the student with the highest GPA: student(name,GPA,major)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23


SELECT name
FROM student
ORDER BY gpa DESC
LIMIT 1; -- if multiple students have the highest gpa, the only one is retured

WITH maxgpa AS (SELECT gap AS mgpa FROM student ORDER BY gpa DESC LIMIT 1),
SELECT name
FROM student, maxgpa
WHERE GPA = mgpa;

WITH maxgpa AS (SELECT max(GPA) AS mgpa FROM student),
SELECT name
FROM student, maxgpa
WHERE GPA = mgpa;

SELECT name FROM (
(SELECT dense_rank() OVER (ORDER BY gpa DESC) AS rank, name FROM student)
WHERE rank = 1;

SELECT name
FROM (SELECT name, max(gpa) OVER () AS mgpa, gpa
      FROM student)
WHERE mgpa = gpa

<2022-04-13 Wed>

<2022-04-18 Mon>

failure during 1 hour: p = 0.001
probability that at least one of our 1000 machines fails during 1 hour:
- 1 - probability that none of them fails: (1-p)^1000 =~ 0.36
- 1 - p = 0.999
in general the probability to get exactly $m$ failures out of $n$ nodes: $$\binom{n}{m} \cdot p^m \cdot (1-p)^{n-m}$$
to get at least $m$ failures out of $n$ nodes $$\sum_{i=m}^{n} \binom{n}{i} \cdot p^i \cdot (1-p)^{n-i}$$

Last updated on 29 Dec 2021
Published on 29 Dec 2021