Notes taken during Lecture
Below you find the notes taken in class. I am also using hand written notes that you can find here: https://www.evernote.com/l/AIOM1BQDeRBPzJzmVaJsngnE4-8CYA48p7g
- https://jamboard.google.com/d/1x6G5NTMQuP4TjquAypV7lOPQ3yZWuil1u826cjq__LE/edit?usp=sharing
- https://discord.gg/hsdj4Xt8CA
design a person and address schema
SQL
|
|
Person
Name | SSN | zip | city |
---|---|---|---|
Peter | 111 | 60616 | Chicago |
Peter | 111 | 60657 | Chicago |
Alice | 222 | 60615 | Chicago |
Bob | 333 | 60657 | Chicago |
JSON
|
|
FO logic
signature
animal(X)
andplan(X)
andbird(X)
andcanfly(X)
andcanswim(X)
model
- $\mathbb{D} = \{ penguin, swallow, cow, human, tree \}$
- $bird =\{ (swallow), (penguin) \}$
- $animal = \{ (cow), (swallow), (penguin), (human) \}$
- $plant = \{ (tree) \}$
- $canfly = \{ (swallow) \}$
- $canswin = \{ (penguin) \}$
- $equals = \{ (swallow, swallow), (penguin, penguin), \ldots$
formulas
Can Swallows fly?
$$canfly(swallow)$$ $$\forall s: swallow(s) \rightarrow canfly(s)$$ $$\forall x: equals(x,swallow) \rightarrow canfly(x)$$
All birds can fly
$$\forall x: canfly(x)$$ - this does not work, it requires that everything can fly not just birds
$$\forall x: bird(x) \rightarrow canfly(x)$$
No plant can fly
$$\forall x: plant(x) \rightarrow \neg canfly(x)$$
|
|
|
|
a -> b -> c
a - b - c
|
|
Variable mapping Q to Q': $\Psi_1$ = { X -> SOMETHING, Y -> A }
- mapping head to head:
{ X -> A, Y -> A }
- body contained in body: $\Psi_1(R(X,Y))$ =
R(A,A)
- mapping head to head:
{ X -> A, Y -> SOMETHING }
- body cantained in body: $\Psi_2(R(X,Y))$ =
R(A,SOMETHING)
Q1() :- R(a,b), R(c,b).
Q2() :- R(x,y).
-
Containment mapping:
x-> a, y -> b
- database D
{ R(1,1) }
- database D
-
a -> 1, b -> 1, c -> 1
x -> 1, y -> 1
Q1() :- R(1,1), R(1,1).
Q1() :- R(a,b), R(c,b).
Q2() :- R(x,y).
Does DB fulfill constraint
R: A -> B
|
|
- this does not work:
|
|
extend the example
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Question 1.1
|
|
Question 1.2
|
|
Question 1.3
|
|
Question 1.4
|
|
Question 1.5
-
student 1 interest is $I_1$ and student 2 interest is $I_2$, we want $I_1 = I_2$,
- which is $(I_1 - I_2) \cup (I_2 - I_1) = \emptyset$.
- $\forall e: e \in I_1 \Leftrightarrow e \in I_2$
set in DL
- table $R(A,B)$ and table $S(c,d)$ and we want $R - S$
|
|
set difference approach
|
|
|
|
|
|
|
|
students | activity |
---|---|
1 | c1 |
1 | c2 |
1 | c3 |
2 | c1 |
2 | c2 |
2 | c3 |
universal quantification
- $\forall x: \phi(x) \Leftrightarrow \neg \exists x \neg \phi(x)$
- time:
(TID, year, month)
- location:
(LID, state, city)
- fact:
(TID, LID, num_call)
breakdown per time
- total calls
- calls per year
- calls per month
|
|
|
|
- reuse computation
|
|
- using special grouping constructs
|
|
cumulative sums
- for 2016, give me per month the cummulative calls per month
|
|
- write the query without window functions
|
|
- for every year, give me per month the cummulative calls within that year
|
|
-
name of the student with the highest GPA:
student(name,GPA,major)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
SELECT name FROM student ORDER BY gpa DESC LIMIT 1; -- if multiple students have the highest gpa, the only one is retured WITH maxgpa AS (SELECT gap AS mgpa FROM student ORDER BY gpa DESC LIMIT 1), SELECT name FROM student, maxgpa WHERE GPA = mgpa; WITH maxgpa AS (SELECT max(GPA) AS mgpa FROM student), SELECT name FROM student, maxgpa WHERE GPA = mgpa; SELECT name FROM ( (SELECT dense_rank() OVER (ORDER BY gpa DESC) AS rank, name FROM student) WHERE rank = 1; SELECT name FROM (SELECT name, max(gpa) OVER () AS mgpa, gpa FROM student) WHERE mgpa = gpa
- failure during 1 hour: p = 0.001
-
probability that at least one of our 1000 machines fails during 1 hour:
- 1 - probability that none of them fails: (1-p)^1000 =~ 0.36
- 1 - p = 0.999
- in general the probability to get exactly $m$ failures out of $n$ nodes: $$\binom{n}{m} \cdot p^m \cdot (1-p)^{n-m}$$
- to get at least $m$ failures out of $n$ nodes $$\sum_{i=m}^{n} \binom{n}{i} \cdot p^i \cdot (1-p)^{n-i}$$