Notes taken during Lecture
Below you find the notes taken in class. I am also using hand written notes that you can find here: https://www.evernote.com/shard/s131/sh/07c1ca72-d8e4-4880-a1c4-191953171027/b0IkbmPwompLoW7JlrMI0MHpOSfq7UIyUETyFD9yOtLXZGYcMMPHsQxC4A
Class
design schemas (schema heterogenity)
attributes
- person: name, phone, dateofbirth
- address: streetname, zipcode, apt, city, state, country
table (relational database)
option 1
- person(pid,name, phone, dateofbirth, aid)
- address(aid,streetname, zipcode, apt, city, state, country)
-
constraints:
- FK:
person.aid
toaddress.aid
- PK:
person.pid
andaddress.aid
- FK:
option 2
- multiple addresses per person
- person(pid,name, phone, dateofbirth)
- address(aid,pid,streetname, zipcode, apt, city, state, country)
option 3
- multiple addresses per person
- person(pid,name, phone, dateofbirth)
- address(aid,streetname, zipcode, apt, city, state, country)
- livesat(pid,aid)
semi-structured (JSON)
|
|
Class
Boolean logic
syntax
- Set of Variable \(\mathbb{V}\)
-
atomic formula:
- variable
x
(take boolean values) - constant
{ FALSE, TRUE }
- variable
-
formula:
- any atomic formula is a formula
-
if \(\phi_1\) is a formula and \(\phi_2\) is a formula, then following are also formulas:
- \(\phi_1 \land \phi_2\) (AND)
- \(\phi_1 \lor \phi_2\) (OR)
- \(\neg \phi_1\) (NOT)
semantics
- Assignment \(\varphi: \mathbb{V} \to \{ \bot, \top \}\)
- Given assignment, a formula evaluate to either \(\{\bot,\top\}\)
- \(\varphi(x)\) = constant assigned to \(x\) by \(\varphi\)
- \(\varphi(c) = c\)
- \(\varphi(\phi_1 \land \phi_2) = \top\) if \(\varphi(\phi_1) = \top\) and \(\varphi(\phi_2) = \top\), otherwise return \(\bot\)
- \(\varphi(\phi_1 \lor \phi_2) = \top\) if \(\varphi(\phi_1) = \top\) or \(\varphi(\phi_2) = \top\), otherwise return \(\bot\)
- \(\varphi(\phi_1 \rightarrow \phi_2) = \top\) if \(\varphi(\phi_1) = \top\) and \(\varphi(\phi_2) = \top\) or \(\varphi(\phi_1) = \bot\), otherwise return \(\bot\)
- \(\varphi(\neg \phi_1) = \top\) if \(\varphi(\phi_1) = \bot\), otherwise \(\top\)
-
tautologies: formula is true for every assignment
- \(x \lor \neg x\)
-
satisfiability: if there exists an assignment for which the formula is true
- example satisfiable formula \(x \lor \neg x\)
- example not satisfiable formula \(x \land \neg x\)
Class
FO logic
examples
integers
- predicates \(<(\cdot, \cdot)\), \(-(\cdot, \cdot)\)
- domain \(\mathbb{D} = \{0,1,2,-1,-2\}\)
- model \(<\)
0 | 1 |
-1 | 1 |
… | … |
\(-\)
1 | -1 |
-1 | 1 |
0 | 0 |
2 | -2 |
… | … |
- \(\psi = \forall x: \exists y: -(x,y)\)
- \(\psi[x = 1]\) then \(\exists y: -(1,y)\), yes, for
y = -1
- \(\exists x: <(x,0)\)
- \(\forall x: \exists y: <(x,y)\)
- \(<(x,0)\), can be interpreted as a query (return all \(x\) smaller than 0)
relations
- database schema (predicates):
takes(sid,cid)
student(sid,name,major)
,course(cid,title, major)
takes
sid | cid |
---|---|
1 | 1 |
1 | 2 |
2 | 1 |
student
sid | name | major |
---|---|---|
1 | Peter | CS |
2 | Bob | CS |
course
cid | title | major |
---|---|---|
1 | CS520 | CS |
2 | CS101 | CS |
- \(\exists y,z: student(x,y,z)\)
|
|
- \(\exists y,z,a: student(x,y,z) \land takes(x,a)\)
|
|
- \(\forall x: \exists y,z,a: student(x,y,z) \rightarrow takes(x,a)\)
some logical equivalences
- \(a \vee (b \land c) \equiv (a \vee b) \land (a \lor c)\)
- \(\neg (a \land b) \equiv \neg a \lor \neg b\)
- \(\forall x: \phi(x) \equiv \neg \exists x: \neg \phi(x)\)
- \(x \rightarrow y \equiv \neg x \lor y\)
integrity constraints
FDs
- \(a \rightarrow b\)
- example FD
zip -> city
over tableaddress(zip,city)
*Expressing the FD in FO and rewritting the universal quantification into double negation to be able to express it in SQL *
- \(\forall x,y,z: address(x,y) \land address(x,z) \rightarrow y = z\)
- \(\neg \exists x,y,z: \neg( address(x,y) \land address(x,z) \rightarrow y = z)\) (\(\forall x: \phi(x) \equiv \neg \exists x: \neg \phi(x)\))
- \(\neg \exists x,y,z: \neg( \neg(address(x,y) \land address(x,z)) \lor (y = z))\) (\(x \rightarrow y \equiv \neg x \lor y\))
- \(\neg \exists x,y,z: (address(x,y) \land address(x,z) \land (y \neq z)\) (\(\neg \neg x \equiv x\) and \(\neg y = z \equiv y \neq z\))
Translation into SQL
|
|
zip | city |
---|---|
60616 | Chicago |
60616 | New York |
Datalog
|
|
|
|
connection
in | out |
---|---|
Lake | Madison |
Madison | IIT |
IIT | Hyde Park |
Hyde Park | IIT |
|
|
Q
X |
---|
IIT |
|
|
Q
X |
---|
IIT |
- 2 hops
|
|
- 3 hops
|
|
- 2 hops or 3 hops
|
|
- 3 hops that are not 2 hops
|
|
- unsafe w/o negation
|
|
- unsafe negation (all variables in negative atoms, have to appear in positive atoms in the body)
|
|
- recursion
|
|
- \(I_0 = I\)
- \(I_{i+1} = T_P(I_{i}) \cup I_{i}\)
- \(I^{*} = I_n\) such that \(I_n = I_{n-1}\)
|
|
|
|
Class
detecting constraint (FD) violations
zip -> city
P(zip,city)
- self join on LHS check that at least one RHS attribute is different
|
|
zip, county -> city, region
is equivalentzip, country -> city, zip, country -> region
A(zip,city,country,region)
|
|
find rows in violations
zip -> city
P(zip,city)
- self join on LHS check that at least one RHS attribute is different
|
|
find violations to denial constraints
- $\forall \vec{x}: \neg \Phi(\vec{x})$
- where $\Phi$ conjunction over relational atoms (predicates) and comparisons
E(eid,name,salary)
- $\forall \neg (E(x,y,z) \land z < 0)$
- $\forall \neg (E(\_,\_,z) \land z < 0)$
|
|
Class
Incomplete Database
- \(\mathbb{D} = \{D_1, \ldots, D_n\}\)
- \(Q(\mathbb{D}) = \{Q(D_1, \ldots, Q(D_n) \}\) possible answer semantics
-
\(certain(Q,\mathbb{D})\) - certain answer
- \(\{ t \mid \forall D \in \mathbb{D}: t \in Q(D) \}\)
Class
Class
|
|
|
|
Class
LOCAL SCHEMA ---------------------------------------- Person Name GLOBAL SCHEMA ---------------------------------------- P2 Name Age MAPPING ---------------------------------------- Person.Name -> P2.Name Person(X) = P2(X,Y).
Person
Name |
---|
Peter |
Bob |
P2
Name | Age |
---|---|
Peter | 1 |
Bob | 1 |
Name | Age |
---|---|
Peter | 90 |
Bob | 92 |
…
bag semantics example
Person(X) = P2(X,Y)
Person
Name |
---|
Peter |
Peter |
Bob |
Bob |
P2
Name | Age |
---|---|
Peter | 1 |
Peter | 12 |
Bob | 1 |
Bob | 1 |
Class
|
|
Testing query equivalence for query answering with views
|
|
-
test containment
1 2 3
q2unfold(T,Y,D) :- movie(I,T,Y,G), Y >= 1950, G=comedy, director(I,D), actor(I,D). q(T,Y,D) := movie(I,T,Y,G), Y >= 1950, G=comedy, director(I,D), actor(I,D).
- containment mapping
q2unfold
toq
I -> I T -> T Y -> Y G -> G D -> D
|
|
|
|
Class
Class
Query containment
- variable mapping $h$ from Q1 to Q2
- head mapped to head
- for each body atom \( R(\vec{x}) \) of Q1, \( h(R(\vec{x}) \in body(Q2) \)
- test head to head first, this give you a partial containment mapping or you detect failure early on
-
for body
-
have to map to goals with the same predicate
- start with goals whose predicates are unique in the query (or have very few occurrences)
- utilize the partial containment mapping from the head-to-head mapping
-
- \( Q_1 \rightarrow Q_4 \)
left var | right var |
---|---|
Z | A |
A | B |
X | C |
Y | C |
- \( Q_2 \to Q_1 \)
left var | right var |
---|---|
Y | |
Z | |
A | |
B |
- \( Q_4 \to Q_3 \)
left var | right var |
---|---|
A | X |
B | Y |
C |
|
|
- cannot map
XXX
to bothA
andZ
. - \( Q_4' \to Q_3 \)
|
|
left var | right var |
---|---|
A | X |
B | Y |
C | Z |
D | A |
|
|
- =Q4(c1,c2) :- r(c1,c2), r(c1,c2).
- \( Q_1 \to Q_3 \)
left var | right var |
---|---|
X | X |
Y | Y |
- \( Q_4 \to Q_1 \)
left var | right var |
---|---|
X | X |
Y | Y |
Z | Y |
|
|
Qa to Qb
left var | right var |
---|---|
X | N/A |
no containment mapping exists
Qb to Qa
left var | right var |
---|---|
X | X |
Y | X |
R(a1,a2)
a1 | a2 |
---|---|
1 | 1 |
1 | 2 |
Class
Constraint generation
Schema
- *Camera*(brand,price,model,shutterspeed,zoom)
- *Producer*(pid,name,revenue)
create FD
- create functional dependency
brand,model -> price
\[ \forall b,m,p_1,p_2,s_1,s_2,z_1,z_2: Camera(b,p_1,m,s_1,z_1) \land Camera(b,p_2,m,s_2,z_2) \rightarrow p_1 = p_2 \]
\[ \forall b_1,b_2,m_1,m_2,p_1,p_2,s_1,s_2,z_1,z_2: Camera(b_1,p_1,m_1,s_1,z_1) \land Camera(b_2,p_2,m_2,s_2,z_2) \] \[ \land b_1 = b_2 \land m_1 = m_2 \rightarrow p_1 = p_2 \]
create FK
- create FK
Camera.brand
toProducer.pid
\[ \forall b,p,m,s,z: Camera(b,p,m,s,z) \rightarrow \exists n,r: Producer(b,n,r) \]
non-FD and FK constraints
- if there are two cameras from the same brand and the first camera costs more than the second, then the shutterspeed of the first has to be higher than the shutterspeed of the second
\[ \forall: b_1,b_2,m_1,m_2,p_1,p_2,s_1,s_2,z_1,z_2: Camera(b_1,p_1,m_1,s_1,z_1) \land Camera(b_2,p_2,m_2,s_2,z_2) \] \[ \land b_1 = b_2 \land p_1 > p_2 \rightarrow s_1 > s_2 \]
- if there two cameras from the same brand with a price $p_1$ for the first and $p_1 < p_2$ for the second, then there has to exist a third camera from the same brand such that $p_1 < p_3 < p_2$
\[ \forall: b_1,b_2,m_1,m_2,p_1,p_2,s_1,s_2,z_1,z_2: Camera(b_1,p_1,m_1,s_1,z_1) \land Camera(b_2,p_2,m_2,s_2,z_2) \] \[ \land b_1 = b_2 \land p_1 < p_2 \rightarrow \exists m_3,p_3,s_3,z_3: Camera(b_1,p_3,m_3,s_3,z_3) \land p_1 < l_3 \land p_3 < p_2 \]
Containment mappings
|
|
Q1 to Q2
R(X,X)
does not exist (head to head enforcesX -> X
X -> X Z -> A ->
Q2 to Q1
X -> X Y -> X Z -> X
Q3 to Q1
X -> X Y -> X Z -> X A -> X
X -> X Y -> A Z -> X A -> Z
R(X,X), R(X,Z), R(Z,A)
Q2 to Q3
X -> X Y -> Z Z -> A
further examples
|
|
Q1 to Q2
- no mapping exists
X -> X
Q2 to Q1
X -> X
Schema matching
- Source: Person(Name, Address, Age)
- Target: P2(LastName, City, Salary)
- use only the schema (attribute names) and match if normalized edit distance < 0.6
Person
Name | address | age |
---|
P2
LastName | City | Salary |
---|
- only Name and LastName matches (every other pair has edit distance < 0.8
Datalog
Fall 2016 - 1.1.1
Write a Datalog program that returns product titles (attribute title of relation product).
|
|
Fall 2016 - 1.1.2
Write a Datalog program that returns the description and status of bugs for product “VCleaner”.
|
|
Fall 2016 - 1.1.4
Write a Datalog program that returns the names of supportes that are not assigned to any open bugs (attribute status).
|
|
universal quantification
Write a Datalog program the returns names of supporter that are assigned to all bug.
\[ \forall \vec{x}: \phi(\vec{x}) \Leftrightarrow \neg \exists \vec{x}: \neg \phi(\vec{x}) \]
|
|
recursive query
Write a Datalog program that returns pairs of supporters such that either they directly share a bug assignment (they are assigned to the same bug) or we can find a sequence of other supporters connecting them via shared bug assignments.
|
|
tree recursion
- table
Tree(parent,left,right)
- table
Root(node)
Tree
parent | left | right |
---|---|---|
1 | 2 | 3 |
2 | 4 | 5 |
5 | 6 | 7 |
Root
node |
---|
1 |
- compute
ca(X,Y,Z)
that contains allZ
such thatZ
is a common ancestor ofX
,Y
|
|
translation to relational algebra
- union the translation of all rules with the same head
- if there are multiple IDB predicates, then translate based on topological sort
-
how to translate a rule:
-
for positive rules (no negation)
- rename the attributes of goals in the body to match the names of variables of the goal (using the renaming operator $\rho$)
- join goals in the body, then selection for comparison predicates, then projection for head
-
= R(A,B)
|
|
- in this case we can skip the renaming step to simplify the query
\[ \pi_{A}(\sigma_{B < 10}(R)) \cup \pi_{B}(R) \]
- or using renaming:
\[ \pi_{X}(\sigma_{Y < 10}(\rho_{A\rightarrow X,B\rightarrow Y}(R))) \cup \pi_{X}(\rho_{A\rightarrow Z1,B\rightarrow X}(R)) \]
Class
universal data exchange solutions
\[ \forall x: S(x) \rightarrow \exists y: R(x,y). \]
I = { S(1), S(2) } J = { R(1,N1), R(2,N2) } iJ1 = { R(1,1), R(2,4) } J2 = { R(1,2), R(2,6), R(3,3) }
homomorphisms from J
to J1
and J2
h1: 1 -> 1 2 -> 2 N1 -> 1 N2 -> 4
h2: 1 -> 1 2 -> 2 N1 -> 2 N2 -> 6
query evaluation
Q() :- R(1,X).
- evaluation over universal solution
J
- compose
h
withh1
h: X -> N1 Q() :- R(1,N1).
h1(h): X -> 1 Q() :- R(1,1).
Class
star schema
dimension tables
Location(lid, country, state, city , zip, shop)
, PK:lid
lid | country | state | city | zip | shop |
---|---|---|---|---|---|
1 | USA | IL | Chicago | 60615 | branch#1 |
2 | USA | IL | Chicago | 60615 | branch#2 |
.. | … | … | … | … | … |
1000234 | USA | NY | Buffalo | 45554 | branch#1412 |
Time(tid,year,month,day)
, PKtid
tid | year | month | day |
---|---|---|---|
1 | 2015 | 01 | 01 |
2 | 2015 | 01 | 02 |
product(pid,type, brand, product)
, PKpid
pid | type | brand | product |
---|---|---|---|
1 | coffee | tschibo | soft blend |
fact table
-
fact(lid,tid,pid,cost,revenue)
PK:lid,tid,pid
- FK
lid
tolocation
- FK
tid
totime
- FK
pid
toproduct
- FK
lid | tid | pid | cost | sales | revenue |
---|---|---|---|---|---|
2 | 1 | 1 | 0.50 | 23 | 3.50 |
grouping queries
give me the total sales, the sales per year, and the sales per month
|
|
|
|
-
ambiguity:
NULL
has two meanings:- we did not group on this attribute
- the group-by value is
NULL
|
|
|
|
|
|
Class
|
|
- equivalent to
|
|
- we have the result grouped on
year
(query 2)
\[ a_1 + a_2 + a_3 + b_1 + b_2 + b_3 + b_4 \]
\[ a < b \Rightarrow h(a) < h(b) \]
\[ A = c \Rightarrow h(A) = h(c) \]
\[ n \cdot \log n \]
- 10 partitions
\[ 10 \cdot (\frac{n}{10} \cdot \log \frac{n}{10}) = n \cdot \log \frac{n}{10} \]
Class
- machine chance of failure
p
per hour - 1000 machine, probability that none fails
- for independent events:
\[ P(A \cap B) = P(A) \cdot P(B) \]
\[ P(none) = (1-p)^{1000} \]
- 1000 machines, the probability of at one failing (assuming independence)
\[ P(\text{1-or-more}) = 1 - P(none) = 1 - (1- p)^{1000} \]
|
|
0.6323045752290363
Class
A | B |
---|---|
1 | 2 |
3 | 5 |
1 | 7 |
3 | 6 |
|
|
A | B |
---|---|
1 | 2 |
1 | 7 |
3 | 5 |
3 | 6 |
word count
- input
{(id,text)}
where text is a document - output
{(word,cnt)}
wherecnt
is the number of times the word appears in all the doc - map:
{(id,text)} -> { (word, 1) | word in text.split(" ") }
- reduce:
{(word,1), ...., (word,1), ...} -> {(word, SUM(v1, .., vn))}
-(1,"The green fox jumped over the moon") -> {(The,1), (green,1), (fox,1), (jumped,1) ...}
Class
relational operators as map reduce jobs
-
selection: just map $f$
- $\sigma_\theta$ as $f(t) = \{t\}$ if $t \models \theta$ else $\emptyset$
-
projection: just map $f$
- $\Pi_A$ as $f(t) = \{t.A\}$
-
aggregation: mapper $f$ and reducer $g$
- $\gamma_{G,agg(a)}$ $f(t) = \{(t.G, a)\}$ and $g((k,[a_1, \ldots, a_n])) = \{ (k,agg([a_1, \ldots, a_n])) \}$
- $\gamma_{b,sum(a)}$
-
join: both map $f$ and reduce $g$
- $R \bowtie_{R.a = S.b} S$
- $f_R(t) = \{ (t.a, (1,t)) \}$
- $f_S(t) = \{(t.b,(2,t)) \}$
- $g$
R
a | c |
---|---|
1 | 1 |
1 | 2 |
2 | 4 |
S
b | d |
---|---|
1 | 4 |
1 | 5 |
3 | 5 |
key | value |
---|---|
1 | (1,1,1) |
1 | (1,1,2) |
1 | (2,1,4) |
1 | (2,1,5) |
2 | (1,2,4) |
3 | (2,3,5) |
- union:
Class
A | B | TID |
---|---|---|
1 | 2 | t1 |
1 | 3 | t2 |
1 | 4 | t3 |
2 | 3 | t4 |
|
|
|
|
\[ \pi_{A}(R) \]
A | TID |
---|---|
1 | o1 |
2 | o2 |
\[ WHY(Q,D,o_1) = \{ \{t_1\}, \{t_2\}, \{t_3\}, \{t_1, t_2\}, \{t_2,t_3\}, \{t_1,t_4\}, \ldots\} \]
\[ MWHY(Q,D,o_1) = \{ \{t_1\}, \{t_2\}, \{t_3\} \} \]
Class
\[ D'\,\text{is minimal if}\, \forall D'': D'' \subset D': D''\,\text{is not a witness} \]
Brute force algorithm (MWhy)
- Input: database \(D\), query \(Q\), and tuple \(t \in Q(D)\)
- Output: \(MWhy(Q,D,t)\)
|
|
- runtime complexity: \(O(2^n)\) for a database \(D\) with \(n\) rows
derive W (set of witnesses) using the top-down rules.
\(\pi_A(R)\)
- rule for \(\pi\)
\[ W(\pi_{A}(R),o_1,D) = W(R,t_1,D) \cup W(R,t_2,D) \cup W(R,t_3,D) \]
- rule table access \(R\)
\[ = \{\{t_1\}\} \cup \{\{t_2\}\} \cup \{\{t_3\}\} \]
\[ = \{\{t_1\}, \{t_2\}, \{t_3\}\} \]
\(\pi_{A}(R) \bowtie_{A=C} S\)
R
A | B | tid |
---|---|---|
1 | 10 | r1 |
1 | 20 | r2 |
2 | 30 | r3 |
3 | 40 | r4 |
S
C | D | tid |
---|---|---|
1 | a | s1 |
1 | b | s2 |
2 | c | s3 |
$\pi_C(S)$
C | tid |
---|---|
1 | c1 |
2 | c2 |
\[\pi_{A}(R)\]
A | tid |
---|---|
1 | p1 |
2 | p2 |
3 | p3 |
\(Q_{j} = \pi_{A}(R) \bowtie_{A=C} \pi_{C}(S)\)
A | C | tid |
---|---|---|
1 | 1 | o1 |
2 | 2 | o3 |
- \(W(Q_j, o1)\)
- rule for \(\bowtie\)
\[ W(\pi_{A}(R) \bowtie_{A=C} \pi_{C}(S), o_1) = \{ (w_1 \cup w_2) \mid w_1 \in W(\pi_A(R), p_1) \land w_2 \in W(\pi_{C}(S),c_1) \} \]
- rule for \(\pi\) and \(R\) (table access)
\[ W(\pi_{A}(R) \bowtie_{A=C} \pi_{C}(S), o_1) = \{ (w_1 \cup w_2) \mid w_1 \in \{\{r_1\},\{r_2\}\} \land w_2 \in \{\{s_1\},\{s_2\}\} \} \]
\[ W(\pi_{A}(R) \bowtie_{A=C} \pi_{C}(S), o_1) = \{ \{r_1,s_1\},\{r_2,s_1\} \} \cup \{\{r_1,s_2\},\{r_2,s_2\}\} \] \[ = \{ \{r_1,s_1\},\{r_2,s_1\}, \{r_1,s_2\},\{r_2,s_2\}\} \]
annotated databases
natural numbers \(\mathbb{N}\) (bag semantics)
R
A | B |
---|---|
1 | 1 |
1 | 1 |
1 | 2 |
2 | 2 |
2 | 2 |
R using \(\mathbb{N}\) annotation
A | B | \(\mathbb{N}\) |
---|---|---|
1 | 1 | 2 |
1 | 2 | 1 |
2 | 2 | 2 |
MWhy annotations \(\mathcal{P}(\mathcal{P}(D))\)
A | C | \(\mathcal{P}(\mathcal{P}(D))\) |
---|---|---|
1 | 1 | \(\{ \{r_1,s_1\},\{r_2,s_1\}, \{r_1,s_2\},\{r_2,s_2\}\}\) |
2 | 2 | \(\{\{r_3,s_3\}\}\) |
\(\mathbb{N}\)-relation
- \(\mathcal{U} = \{1,2\}\)
- 2-ary \(\mathbb{N}\) -relation over \(\mathcal{U}\)
R
A | B |
---|---|
1 | 1 |
1 | 1 |
2 | 2 |
R \(\mathbb{N}\) -relation
- \(R((1,1)) = 2\)
- \(R((1,2)) = 0\)
- \(R((2,1)) = 0\)
- \(R((2,2)) = 1\)
A | B | \(\mathbb{N}\) |
---|---|---|
1 | 1 | 2 |
2 | 2 | 1 |
\(\mathcal{P}(\mathcal{P}(D))\), \(\emptyset\) as zero
A | C | \(\mathcal{P}(\mathcal{P}(D))\) |
---|---|---|
1 | 1 | \(\{ \{r_1,s_1\},\{r_2,s_1\}, \{r_1,s_2\},\{r_2,s_2\}\}\) |
2 | 2 | \(\{\{r_3,s_3\}\}\) |
1 | 2 | \(\emptyset\) |
2 | 2 | \(\emptyset\) |
Semiring \(\mathbb{B}\)
- \(\mathbb{B} = (\{false,true\}, \lor, \land, false, true)\)
- \(k_1 \lor k_2 = k_2 \lor k_1\)
\(k_1\) | \(k_2\) | \(k_1 \lor k_2\) | \(k_2 \lor k_1\) |
---|---|---|---|
false | false | false | false |
false | true | true | true |
true | false | true | true |
true | true | true | true |
- \(k \lor false = k\)
- \(k \land true = k\)
- \(k \land false = false\)
- \(f(x) = 2x\)
Queries over annotated DBs
\(\mathbb{B}\) (set semantics)
- \(\mathbb{B} = (\{false,true\}, \lor, \land, false, true)\)
R
A | B | \(\mathbb{B}\) |
---|---|---|
1 | 1 | true |
1 | 2 | true |
1 | 3 | true |
2 | 2 | true |
\(\pi_A(\sigma_{B < 3}(R))\)
- \(\sigma_{B < 3}(R)\)
A | B | \(\mathbb{B}\) |
---|---|---|
1 | 1 | \(true \land true = true\) |
1 | 2 | \(true \land true = true\) |
1 | 3 | \(true \land false = false\) |
2 | 2 | \(true \land true = true\) |
- \(\pi_A(\sigma_{B < 3}(R))\)
A | \(\mathbb{B}\) |
---|---|
1 | \(true \lor true \lor false = true\) |
2 | \(true\) |
\(\mathbb{N}\) (bag semantics)
- \(\mathbb{N} = (\mathbb{N}, +, \cdot, 0, 1)\)
R
A | B | \(\mathbb{N}\) |
---|---|---|
1 | 1 | 2 |
1 | 2 | 4 |
1 | 3 | 1 |
2 | 2 | 5 |
\(\pi_A(\sigma_{B < 3}(R))\)
- \(\sigma_{B < 3}(R)\)
A | B | \(\mathbb{N}\) |
---|---|---|
1 | 1 | \(2 \cdot 1 = 2\) |
1 | 2 | \(4 \cdot 1 = 4\) |
1 | 3 | \(1 \cdot 0 = 0\) |
2 | 2 | \(5 \cdot 1 = 5\) |
- \(\pi_A(\sigma_{B < 3}(R))\)
A | \(\mathbb{N}\) |
---|---|
1 | \(2 + 4 + 0 = 6\) |
2 | \(5\) |
A | B |
---|---|
1 | 1 |
1 | 1 |
1 | 2 |
1 | 2 |
1 | 2 |
1 | 2 |
1 | 3 |
2 | 2 |
2 | 2 |
2 | 2 |
2 | 2 |
2 | 2 |
\(\pi_A(\sigma_{B < 3}(R))\)
A |
---|
1 |
1 |
1 |
1 |
1 |
1 |
2 |
2 |
2 |
2 |
2 |
Class
R
A | B | TID |
---|---|---|
1 | 1 | t1 |
1 | 2 | t2 |
1 | 3 | t3 |
\(\pi_A(R)\)
A | TID |
---|---|
1 | o1 |
\[ Why(Q,D,o_1) = \{ \{t_1\}, \{t_2\}, \{t_3\}, \{t_1, t_2\}, \{t_1,t_3\}, \{t_2,t_3\}, \{t_1, t_2, t_3\} \} \]
n = 3
subsets = 2^n
nonsubsets = subsets - 1
n = 3 => n: 3 subsets = 2^n => subsets: 8 nonsubsets = subsets - 1 => nonsubsets: 7
nonsubsets: 7
n = 10000
subsets = 2^n
nonsubsets = subsets - 1
n = 10000 => n: 10,000 subsets = 2^n => subsets: 19,950,631,168,807,583,848,837,421,626,835,850,838,234,968,318,861,924,548,520,089,498,529,438,830,221,946,631,919,961,684,036,194,597,899,331,129,423,209,124,271,556,491,349,413,781,117,593,785,932,096,323,957,855,730,046,793,794,526,765,246,551,266,059,895,520,550,086,918,193,311,542,508,608,460,618,104,685,509,074,866,089,624,888,090,489,894,838,009,253,941,633,257,850,621,568,309,473,902,556,912,388,065,225,096,643,874,441,046,759,871,626,985,453,222,868,538,161,694,315,775,629,640,762,836,880,760,732,228,535,091,641,476,183,956,381,458,969,463,899,410,840,960,536,267,821,064,621,427,333,394,036,525,565,649,530,603,142,680,234,969,400,335,934,316,651,459,297,773,279,665,775,606,172,582,031,407,994,198,179,607,378,245,683,762,280,037,302,885,487,251,900,834,464,581,454,650,557,929,601,414,833,921,615,734,588,139,257,095,379,769,119,277,800,826,957,735,674,444,123,062,018,757,836,325,502,728,323,789,270,710,373,802,866,393,031,428,133,241,401,624,195,671,690,574,061,419,654,342,324,638,801,248,856,147,305,207,431,992,259,611,796,250,130,992,860,241,708,340,807,605,932,320,161,268,492,288,496,255,841,312,844,061,536,738,951,487,114,256,315,111,089,745,514,203,313,820,202,931,640,957,596,464,756,010,405,845,841,566,072,044,962,867,016,515,061,920,631,004,186,422,275,908,670,900,574,606,417,856,951,911,456,055,068,251,250,406,007,519,842,261,898,059,237,118,054,444,788,072,906,395,242,548,339,221,982,707,404,473,162,376,760,846,613,033,778,706,039,803,413,197,133,493,654,622,700,563,169,937,455,508,241,780,972,810,983,291,314,403,571,877,524,768,509,857,276,937,926,433,221,599,399,876,886,660,808,368,837,838,027,643,282,775,172,273,657,572,744,784,112,294,389,733,810,861,607,423,253,291,974,813,120,197,604,178,281,965,697,475,898,164,531,258,434,135,959,862,784,130,128,185,406,283,476,649,088,690,521,047,580,882,615,823,961,985,770,122,407,044,330,583,075,869,039,319,604,603,404,973,156,583,208,672,105,913,300,903,752,823,415,539,745,394,397,715,257,455,290,510,212,310,947,321,610,753,474,825,740,775,273,986,348,298,498,340,756,937,955,646,638,621,874,569,499,279,016,572,103,701,364,433,135,817,214,311,791,398,222,983,845,847,334,440,270,964,182,851,005,072,927,748,364,550,578,634,501,100,852,987,812,389,473,928,699,540,834,346,158,807,043,959,118,985,815,145,779,177,143,619,698,728,131,459,483,783,202,081,474,982,171,858,011,389,071,228,250,905,826,817,436,220,577,475,921,417,653,715,687,725,614,904,582,904,992,461,028,630,081,535,583,308,130,101,987,675,856,234,343,538,955,409,175,623,400,844,887,526,162,643,568,648,833,519,463,720,377,293,240,094,456,246,923,254,350,400,678,027,273,837,755,376,406,726,898,636,241,037,491,410,966,718,557,050,759,098,100,246,789,880,178,271,925,953,381,282,421,954,028,302,759,408,448,955,014,676,668,389,697,996,886,241,636,313,376,393,903,373,455,801,407,636,741,877,711,055,384,225,739,499,110,186,468,219,696,581,651,485,130,494,222,369,947,714,763,069,155,468,217,682,876,200,362,777,257,723,781,365,331,611,196,811,280,792,669,481,887,201,298,643,660,768,551,639,860,534,602,297,871,557,517,947,385,246,369,446,923,087,894,265,948,217,008,051,120,322,365,496,288,169,035,739,121,368,338,393,591,756,418,733,850,510,970,271,613,915,439,590,991,598,154,654,417,336,311,656,936,031,122,249,937,969,999,226,781,732,358,023,111,862,644,575,299,135,758,175,008,199,839,236,284,615,249,881,088,960,232,244,362,173,771,618,086,357,015,468,484,058,622,329,792,853,875,623,486,556,440,536,962,622,018,963,571,028,812,361,567,512,543,338,303,270,029,097,668,650,568,557,157,505,516,727,518,899,194,129,711,337,690,149,916,181,315,171,544,007,728,650,573,189,557,450,920,330,185,304,847,113,818,315,407,324,053,319,038,462,084,036,421,763,703,911,550,639,789,000,742,853,672,196,280,903,477,974,533,320,468,368,795,868,580,237,952,218,629,120,080,742,819,551,317,948,157,624,448,298,518,461,509,704,888,027,274,721,574,688,131,594,750,409,732,115,080,498,190,455,803,416,826,949,787,141,316,063,210,686,391,511,681,774,304,792,596,709,376 nonsubsets = subsets - 1 => nonsubsets: 19,950,631,168,807,583,848,837,421,626,835,850,838,234,968,318,861,924,548,520,089,498,529,438,830,221,946,631,919,961,684,036,194,597,899,331,129,423,209,124,271,556,491,349,413,781,117,593,785,932,096,323,957,855,730,046,793,794,526,765,246,551,266,059,895,520,550,086,918,193,311,542,508,608,460,618,104,685,509,074,866,089,624,888,090,489,894,838,009,253,941,633,257,850,621,568,309,473,902,556,912,388,065,225,096,643,874,441,046,759,871,626,985,453,222,868,538,161,694,315,775,629,640,762,836,880,760,732,228,535,091,641,476,183,956,381,458,969,463,899,410,840,960,536,267,821,064,621,427,333,394,036,525,565,649,530,603,142,680,234,969,400,335,934,316,651,459,297,773,279,665,775,606,172,582,031,407,994,198,179,607,378,245,683,762,280,037,302,885,487,251,900,834,464,581,454,650,557,929,601,414,833,921,615,734,588,139,257,095,379,769,119,277,800,826,957,735,674,444,123,062,018,757,836,325,502,728,323,789,270,710,373,802,866,393,031,428,133,241,401,624,195,671,690,574,061,419,654,342,324,638,801,248,856,147,305,207,431,992,259,611,796,250,130,992,860,241,708,340,807,605,932,320,161,268,492,288,496,255,841,312,844,061,536,738,951,487,114,256,315,111,089,745,514,203,313,820,202,931,640,957,596,464,756,010,405,845,841,566,072,044,962,867,016,515,061,920,631,004,186,422,275,908,670,900,574,606,417,856,951,911,456,055,068,251,250,406,007,519,842,261,898,059,237,118,054,444,788,072,906,395,242,548,339,221,982,707,404,473,162,376,760,846,613,033,778,706,039,803,413,197,133,493,654,622,700,563,169,937,455,508,241,780,972,810,983,291,314,403,571,877,524,768,509,857,276,937,926,433,221,599,399,876,886,660,808,368,837,838,027,643,282,775,172,273,657,572,744,784,112,294,389,733,810,861,607,423,253,291,974,813,120,197,604,178,281,965,697,475,898,164,531,258,434,135,959,862,784,130,128,185,406,283,476,649,088,690,521,047,580,882,615,823,961,985,770,122,407,044,330,583,075,869,039,319,604,603,404,973,156,583,208,672,105,913,300,903,752,823,415,539,745,394,397,715,257,455,290,510,212,310,947,321,610,753,474,825,740,775,273,986,348,298,498,340,756,937,955,646,638,621,874,569,499,279,016,572,103,701,364,433,135,817,214,311,791,398,222,983,845,847,334,440,270,964,182,851,005,072,927,748,364,550,578,634,501,100,852,987,812,389,473,928,699,540,834,346,158,807,043,959,118,985,815,145,779,177,143,619,698,728,131,459,483,783,202,081,474,982,171,858,011,389,071,228,250,905,826,817,436,220,577,475,921,417,653,715,687,725,614,904,582,904,992,461,028,630,081,535,583,308,130,101,987,675,856,234,343,538,955,409,175,623,400,844,887,526,162,643,568,648,833,519,463,720,377,293,240,094,456,246,923,254,350,400,678,027,273,837,755,376,406,726,898,636,241,037,491,410,966,718,557,050,759,098,100,246,789,880,178,271,925,953,381,282,421,954,028,302,759,408,448,955,014,676,668,389,697,996,886,241,636,313,376,393,903,373,455,801,407,636,741,877,711,055,384,225,739,499,110,186,468,219,696,581,651,485,130,494,222,369,947,714,763,069,155,468,217,682,876,200,362,777,257,723,781,365,331,611,196,811,280,792,669,481,887,201,298,643,660,768,551,639,860,534,602,297,871,557,517,947,385,246,369,446,923,087,894,265,948,217,008,051,120,322,365,496,288,169,035,739,121,368,338,393,591,756,418,733,850,510,970,271,613,915,439,590,991,598,154,654,417,336,311,656,936,031,122,249,937,969,999,226,781,732,358,023,111,862,644,575,299,135,758,175,008,199,839,236,284,615,249,881,088,960,232,244,362,173,771,618,086,357,015,468,484,058,622,329,792,853,875,623,486,556,440,536,962,622,018,963,571,028,812,361,567,512,543,338,303,270,029,097,668,650,568,557,157,505,516,727,518,899,194,129,711,337,690,149,916,181,315,171,544,007,728,650,573,189,557,450,920,330,185,304,847,113,818,315,407,324,053,319,038,462,084,036,421,763,703,911,550,639,789,000,742,853,672,196,280,903,477,974,533,320,468,368,795,868,580,237,952,218,629,120,080,742,819,551,317,948,157,624,448,298,518,461,509,704,888,027,274,721,574,688,131,594,750,409,732,115,080,498,190,455,803,416,826,949,787,141,316,063,210,686,391,511,681,774,304,792,596,709,375
- minimize
\[ MWhy(Q,D,o_1) = \{ \{t_1\}, \{t_2\}, \{t_3\} \} \]
R
A | B | TID |
---|---|---|
a | 1 | r1 |
a | 2 | r2 |
b | 1 | r3 |
S
C | D | TID |
---|---|---|
1 | d | s1 |
2 | d | s2 |
3 | d | s3 |
\[ \pi_{A,D}(R \bowtie_{B=C} S) \]
A | D | TID |
---|---|---|
a | d | o1 |
b | d | o2 |
\[ MWhy(Q,o_1) = \{ \{r_1,s_1\}, \{r_2,s_2\} \} \]
\[ MWhy(Q,o_2) = \{ \{r_3, s_1\} \} \]
\[ D' \subset D: Q(D') = Q(D) = o_1 \in Q(D') \land o_2 \in Q(D') \]
\[ MWhy(Q) = \{ \{r_1,r_3,s_1\}, \{r_2,s_2,r_3,s_1\} \} \]
Class
\[ \pi_{A}(R) \]
R
A | B | \(\mathbb{N}[X]\) |
---|---|---|
1 | 1 | x1 |
1 | 2 | x2 |
1 | 3 | x3 |
2 | 1 | x4 |
A | \(\mathbb{N}[X]\) |
---|---|
1 | x1 + x2 + x3 |
2 | x4 |
- \(\mathbb{B}\) - set semantics
A | B | \(\mathbb{N}[X]\) |
---|---|---|
1 | 1 | x1=true |
1 | 2 | x2=false |
1 | 3 | x3=true |
2 | 1 | x4=false |
A | B |
---|---|
1 | 1 |
1 | 3 |
A | \(\mathbb{B}\) |
---|---|
1 | \(true \lor false \lor true = true\) |
2 | \(false\) |
- \(\mathbb{N}\) - bag semantics
A | B |
---|---|
1 | 1 |
1 | 1 |
1 | 2 |
1 | 3 |
1 | 3 |
1 | 3 |
2 | 1 |
2 | 1 |
A | B | \(\mathbb{N}[X]\) |
---|---|---|
1 | 1 | x1=2 |
1 | 2 | x2=1 |
1 | 3 | x3=3 |
2 | 1 | x4=2 |
A | \(\mathbb{N}[X]\) |
---|---|
1 | 2 + 1 + 3 = 6 |
2 | 2 |
R
A | \(\mathbb{N}[X]\) |
---|---|
1 | r1 |
2 | r2 |
3 | r3 |
S
C | \(\mathbb{N}[X]\) |
---|---|
1 | s1 |
2 | s2 |
4 | s3 |
- \[R \cup S\]
A | \(\mathbb{N}[X]\) |
---|---|
1 | r1 + s1 |
2 | r2 + s2 |
3 | r3 |
4 | s3 |
- \[R \bowtie_{A=C} S\]
A | C | \(\mathbb{N}[X]\) |
---|---|---|
1 | 1 | r1 * s1 |
2 | 2 | r2 * s2 |
CAP Theorem
- Consistency
- Availability
- Partition tolerance
S1
item | counter |
---|---|
coke | 2 |
beer | 2 |
S2
item | counter |
---|---|
coke | 3 |
beer | 1 |
S1 merge S2
item | counter |
---|---|
coke | 3 |
beer | 2 |