Notes From CS Undergrad Courses FSU
This project is maintained by awa03
Data science covers the systems, algorithms, and processes for managing and deriving insights from heterogeneous and/or large data.
Data scientists spend the majority of their time on the first two stages. Data fusion is long and tedious, and most data will be unstructured. Data management tools (AWS and Azure, for example) are very useful and make both data management and processing much more efficient.
A relation $R$ is a subset of $S_1 \times S_2 \times \dots \times S_n$, where $S_i$ is the domain of attribute $i \in [1, n]$ and $n$ is the number of attributes of $R$.
A tuple $t$ is an element of $S_1 \times S_2 \times \dots \times S_n$.
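
The definitions of relation and tuple can be made concrete with a small sketch. The domains `S1` and `S2`, their values, and the helper `is_relation` are all hypothetical, chosen only for illustration. The idea is that the Cartesian product holds every possible tuple, and a relation is any subset of it.

```python
from itertools import product

# Hypothetical domains for a two-attribute relation (illustrative values only).
S1 = {"alice", "bob"}   # domain of attribute 1, e.g. a name
S2 = {2023, 2024}       # domain of attribute 2, e.g. a year

# The Cartesian product S1 x S2 contains every possible tuple.
cartesian = set(product(S1, S2))

# A relation R is any subset of that product; each element of R is a tuple t.
R = {("alice", 2023), ("bob", 2024)}


def is_relation(candidate, domains):
    """Return True if every tuple in candidate lies in the product of domains."""
    return all(
        len(t) == len(domains) and all(v in d for v, d in zip(t, domains))
        for t in candidate
    )


print(len(cartesian))                            # 4 possible tuples
print(R <= cartesian)                            # True: R is a subset
print(is_relation(R, [S1, S2]))                  # True
print(is_relation({("carol", 2023)}, [S1, S2]))  # False: "carol" is not in S1
```

Checking membership attribute by attribute, as `is_relation` does, avoids materializing the full product, which grows quickly as domains get larger.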
Relation Schema