Relational operators

Relational algebra, first described by E.F. Codd, has a well-founded semantics used for modeling data, and defining operations on it. Relational algebra is the theoretical basis of the relational databases and SQL (Structured Query Language) that is used to query such databases.

Data modeled as sets of tuples (often referred to as tables) can be manipulated using the so-called relational operators. Some of these operators are directly based on the ones defined by Codd, like select, project or various joins. Others are based more loosely on Codd's operators, like the fold operation that can be used to perform aggregate operations or partition that can be used to split a table into multiple tables. In this section, we will discuss the relational operators in detail. Since SQL is a very popular language, and many readers will be already familiar with it, we will also provide comparisons between Rax relational operators and SQL constructs.

Projection

Projection is an operation defined by Codd in relational algebra and is denoted there by the symbol π. Projection is formally defined as a unary operation, πa1,...,an(R) where R is a relation (i.e., a set of tuples) and a1,...,an is a set of attribute names. The result of such projection is defined as the set of tuples obtained when all tuples in R are restricted to the set {a1,...,an}. In the table terminology, we say that projections cuts out a subset of columns from a table.

Like in SQL, projection in Rax is more powerful than in relational algebra: it not only allows to cut out existing columns, but also to create new columns based on the existing columns. For example:

    {[#:id, $:name, @:date_of_birth]}: people :=
    {
      [1, "Gosia Wrzesinska", (@)"1978-02-08"],
      [2, "Ala Makota", (@)"1989-06-04"]
    };
    `print project [.#1, .name, ((@)"now" ~ .date_of_birth).r'months/12] people;

    // Output:
    //: #1|        #2        |#3
    //: --|------------------|--
    //:  1|"Gosia Wrzesinska"|36
    //:  2|   "Ala Makota"   |25
             

In the above example, we use Rax's project operator, to compute ages of people listed in table people. The project statement in Rax has the following syntax:

project[ <list_of_expressions_with_tuple_fields> ] <set_expression>

The list of expressions in project can contain arbitrary expressions that can (but do not have to) refer to the tuple fields in the set expression. The tuple field reference can be in a form of a number (.#1), field label (.name) or field label and number. The set expression must evaluate to a set, but not necessarily to a set of tuples. For example, this is also a valid project expression:

    `print project [.#1 * 2] {1..10};    // Output: {2,4,6,8,10,12,14,16,18,20}
             

The list of expressions in project can also contain ranges of tuple fields, which is very convenient when dealing with wide tables:

    {[#:id1,    #:id2,    #:id3,
      $:name1,  $:name2,  $:name3,
      &:score1, &:score2, &:score3]}: wide_table :=
        {[1, 2, 3, "ala", "ma", "kota", 1.5, 2.5, 3.5]};

    `print project [.id1 .. .id3] wide_table;

    // Output:
    //: #1|#2|#3
    //: --|--|--
    //:  1| 2| 3

    `print project [.name1 ..] wide_table;

    // Output:
    //:   #1 | #2 |  #3  | #4| #5| #6
    //: -----|----|------|---|---|---
    //: "ala"|"ma"|"kota"|1.5|2.5|3.5

    `print project [.. .name3] wide_table;

    // Output:
    //: #1|#2|#3|  #4 | #5 |  #6
    //: --|--|--|-----|----|------
    //:  1| 2| 3|"ala"|"ma"|"kota"

    `print project [..] wide_table;

    // Output:
    //: #1|#2|#3|  #4 | #5 |  #6  | #7| #8| #9
    //: --|--|--|-----|----|------|---|---|---
    //:  1| 2| 3|"ala"|"ma"|"kota"|1.5|2.5|3.5
             

A couple more useful examples of project:

  • Extend points in time to 15-minute intervals:

        project [(|)[.point_in_time, (^)"PT15M"] web_clicks
                     

  • Compute the number of unique respondents in a dataset:

        # ! project [.respondent_id] dataset
                     

  • Compute the first 10 powers of 2:

        project [2**.#1] {1..10}
                     

  • Generate a vector of 1s of length 5:

        project [1] {1..5}
                     

Comparison with SQL.  There is no explicit project operator in SQL. The columns are selected and transformed in the SELECT clause. For example, the "people" query in SQL would look like that[10]:

     SELECT id, name, DATEDIFF(year, date_of_birth, GETDATE()) FROM people
             

It is not possible to refer to columns by numbers in SQL. It is also not possible to address whole ranges of columns.

Selection

Selection comes also directly from relational algebra and is denoted by σ. Selection is a unary operation σΥ(R), where R is a relation and Υ is a boolean expression, whose result is a set of tuples from R for which Υ holds. The select statement in Rax has the following form:

select[ <boolean_expressions_with_tuple_fields> ] <set_expression>

The boolean expression, which we call selection condition can be an arbitrary expression that can (but does not have to) refer to tuple fields in the set expression. A tuple field reference can be in a form of a number, field label or field label and number. For example:

    // Select people born before 1980
    //
    `print select [.#1 == 1 && .date_of_birth.year < 1980] people;

    // Output:
    //: id|       name       |   date_of_birth
    //: --|------------------|-------------------
    //:  1|"Gosia Wrzesinska"|1978-02-08T00:00:00
             

The set expression must evaluate to a set, but not necessarily to a set of tuples. For example:

    {@}: dates := {(@)"2012-11-12", (@)"1978-02-08", (@)"2015-06-12"};

    // Select dates after year 1990
    //
    `print select [.#1.year > 1990] dates;

    // Output: {2012-11-12T00:00:00,2015-06-12T00:00:00}

    // Select dates falling in a period of 25 years
    // starting from the 1st of January 1990
    //
    `print select [.#1 <: (|)"1990-01-01/P25Y"] dates;

    // Output: {2012-11-12T00:00:00}
             

Comparison with SQL.  In SQL, the selection is performed using the SELECT clause. To perform a pure selection (i.e. without any projection), SELECT * should be used. For example the "people" query in SQL would look like this:

    SELECT * FROM people
    WHERE id = 1 AND DATEPART(year, date_of_birth) < 1980
             

Natural join

Natural join is a binary operator between relations: R ⨝ S, where R and S are relations. The result of the natural join is the set of all combinations of tuples in R and S that are equal on their common attribute names. We've already seen an example of a natural join in Chapter 2, Getting started:

    {[$:name, #:dept_id]}: employees := {
      ["Gosia Wrzesinska", 1],
      ["Daniela Gavidia", 1],
      ["Matt Dobson", 2],
      ["Jan-Mark Wams", 3]
    };

    {[#:dept_id, $:dept_name]}: departments := {
      [1, "Software development"],
      [2, "Hardware maintenance"],
      [3, "Entertainment"]
    };

    `print employees |><| departments;

    // Output:
    //:        name       |dept_id|       dept_name
    //: ------------------|-------|----------------------
    //:  "Daniela Gavidia"|   1   |"Software development"
    //: "Gosia Wrzesinska"|   1   |"Software development"
    //:   "Jan-Mark Wams" |   3   |    "Entertainment"
    //:    "Matt Dobson"  |   2   |"Hardware maintenance"
             

In the relational-algebra terminology, we joined relations employees and departments by matching the tuples from both sides for which the value of the dept_id attribute is equal.

Natural join in Rax has the following syntax:

<set_expression> |><| <set_expression>

or

<set_expression> |><| :[ <conditionals> ] <set_expression>

Rax searches for matching tuple fields in both set set expressions (both field name and type must match). Unnamed fields do not match any other fields. If there are no matching fields, the natural join turns into a Cartesian product. For example:

    `print {1,2} |><| {1,2};

    // Output:
    //: #1|#2
    //: --|--
    //:  1| 1
    //:  1| 2
    //:  2| 1
    //:  2| 2
             

You can tell Rax to warn you in such cases, by switching the warning level to 4 (more about warning levels in Chapter 19, Meta instructions and macros):

    %warn 4;
    `print {1,2} |><| {1,2};

    // Error: Rax warning: No matching labels turns |><| into ><
    //
    // Output:
    //: #1|#2
    //: --|--
    //:  1| 1
    //:  1| 2
    //:  2| 1
    //:  2| 2
             

The extra conditions have the following syntax:

:[ <conditions> ]

The extra conditions can be used to select only a part of the tuples from the join result. The boolean expression in the conditions can refer to tuple fields from both set expressions. For example:

    `print employees |><|:[.dept_name == "Software development"] departments;

    // Output:
    //:        name       |dept_id|       dept_name
    //: ------------------|-------|----------------------
    //:  "Daniela Gavidia"|   1   |"Software development"
    //: "Gosia Wrzesinska"|   1   |"Software development"

             

Rax will automatically detect from which set expression the tuple fields in the boolean expression are coming from.

Comparison with SQL.  Not all SQL dialect provide a natural-join operator. For example, in SQL Server, the natural-join operator has to be simulated in this way:

    SELECT * FROM employees JOIN departments
    ON (employees.dept_id = departments.dept_id)
             

Note that, unlike with a natural join, the result of this join will contain the dept_id columns from both sides. In some SQL dialects, this can be shortened to:

    SELECT * FROM employees INNER JOIN departments USING (dept_id)
             

MySQL and PostgreSQL do provide a natural-join operator:

    SELECT * FROM employees NATURAL JOIN departments
             

The result of this statement will contain the dept_id column only once.

Cross join and ϴ-join

Cross join is another name for Cartesian product, which in set theory is defined in the following way:

A × B = {(a,b) | a ∈ A ∧ b ∈ B}

In other words, it is a set of all combinations of elements from sets A and B. In Rax, the cross join is denoted by ><. For example:

    `print {1,2} >< {3,4};

    // Output:
    //: #1|#2
    //: --|--
    //:  1| 3
    //:  1| 4
    //:  2| 3
    //:  2| 4

    `print employees >< departments;

    // Output:
    //:        name       |dept_id|dept_id|       dept_name
    //: ------------------|-------|-------|----------------------
    //:  "Daniela Gavidia"|   1   |   1   |"Software development"
    //:  "Daniela Gavidia"|   1   |   2   |"Hardware maintenance"
    //:  "Daniela Gavidia"|   1   |   3   |    "Entertainment"
    //: "Gosia Wrzesinska"|   1   |   1   |"Software development"
    //: "Gosia Wrzesinska"|   1   |   2   |"Hardware maintenance"
    //: "Gosia Wrzesinska"|   1   |   3   |    "Entertainment"
    //:   "Jan-Mark Wams" |   3   |   1   |"Software development"
    //:   "Jan-Mark Wams" |   3   |   2   |"Hardware maintenance"
    //:   "Jan-Mark Wams" |   3   |   3   |    "Entertainment"
    //:    "Matt Dobson"  |   2   |   1   |"Software development"
    //:    "Matt Dobson"  |   2   |   2   |"Hardware maintenance"
    //:    "Matt Dobson"  |   2   |   3   |    "Entertainment"
             

Like with the natural join, the cross-join operator in Rax can contain extra conditionals:

<set_expression> >< <set_expression>

or

<set_expression> >< :[ <conditionals> ] <set_expression>>

Using the extra conditionals, one can express other types of joins, such as ϴ-join, and its special case - equijoin. For example:

    `print employees ><:[.dept_id#1 == .dept_id#2] departments;

    // Output:
    //        name       |dept_id|dept_id|       dept_name
    // ------------------|-------|-------|----------------------
    //  "Daniela Gavidia"|   1   |   1   |"Software development"
    // "Gosia Wrzesinska"|   1   |   1   |"Software development"
    //   "Jan-Mark Wams" |   3   |   3   |    "Entertainment"
    //    "Matt Dobson"  |   2   |   2   |"Hardware maintenance"
             

A ϴ-join produces all possible combinations of tuples from both relations, for which the ϴ-condition - a comparison between tuple attributes - holds. If the ϴ-condition is an equality condition, we call the join equijoin. In the example above, we joined sets employees and departments on the equality condition .dept_id#1 == .dept_id#2. The extra conditionals are somewhat stronger that the ϴ-condition in the ϴ-join - it can be an arbitrary boolean expression that can (but does not have to) refer to the tuple fields in both sets. Rax will automatically detect from which set expression the tuple fields in the boolean expression are coming from. For duplicate tuple-field labels, use the name+number naming scheme, like in the example above, where .dept_id#1 refers to the tuple from the left side, while .dept_id#2 refers to the tuple from the right side. Note that the result of the join in the example above is almost the same as the result of the natural join in the previous section. The dept_id field is included in the resulting set twice, though.

Comparison with SQL.  Most SQL dialects contain a cross-join operator:

    SELECT * FROM employees CROSS JOIN departments
             

Often, you can also express a cross join in the following way:

    SELECT * FROM employees, departments
             

The SQL standard contains an inner join operator which can be compared to the ϴ-join and equijoin, for example:

    SELECT * FROM employees INNER JOIN departments
    ON (employees.dept_id == departments.dept_id)
             

In most SQL dialects, the INNER keyword can be omitted:

    SELECT * FROM employees JOIN departments
    ON (employees.dept_id == departments.dept_id)
             

Finally, in SQL, you can add arbitrary boolean conditions to a join expression by using the WHERE-clause:

    SELECT * FROM employees JOIN departments
    ON (employees.dept_id == departments.dept_id)
    WHERE employees.dept_id < 2
             

Natural outer joins

Like the natural join described before, natural outer joins match tuples based on tuple fields that are common for both sets (same label and type). Unlike the natural join though, natural outer joins do not require all tuples from both relations to match. The result of an outer join retains each tuple from the input sets, even if no matching record exists. In this case, the result will contain a tuple formed by extending the unmatched tuple with "filler" values. In Rax, every type has a default value which is used as a filler in outer joins. The default values for Rax types are listed in Table 12.1, “Initial Missing Values for Rax Types”, in Chapter 10, More about types.

There are three types of (natural) outer joins:

  • Left outer join in which all tuples from the left set are retained, but only matching tuples from the right set. The filler values will only be used on the right side. Left outer join in Rax is denoted by =><|.

  • Right outer join in which all tuples from the right set are retained but only matching tuples from the left set. The filler values will only be used on the left side. Right outer join in Rax is denoted by |><=.

  • Full outer join in which all tuples from both sets will be retained. The filler values will be used on both sides. Full outer join in Rax is denoted by =><=

Below a couple of examples illustrating all types of outer joins:

    {[$:city, $:province]}: Province := {
      ["Delft","ZH"],
      ["Leek","GR"],
      ["Rotterdam","ZH"]
    };

    {[$:city, $:area]}: Area := {
      ["Delft","Zuid"],
      ["Rotterdam","Zuid"],
      ["Sneek","Noord"]
    };

    `print Province  =><| Area;

    // Output:
    //     city   |province| area
    // -----------|--------|------
    //   "Delft"  |  "ZH"  |"Zuid"
    //    "Leek"  |  "GR"  |  ""
    // "Rotterdam"|  "ZH"  |"Zuid"
             

In the above example, a left outer join is performed between tables Province and Area. The tuples are matched on their common field city. There is no matching tuple for ["Leek","GR"], therefore this tuple is filled up with a default value in the field area (in this case, the default value is the empty string, ""). Note that not all tuples from the right side of the join are in the result set - the tuple ["Sneek","Noord"], for which there is no match, was removed.

    `print Province |><= Area;

    // Output:
    //     city   |province|  area
    // -----------|--------|-------
    //   "Delft"  |  "ZH"  | "Zuid"
    // "Rotterdam"|  "ZH"  | "Zuid"
    //   "Sneek"  |   ""   |"Noord"
    //
             

In the above example, a right outer join is performed between tables Province and Area. This time, the tuple ["Sneek","Noord"] from the right side of the join, for which there is no match on the left side, is retained and filled up with default value in the province field. The ["Leek","GR"], for which there is no match on the right side was removed.

    `print Province =><= Area;

    // Output:
    //     city   |province|  area
    // -----------|--------|-------
    //   "Delft"  |  "ZH"  | "Zuid"
    //    "Leek"  |  "GR"  |   ""
    // "Rotterdam"|  "ZH"  | "Zuid"
    //   "Sneek"  |   ""   |"Noord"
             

Finally, in the full outer join, both tuples for which there is no match on the other side, ["Leek","GR"] and ["Sneek","Noord"], are retained and filled with default values.

The formal syntax of the outer joins in Rax is, for left outer join:

<set_expression> =><| <set_expression>

or

<set_expression> =><| :[ <conditionals> ] <set_expression>

For right outer join:

<set_expression> |><= <set_expression>

or

<set_expression> |><= :[ <conditionals> ] <set_expression>

For full outer join:

<set_expression> =><= <set_expression>

or

<set_expression> =><= :[ <conditionals> ] <set_expression>

As with other types of join, the extra conditionals can contain an arbitrary boolean expression referencing the tuple fields from both sets. The extra conditionals can be used to further restrict the result set of a join. For example:

    `print Province =><| :[.province == "GR"] Area;

    // Output:
    //     city   |province|area
    // -----------|--------|----
    //    "Leek"  |  "GR"  | ""
             

Comparison with SQL.  Most SQL dialects provide some form of outer-join operators. These are not natural join operators, though, which means the join predicate has to be explicitly specified. Here are some examples:

    SELECT * FROM Province RIGHT OUTER JOIN Area
    ON (Province.city = Area.city)

    SELECT * FROM Province LEFT  OUTER JOIN Area
    ON (Province.city = Area.city)

    SELECT * FROM Province FULL  OUTER JOIN Area
    ON (Province.city = Area.city)
             

The most important difference, however, between outer joins in Rax and in SQL are the "filler" values. In SQL, the non-matching tuples are filled with the special NULL value. NULL is a special value in SQL, because it is by definition not in the domain of any of the SQL datatypes, and therefore it's not comparable with "regular" values and cannot be used in many functions and expressions that only accept "regular" values. The special handling that NULL requires often leads to extra SQL code. For example:

    SELECT * FROM Province LEFT OUTER JOIN Area
    ON (Province.city = Area.city)
    WHERE province <> "LB"
    OR ISNULL(province)
             

The last line in the above example is needed, because NULL is not comparable with "LB". In Rax, such extra code is not necessary:

    `print Province =><| :[.province != "LB"] Area;
             

Fold (aggregation or reduction)

Aggregation in relational algebra allows to compute various functions on columns, like summing up their elements, finding the minimum, etc. Relational algebra contains, at least, five such aggregate functions: GSum(), GCount(), GAverage(), GMax() and GMin(). The functions can be performed on the whole columns, or on parts of columns obtained by grouping the rows in the table on the values of the grouping attributes.

In Rax, aggregations are performed using the fold function that applies reduction operators. The syntax of fold resembles the syntax of project:

fold[ <list_of_expressions> ] <set_expression>

Like with project, the list of expressions can contain arbitrary expressions referring to the tuple fields. Additionally, list of expressions can contain aggregate functions on tuple fields. For example:

    {[#:employee_id, #:dept_id, &:salary, &:bonus]}:
      Salaries := {
        [34, 842, 23423.55,  100.00],
        [82, 783, 13546.47,   10.00],
        [89, 783, 48727.45,    0.00],
        [11, 842, 36144.88, 1000.00],
        [83, 783, 64001.67, 1000.00]
      };

    `print fold[.dept_id, /sum(.salary)] Salaries;

    // Output:
    //: dept_id|  salary
    //: -------|---------
    //:   783  |126275.59
    //:   842  | 59568.43
             

In this example, /sum is an aggregate function that sums all values. The tuple fields that are used outside of aggregate functions are used as grouping attributes. In this example, dept_id is the grouping attribute. So the fold expression will compute the sum of salaries per department. As with the project operator, expressions inside a fold operator can be labeled. Aggregates can be labeled on the outside and on the inside. Inside labeling can be used to give every individual aggregate expression its own label. Outside labeling is a short hand for labeling all the aggregate expressions the same. When an outside label as well as inside labels are supplied the outer label is honored and all the inner labels are ignored as demonstrated in the code below:

    `print fold[.dept_id, /sum(.#3):dep_sal] Salaries;

    // Output:
    //: dept_id| dep_sal
    //: -------|---------
    //:   783  |126275.59
    //:   842  | 59568.43

    `print fold[.dept_id, /sum(.#3:ignore):total] Salaries;

    // Output:
    //: dept_id|  total
    //: -------|---------
    //:   783  |126275.59
    //:   842  | 59568.43
             

If more than one expression is supplied to a aggregate like /sum, the reason for having both inside and outside labeling becomes clearer, as demonstrated by the code below:

    `print fold[.dept_id, /sum(.#3:sal,.#4:bon)] Salaries;

    // Output:
    //: dept_id|   sal   | bon
    //: -------|---------|----
    //:   783  |126275.59|1010
    //:   842  | 59568.43|1100

    `print fold[.dept_id, /sum(.#3,.#4):sum] Salaries;

    // Output:
    //: dept_id|   sum   | sum
    //: -------|---------|----
    //:   783  |126275.59|1010
    //:   842  | 59568.43|1100
             

The last fold operator in the example above generates a table with two columns labeled sum. These can be addressed like any tuple or table with repeating labels, by adding an index. For example the second sum can be addressed like so: .sum#2.

The following table lists the aggregate functions that can be used in fold.

Table 8.3. Aggregate functions in fold

SyntaxExplanation
/count() Count tuples in a table/group. For example: fold [/count()] employees will count the rows in the employees table, and fold [.dept_id, /count()] employees will count employees per department.
/count( boolean expression ) Count rows for which the boolean expression is true. For example: fold [/count(.dept_id == 1)] employees will count the number of employees in the department with ID 1.
/count!( field1,...,fieldn ) Count rows that are unique on the set of fields field1,...,fieldn. For example: fold [.dept_id, /count!(.first_name, .last_name)] employees will count the number of unique first name / last name combination in each department.
/sum( field1,...,fieldn ) Return the sum of the values in each field field1,...,fieldn. Can be applied only to fields of type # or &.
/cat( field1,...,fieldn ) Return the string concatenation of the values in each field field1,...,fieldn. Can be applied only to fields of type $. Fields as well as the entire /cat() operator take a separator in the form of ||separator: /cat(.field||"separator")
/product( field1,...,fieldn )
or
/*( field1,...,fieldn )
Return the product of the values in each field field1,...,fieldn. Can be applied only to fields of type # or &.
/min( field1,...,fieldn ) Return the minimum for each field field1,...,fieldn. Can be applied to fields of type #, & and $.
/max( field1,...,fieldn ) Return the maximum for each field field1,...,fieldn. Can be applied to fields of type #, & and $.
/median( field1,...,fieldn ) Return the median for each field field1,...,fieldn. Can be applied to fields of type #, & and $.
/average( field1,...,fieldn ) Return the average (mean) for each field field1,...,fieldn. Can be applied to fields of type # and &.
/&&( field1,...,fieldn ) Return the logical AND of all values for each field field1,...,fieldn. Can be applied to fields of type ?.
/||( field1,...,fieldn ) Return the logical OR of all values for each field field1,...,fieldn. Can be applied to fields of type ?.
/^^( field1,...,fieldn ) Return the logical XOR of all values for each field field1,...,fieldn. Can be applied to fields of type ?.
/&( field1,...,fieldn ) Return the bitwise AND of all values for each field field1,...,fieldn. Can be applied to fields of type #.
/|( field1,...,fieldn ) Return the bitwise OR of all values for each field field1,...,fieldn. Can be applied to fields of type #.
/^( field1,...,fieldn ) Return the bitwise XOR of all values for each field field1,...,fieldn. Can be applied to fields of type #.


Comparison with SQL.  In SQL, aggregate functions can be used in the SELECT clause mixed with other expressions on columns. Unlike in Rax, the grouping columns have to be specified explicitly in the GROUP BY clause. Non-grouping columns cannot be used outside aggregate functions. For example:

    SELECT dept_id, SUM(salary)
    FROM employee_salaries
    GROUP BY dept_id
               

All SQL dialects support the five basic aggregate functions: COUNT/COUNT DISTINCT, SUM, AVG, MIN and MAX. Some SQL dialects offer additional functions. Note, that SQL's aggregate functions need to be well defined on unordered sets (since SQL tables are unordered), whereas, since all Rax sets are ordered, the Rax fold operator can perform operations like string concatenation where the order is important.

Partitioning

Informally, partitioning is splitting a set into subsets based on certain properties of the elements. More formally, a partition of a set S is a set of sets P where:

  1. P does not contain the empty set.

  2. The union of all sets in P is equal to S.

  3. The sets in P are pairwise disjoint.

In Rax, sets can be partitioned using the partition operator, with the following syntax:

partition[ <list_of_tuple_fields> ] <set_expression>

The set defined by the 'set_expression' will be partitioned in such a way, that values of the tuple fields on the 'list_of_tuple_fields' are equal in each subset. For example:

    `print partition [.dept_id] employees;

    // Output:
    // {
    //   {["Daniela Gavidia", 1], ["Gosia Wrzesinska", 1]},
    //   {["Matt Dobson", 2]},
    //   {["Jan-Mark Wams", 3]}
    // }
             

In this example, we partitioned the set of employees on their department. The result is a set of 3 sets: one per each department. Each set contains employees working in the given department.

Comparison with SQL.  Since in SQL the result of every query has to be a table, and a partition is a set of tables, there is no explicit partitioning operator. SQL's window functions provide a way of specifying table partitioning. The window function is applied to each subset separately and the subsets are unified to provide the end result, for example:

    SELECT employee_id, dept_id, avg(salary) OVER (PARTITION BY dept_id)
    FROM employee_salaries;
             



[10] We're using the SQL Server dialect of SQL - date and time functions are different in different SQL dialects