Reducing data
Here we demonstrate how to ask a few questions with "scalar" answers - like "Does the table contain x?", or "What is the average value of y?"
Testing containment
One of the most basic questions to ask is: "Is this element in the table/column?". Julia's in
operator is perfect for this.
julia> t = Table(name = ["Alice", "Bob", "Charlie"], age = [25, 42, 37])
Table with 2 columns and 3 rows:
name age
┌─────────────
1 │ Alice 25
2 │ Bob 42
3 │ Charlie 37
julia> in("Alice", t.name)
true
julia> in("Debbie", t.name)
false
The in
function can also be used as an infix operator, as in "Alice" in t.name
or "Alice" ∈ t.name
.
"How many?"
The count
method is useful for asking how many rows satisfy a certain criterion.
julia> count(row -> row.age > 40, t)
1
Totals, averages, etc.
Individual columns can be reduced in the typical way for Julia arrays. Some examples.
julia> sum(t.age)
104
julia> using Statistics
julia> mean(t.age)
34.666666666666664
julia> median(t.age)
37.0
julia> join(t.name, ", ", " and ")
"Alice, Bob and Charlie"
Note that join
is a string joining function; see innerjoin
(from SplitApplyCombine) for the relational operation.
It's just as easy to calculate multi-column statistics by reducing over the entire table.
julia> mapreduce(row -> length(row.name) * row.age, +, t)
510