Port / Add Documentation for VarianceSample and VariancePopulation (
alamb authored Oct 7, 2024
1 parent 84c9409 commit b3bf3af
Showing 3 changed files with 125 additions and 45 deletions.
47 changes: 44 additions & 3 deletions datafusion/functions-aggregate/src/variance.rs
@@ -18,22 +18,24 @@
//! [`VarianceSample`]: variance sample aggregations.
//! [`VariancePopulation`]: variance population aggregations.

use std::{fmt::Debug, sync::Arc};

use arrow::{
array::{Array, ArrayRef, BooleanArray, Float64Array, UInt64Array},
buffer::NullBuffer,
compute::kernels::cast,
datatypes::{DataType, Field},
};
use std::sync::OnceLock;
use std::{fmt::Debug, sync::Arc};

use datafusion_common::{
downcast_value, not_impl_err, plan_err, DataFusionError, Result, ScalarValue,
};
use datafusion_expr::aggregate_doc_sections::DOC_SECTION_GENERAL;
use datafusion_expr::{
function::{AccumulatorArgs, StateFieldsArgs},
utils::format_state_name,
Accumulator, AggregateUDFImpl, GroupsAccumulator, Signature, Volatility,
Accumulator, AggregateUDFImpl, Documentation, GroupsAccumulator, Signature,
Volatility,
};
use datafusion_functions_aggregate_common::{
aggregate::groups_accumulator::accumulate::accumulate, stats::StatsType,
@@ -135,6 +137,26 @@ impl AggregateUDFImpl for VarianceSample {
) -> Result<Box<dyn GroupsAccumulator>> {
Ok(Box::new(VarianceGroupsAccumulator::new(StatsType::Sample)))
}

fn documentation(&self) -> Option<&Documentation> {
Some(get_variance_sample_doc())
}
}

static VARIANCE_SAMPLE_DOC: OnceLock<Documentation> = OnceLock::new();

fn get_variance_sample_doc() -> &'static Documentation {
VARIANCE_SAMPLE_DOC.get_or_init(|| {
Documentation::builder()
.with_doc_section(DOC_SECTION_GENERAL)
.with_description(
"Returns the statistical sample variance of a set of numbers.",
)
.with_syntax_example("var(expression)")
.with_standard_argument("expression", "Numeric")
.build()
.unwrap()
})
}

pub struct VariancePopulation {
@@ -222,6 +244,25 @@ impl AggregateUDFImpl for VariancePopulation {
StatsType::Population,
)))
}
fn documentation(&self) -> Option<&Documentation> {
Some(get_variance_population_doc())
}
}

static VARIANCE_POPULATION_DOC: OnceLock<Documentation> = OnceLock::new();

fn get_variance_population_doc() -> &'static Documentation {
VARIANCE_POPULATION_DOC.get_or_init(|| {
Documentation::builder()
.with_doc_section(DOC_SECTION_GENERAL)
.with_description(
"Returns the statistical population variance of a set of numbers.",
)
.with_syntax_example("var_pop(expression)")
.with_standard_argument("expression", "Numeric")
.build()
.unwrap()
})
}

/// An accumulator to compute variance
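
The two `documentation()` methods added above each return a `&'static` reference to a value that is built once, on first use, through `std::sync::OnceLock`. The following is a minimal standalone sketch of that pattern, not the DataFusion code itself: `Doc` is a hypothetical stand-in for `datafusion_expr::Documentation`, and its two fields are assumptions chosen only to mirror the description and syntax example set in the diff.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for `datafusion_expr::Documentation`
/// (the real type is built with `Documentation::builder()`).
#[derive(Debug)]
pub struct Doc {
    pub description: &'static str,
    pub syntax_example: &'static str,
}

/// Process-wide cell; empty until the first call to the getter below.
static VARIANCE_SAMPLE_DOC: OnceLock<Doc> = OnceLock::new();

/// Returns the shared `Doc`, constructing it on the first call only.
/// Later calls (from any thread) get the same `&'static` reference.
pub fn get_variance_sample_doc() -> &'static Doc {
    VARIANCE_SAMPLE_DOC.get_or_init(|| Doc {
        description: "Returns the statistical sample variance of a set of numbers.",
        syntax_example: "var(expression)",
    })
}
```

Calling `get_variance_sample_doc()` twice yields pointers to the same value, which is why the trait method can return `Option<&Documentation>` without allocating on every call.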
42 changes: 0 additions & 42 deletions docs/source/user-guide/sql/aggregate_functions.md
@@ -240,9 +240,6 @@ last_value(expression [ORDER BY expression])
- [stddev](#stddev)
- [stddev_pop](#stddev_pop)
- [stddev_samp](#stddev_samp)
- [var](#var)
- [var_pop](#var_pop)
- [var_samp](#var_samp)
- [regr_avgx](#regr_avgx)
- [regr_avgy](#regr_avgy)
- [regr_count](#regr_count)
@@ -349,45 +346,6 @@ stddev_samp(expression)

#### Arguments

- **expression**: Expression to operate on.
Can be a constant, column, or function, and any combination of arithmetic operators.

### `var`

Returns the statistical variance of a set of numbers.

```
var(expression)
```

#### Arguments

- **expression**: Expression to operate on.
Can be a constant, column, or function, and any combination of arithmetic operators.

### `var_pop`

Returns the statistical population variance of a set of numbers.

```
var_pop(expression)
```

#### Arguments

- **expression**: Expression to operate on.
Can be a constant, column, or function, and any combination of arithmetic operators.

### `var_samp`

Returns the statistical sample variance of a set of numbers.

```
var_samp(expression)
```

#### Arguments

- **expression**: Expression to operate on.
Can be a constant, column, or function, and any combination of arithmetic operators.

81 changes: 81 additions & 0 deletions docs/source/user-guide/sql/aggregate_functions_new.md
@@ -36,6 +36,11 @@ Aggregate functions operate on a set of values to compute a single result.
- [bit_and](#bit_and)
- [bit_or](#bit_or)
- [bit_xor](#bit_xor)
- [var](#var)
- [var_pop](#var_pop)
- [var_population](#var_population)
- [var_samp](#var_samp)
- [var_sample](#var_sample)

### `bit_and`

@@ -72,3 +77,79 @@ bit_xor(expression)
#### Arguments

- **expression**: Integer expression to operate on. Can be a constant, column, or function, and any combination of operators.

### `var`

Returns the statistical sample variance of a set of numbers.

```
var(expression)
```

#### Arguments

- **expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

#### Aliases- var_sample

- var_samp

### `var_pop`

Returns the statistical population variance of a set of numbers.

```
var_pop(expression)
```

#### Arguments

- **expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

#### Aliases- var_population

### `var_pop`

Returns the statistical population variance of a set of numbers.

```
var_pop(expression)
```

#### Arguments

- **expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

#### Aliases- var_population

### `var`

Returns the statistical sample variance of a set of numbers.

```
var(expression)
```

#### Arguments

- **expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

#### Aliases- var_sample

- var_samp

### `var`

Returns the statistical sample variance of a set of numbers.

```
var(expression)
```

#### Arguments

- **expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

#### Aliases- var_sample

- var_samp
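
The documented difference between `var` (sample) and `var_pop` (population) comes down to the divisor: the sum of squared deviations from the mean is divided by `n - 1` for the sample variance and by `n` for the population variance. The sketch below illustrates this with Welford's online update. It is an illustration only, not the accumulator from `variance.rs`; `VarianceState` and `variance_of` are hypothetical names, and returning `None` for undefined cases is a choice made for this sketch rather than behavior stated in this commit.

```rust
/// Hypothetical running state for Welford's online variance algorithm.
#[derive(Debug, Default, Clone, Copy)]
pub struct VarianceState {
    count: u64,
    mean: f64,
    /// Running sum of squared deviations from the current mean.
    m2: f64,
}

impl VarianceState {
    /// Folds one value into the running count, mean, and `m2`.
    pub fn update(&mut self, value: f64) {
        self.count += 1;
        let delta = value - self.mean;
        self.mean += delta / self.count as f64;
        self.m2 += delta * (value - self.mean);
    }

    /// Sample variance: divides by `n - 1`; undefined for fewer than 2 values.
    pub fn sample(&self) -> Option<f64> {
        (self.count > 1).then(|| self.m2 / (self.count - 1) as f64)
    }

    /// Population variance: divides by `n`; undefined for an empty input.
    pub fn population(&self) -> Option<f64> {
        (self.count > 0).then(|| self.m2 / self.count as f64)
    }
}

/// Convenience helper (hypothetical): accumulates a whole slice.
pub fn variance_of(values: &[f64]) -> VarianceState {
    let mut state = VarianceState::default();
    for &v in values {
        state.update(v);
    }
    state
}
```

For the values `2, 4, 4, 4, 5, 5, 7, 9` the mean is 5 and the squared deviations sum to 32, so the population variance is 32 / 8 = 4 and the sample variance is 32 / 7.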
