This document introduces the basic syntax and examples of estimation functions.
Function | Syntax | Description |
---|---|---|
approx_distinct | approx_distinct(x) | Returns the approximate number of distinct input values (column x). |
approx_percentile | approx_percentile(x,percentage) | Sorts the values in the x column in ascending order and returns the value approximately at the given `percentage` position. |
approx_percentile | approx_percentile(x, array[percentage01, percentage02...]) | Sorts the values in the x column in ascending order and returns the values approximately at the given `percentage` positions (percentage01, percentage02...). |
The approx_distinct function returns the approximate number of distinct input values of a field.
approx_distinct(x)
Parameter | Description |
---|---|
x | The parameter value can be of any type. |
Return value type: bigint
Use the count function to calculate the page view (PV) value, and use the approx_distinct function to estimate the number of distinct values of the ip field (the client IP) as the unique visitor (UV) value.
* | SELECT count(*) AS PV, approx_distinct(ip) AS UV
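To make the difference between PV and UV concrete, the following Python sketch computes both values exactly over a small in-memory sample. The `logs` records and the `pv_uv` helper are hypothetical and exist only for this illustration. The sketch does not reproduce how approx_distinct works internally; it only shows the exact quantity that approx_distinct estimates.

```python
# Illustration only: exact PV and UV over a hypothetical in-memory log sample.
# approx_distinct returns an estimate of the distinct count computed here.
logs = [
    {"ip": "10.0.0.1"},
    {"ip": "10.0.0.2"},
    {"ip": "10.0.0.1"},
    {"ip": "10.0.0.3"},
    {"ip": "10.0.0.2"},
]


def pv_uv(records, field="ip"):
    """Return (PV, UV): total record count and number of distinct field values."""
    pv = len(records)  # corresponds to count(*)
    # Records that lack the field do not contribute to the distinct count.
    uv = len({r[field] for r in records if field in r})
    return pv, uv


pv, uv = pv_uv(logs)
print(f"PV={pv}, UV={uv}")
```

With this sample there are five records from three distinct IP addresses, so PV is 5 and UV is 3.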
The approx_percentile function sorts the values of a target field in ascending order and returns the value or values approximately at the given percentage position or positions.
To return the value approximately at a single percentage position: approx_percentile(x, percentage)
To return the values approximately at multiple percentage positions (percentage01, percentage02...): approx_percentile(x, array[percentage01, percentage02...])
Parameter | Description |
---|---|
x | Value type: double |
percentage | Value range: [0,1] |
Return value type: double or array
Example 1: sort the values of the resTotalTime column and return the value of resTotalTime approximately at the 50% position
* | select approx_percentile(resTotalTime,0.5)
Example 2: sort the values of the resTotalTime column and return the values of resTotalTime approximately at the 20%, 40%, and 60% positions
* | select approx_percentile(resTotalTime, array[0.2,0.4,0.6])
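The "sort, then pick the value at a percentage position" behavior described above can be sketched in Python as follows. This is a minimal illustration that assumes a nearest-rank definition of the position. This page does not state which rule or approximation approx_percentile uses, so real results may differ. The `res_total_time` sample and both helper functions are hypothetical.

```python
import math


def percentile_at(values, percentage):
    """Exact nearest-rank percentile: sort ascending, pick the value at the position."""
    if not 0 <= percentage <= 1:
        raise ValueError("percentage must be in the range [0, 1]")
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    # Round before ceil to guard against floating-point noise such as 6.000000000000001.
    rank = max(1, math.ceil(round(percentage * len(ordered), 9)))
    return ordered[rank - 1]


def percentiles_at(values, percentages):
    """Array form: one result per requested percentage, in the order given."""
    return [percentile_at(values, p) for p in percentages]


# Hypothetical response times, standing in for the resTotalTime column.
res_total_time = [120.0, 35.0, 80.0, 15.0, 200.0, 60.0, 45.0, 95.0, 150.0, 30.0]

print(percentile_at(res_total_time, 0.5))
print(percentiles_at(res_total_time, [0.2, 0.4, 0.6]))
```

The first call mirrors the single-percentage form and returns one value. The second mirrors the array form and returns a list with one value per requested position.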