Summary -

In this topic, we described about the below sections -

There are two types of built-in functions available in Hive.

  1. General built-in functions
  2. Aggregate built-in functions

General built-in functions -

The below are the list of General built in functions -

Return TypeFunction Name (Signature)Description
BIGINT round(double a) returns the rounded BIGINT value of the double
BIGINT floor(double a) returns the maximum BIGINT value that is equal or less than the double
BIGINT ceil(double a) returns the minimum BIGINT value that is equal or greater than the double
double rand(), rand(int seed) returns a random number that changes from row to row.
string concat(string A, string B,...) returns the string resulting from concatenating B after A.
string substr(string A, int start) returns the substring of A starting from start position till the end of string A.
string substr(string A, int start, int length) returns the substring of A starting from start position with the given length
string upper(string A) returns the string resulting from converting all characters of A to upper case
string ucase(string A) Same as upper
string lower(string A) returns the string resulting from converting all characters of B to lower case
string lcase(string A) Same as lower
string trim(string A) returns the string resulting from trimming spaces from both ends of A
string ltrim(string A) returns the string resulting from trimming spaces from the beginning(left hand side) of A.
string rtrim(string A) returns the string resulting from trimming spaces from the end(right hand side) of A.
string regexp_replace(string A, string B, string C) returns the string resulting from replacing all substrings in B that match the Java regular expression syntax with C.
int size(Map) returns the number of elements in the map type
int size(Array) returns the number of elements in the array type
value of cast( as ) "converts the results of the expression expr to . A null is returned if the conversion does not succeed."
string from_unixtime(int unixtime) convert the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the format of "1970-01-01 00:00:00"
string to_date(string timestamp) Return the date part of a timestamp string: to_date("1970-01-01 00:00:00") = "1970-01-01"
int year(string date) Return the year part of a date or a timestamp string: year("1970-01-01 00:00:00") = 1970, year("1970-01-01") = 1970
int month(string date) Return the month part of a date or a timestamp string: month("1970-11-01 00:00:00") = 11, month("1970-11-01") = 11
int day(string date) Return the day part of a date or a timestamp string: day("1970-11-01 00:00:00") = 1, day("1970-11-01") = 1
string get_json_object(string json_string, string path) "Extract json object from a json string based on json path specified, and return json string of the extracted json object. It will return null if the input json string is invalid."

Aggregate built-in functions -

The below are the list of aggregate functions.

Return TypeAggregation Function Name (Signature)Description
BIGINT count(*), count(expr), count(DISTINCT expr[, expr_.]) count(*) - Returns the total number of retrieved rows, including rows containing NULL values;
DOUBLE sum(col), sum(DISTINCT col) returns the sum of the elements in the group or the sum of the distinct values of the column in the group
DOUBLE avg(col), avg(DISTINCT col) returns the average of the elements in the group or the average of the distinct values of the column in the group
DOUBLE min(col) returns the minimum value of the column in the group
DOUBLE max(col) returns the maximum value of the column in the group