Chapter 3. Functions and Arguments

To create a function in the PL/R language, use standard R syntax, but without the enclosing braces or function assignment. Instead of myfunc <- function(arguments) { function body }, the body of your PL/R function is just function body

CREATE OR REPLACE FUNCTION funcname (argument-types)
RETURNS return-type AS '
    function body
' LANGUAGE 'plr';
    

The body of the function is simply a piece of R script. When the function is called, the argument values are passed as variables arg1 ... argN to the R script. The result is returned from the R code in the usual way. For example, a function returning the greater of two integer values could be defined as:

CREATE OR REPLACE FUNCTION r_max (integer, integer) RETURNS integer AS '
    if (arg1 > arg2)
       return(arg1)
    else
       return(arg2)
' LANGUAGE 'plr' STRICT;
    

Starting with PostgreSQL 8.0, arguments may be explicitly named when creating a function. If an argument is explicitly named at function creation time, that name will be available to your R script in place of the usual argN variable. For example:

CREATE OR REPLACE FUNCTION sd(vals float8[]) RETURNS float AS '
    sd(vals)
' LANGUAGE 'plr' STRICT;
    

Starting with PostgreSQL 8.4, a PL/R function may be declared to be a WINDOW. In this case, in addition to the usual argN (or named) variables, PL/R automatically creates several other arguments to your function. For each explicit argument, a corresponding variable called farg1 ... fargN is passed to the R script. These contain an R vector of all the values of the related argument for the moving WINDOW frame within the current PARTITION. For example:

CREATE OR REPLACE
FUNCTION r_regr_slope(float8, float8)
RETURNS float8 AS
$BODY$
  slope <- NA
  y <- farg1
  x <- farg2 
  if (fnumrows==9) try (slope <- lm(y ~ x)$coefficients[2])
  return(slope)
$BODY$
LANGUAGE plr WINDOW;
    

In the preceding example, farg1 and farg2 are R vectors containing the current row's data plus that of related rows. The determination as to which rows qualify as related is determined by the frame specification of the query at run time.

The example also illustrates one of two additional autogenerated arguments. fnumrows is the number of rows in the current WINDOW frame. The other (not shown) auto-argument is called prownum. This argument provides the 1-based row offset of the current row in the current PARTITION. See Chapter 9 for more information and a more complete example.

In some of the the definitions above, note the clause STRICT, which saves us from having to think about NULL input values: if a NULL is passed, the function will not be called at all, but will just return a NULL result automatically. In a non-strict function, if the actual value of an argument is NULL, the corresponding argN variable will be set to a NULL R object. For example, suppose that we wanted r_max with one null and one non-null argument to return the non-null argument, rather than NULL:

CREATE OR REPLACE FUNCTION r_max (integer, integer) RETURNS integer AS '
    if (is.null(arg1) && is.null(arg2))
        return(NULL)
    if (is.null(arg1))
        return(arg2)
    if (is.null(arg2))
        return(arg1)
    if (arg1 > arg2)
       return(arg1)
    arg2
' LANGUAGE 'plr';
    

As shown above, to return a NULL value from a PL/R function, return NULL. This can be done whether the function is strict or not.

Composite-type (tuple) arguments are passed to the procedure as R data.frames. The element names of the frame are the attribute names of the composite type. If an attribute in the passed row has the NULL value, it will appear as an "NA" in the frame. Here is an example:

CREATE TABLE emp (name text, age int, salary numeric(10,2));
INSERT INTO emp VALUES ('Joe', 41, 250000.00);
INSERT INTO emp VALUES ('Jim', 25, 120000.00);
INSERT INTO emp VALUES ('Jon', 35, 50000.00);
CREATE OR REPLACE FUNCTION overpaid (emp) RETURNS bool AS '
    if (200000 < arg1$salary) {
        return(TRUE)
    }
    if (arg1$age < 30 && 100000 < arg1$salary) {
        return(TRUE)
    }
    return(FALSE)
' LANGUAGE 'plr';
SELECT name, overpaid(emp) FROM emp;
 name | overpaid
------+----------
 Joe  | t
 Jim  | t
 Jon  | f
(3 rows)

    

There is also support for returning a composite-type result value:

CREATE OR REPLACE FUNCTION get_emps() RETURNS SETOF emp AS '
    names <- c("Joe","Jim","Jon")
    ages <- c(41,25,35)
    salaries <- c(250000,120000,50000)
    df <- data.frame(name = names, age = ages, salary = salaries)
    return(df)
' LANGUAGE 'plr';
select * from get_emps();
 name | age |  salary
------+-----+-----------
 Jim  |  41 | 250000.00
 Joe  |  25 | 120000.00
 Jon  |  35 |  50000.00
(3 rows)

    

An alternative method may be used to create a function in PL/R, if certain criteria are met. First, the function must be a simple call to an existing R function. Second, the function name used for the PL/R function must match that of the R function exactly. If these two criteria are met, the PL/R function may be defined with no body, and the arguments will be passed directly to the R function of the same name. For example:

create or replace function sd(_float8) returns float as '' language 'plr';
select round(sd('{1.23,1.31,1.42,1.27}'::_float8)::numeric,8);
   round    
------------
 0.08180261
(1 row)
    

Tip: Because the function body is passed as an SQL string literal to CREATE FUNCTION, you have to escape single quotes and backslashes within your R source, typically by doubling them.