Syntax Reference#
A reference for the Data Step statements supported by limulus.
For differences from the SAS language, see Differences from SAS language.
For the Column-Oriented API and Session-level helpers such as sort, sql, and include, see API Reference and Changelog.
DATA Statement#
data <output1> [<output2> ...];
...
run;
Specifying multiple output destinations creates a multi-output DATA step.
Using an explicit output statement allows row-level routing.
SET Statement#
set <dataset> [<dataset2> ...] [end=<var>] [indsname=<var>];
| Option | Description |
|---|---|
| end=<var> | Defines a variable that becomes |
| indsname=<var> | Stores the current dataset name in a variable |
Listing multiple datasets performs a vertical concatenation.
Interleaving with BY#
set <dataset1> <dataset2> ...;
by <key>;
When by is specified together with multiple datasets, rows from all sources are
merge-sorted by the BY key rather than concatenated in source order.
This matches “interleaving” semantics.
data combined;
    set sales2023 sales2024;
    by date;
run;
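The merge-sort behaviour can be pictured with a small plain-Python sketch. It does not use the limulus API; the names store_a and store_b are made up for illustration, and the only assumption is that each input is already sorted by the BY key.

```python
import heapq
from operator import itemgetter

# Two hypothetical inputs, each already sorted by the BY key ("date").
store_a = [{"date": "2024-01-05", "amt": 10}, {"date": "2024-03-01", "amt": 7}]
store_b = [{"date": "2024-02-10", "amt": 4}, {"date": "2024-04-02", "amt": 9}]

# Plain concatenation keeps source order: all of store_a, then all of store_b.
concatenated = store_a + store_b

# Interleaving merge-sorts the inputs by the key instead.
interleaved = list(heapq.merge(store_a, store_b, key=itemgetter("date")))

print([row["date"] for row in interleaved])
```

heapq.merge is stable, so rows with equal keys come out in the order the inputs were listed.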
MERGE Statement#
merge <dataset>(in=<var>) [<dataset2>(in=<var>)];
by <key>;
The in= variable is 1 when a row exists in the source table, 0 otherwise.
If by is not specified, the datasets are simply concatenated.
If duplicate column names exist, an error is raised instead of overwriting.
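As a rough illustration of the in= flags and the duplicate-column check, here is a plain-Python sketch. It is not the limulus implementation: match_merge, in_left and in_right are hypothetical names, each input is assumed to hold at most one row per key, and the flags are kept in the output rows only so they can be inspected.

```python
def match_merge(left, right, key):
    """Sketch of a keyed merge with in= style flags (one row per key assumed)."""
    left_cols = {c for row in left for c in row} - {key}
    right_cols = {c for row in right for c in row} - {key}
    clash = left_cols & right_cols
    if clash:
        # Mirrors the documented rule: duplicates raise instead of overwriting.
        raise ValueError(f"duplicate column names: {sorted(clash)}")

    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    merged = []
    for k in sorted(left_by_key.keys() | right_by_key.keys()):
        row = {key: k}
        for cols, source in ((left_cols, left_by_key), (right_cols, right_by_key)):
            src = source.get(k, {})
            for col in sorted(cols):
                row[col] = src.get(col)  # None when this source has no row
        row["in_left"] = int(k in left_by_key)
        row["in_right"] = int(k in right_by_key)
        merged.append(row)
    return merged


customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bo"}]
orders = [{"id": 2, "total": 30}, {"id": 3, "total": 5}]
merged = match_merge(customers, orders, "id")
for row in merged:
    print(row)
```

Filtering on the flags (for example, keeping only rows where both are 1) then gives the usual inner-join style subset.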
WHERE Statement#
where <condition>;
Placed after SET/MERGE. Applies a filter at data read time.
Subsetting IF Statement#
if <condition>;
Applies a filter after data is read.
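The difference between the two filters can be sketched in plain Python. This is an illustration only and assumes SAS-like behaviour, namely that statements placed before a subsetting IF still run for rows that are later dropped, whereas rows rejected by WHERE never enter the step. The function names and the seen counter are made up.

```python
rows = [{"x": 1}, {"x": -2}, {"x": 3}]


def step_with_where(rows):
    """Filter at read time: rejected rows never reach the step body."""
    out, seen = [], 0
    for row in (r for r in rows if r["x"] > 0):
        seen += 1  # counts only rows that passed the WHERE filter
        out.append({**row, "seen": seen})
    return out


def step_with_if(rows):
    """Filter after reading: earlier statements run for every row."""
    out, seen = [], 0
    for row in rows:
        seen += 1  # runs before the subsetting IF, so it counts all rows
        if not row["x"] > 0:
            continue  # subsetting IF drops the row here
        out.append({**row, "seen": seen})
    return out


print(step_with_where(rows))
print(step_with_if(rows))
```

Both versions keep the same rows; only the values computed before the filter differ.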
IF / ELSE IF / ELSE Statement#
if <condition> then <statement>;
else if <condition> then <statement>;
else <statement>;
Use DO...END to group multiple statements:
if x > 0 then do;
    y = x * 2;
    z = 1;
end;
DO / END Statement#
do <var> = <start> to <stop> [by <step>];
...
end;
Used for counter loops as well as conditional blocks (if...then do;).
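For readers coming from Python, a counter loop can be pictured as the generator below. This is a sketch under one assumption the text above does not state: that, as in SAS, the stop value is inclusive. do_range is a hypothetical helper, not part of limulus.

```python
def do_range(start, stop, step=1):
    """Yield start, start+step, ... up to and including stop (assumed inclusive)."""
    if step == 0:
        raise ValueError("step must be non-zero")
    value = start
    while (value <= stop) if step > 0 else (value >= stop):
        yield value
        value += step


print(list(do_range(1, 5)))
print(list(do_range(10, 1, -3)))
```

Note the contrast with Python's built-in range, which excludes its upper bound.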
BY Statement#
by <var> [<var2> ...];
Used in combination with SET/MERGE.
Makes FIRST.<var> / LAST.<var> automatic variables available.
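What the FIRST./LAST. flags mean can be shown with a short plain-Python sketch. It assumes the rows are already sorted by the BY variable; add_first_last and the first_/last_ column names are hypothetical stand-ins for the automatic variables.

```python
def add_first_last(rows, key):
    """Flag the first and last row of each BY group (rows sorted by key)."""
    flagged = []
    for i, row in enumerate(rows):
        is_first = i == 0 or rows[i - 1][key] != row[key]
        is_last = i == len(rows) - 1 or rows[i + 1][key] != row[key]
        flagged.append(
            {**row, f"first_{key}": int(is_first), f"last_{key}": int(is_last)}
        )
    return flagged


sales = [
    {"region": "east", "amt": 5},
    {"region": "east", "amt": 7},
    {"region": "west", "amt": 2},
]
flagged = add_first_last(sales, "region")
for row in flagged:
    print(row)
```

A group with a single row gets both flags set, as with "west" here.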
OUTPUT Statement#
output [<dataset>];
Without arguments, writes to all output destinations.
With an argument, writes only to the specified dataset.
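The two forms of output can be sketched in plain Python as follows. This is an illustration of the routing rule only, not limulus code: route_rows and router are hypothetical, with the router returning a dataset name for output <dataset>; and None for a bare output;.

```python
def route_rows(rows, outputs, router):
    """Write each row to one named destination, or to all when router gives None."""
    results = {name: [] for name in outputs}
    for row in rows:
        target = router(row)
        targets = outputs if target is None else [target]
        for name in targets:
            results[name].append(dict(row))
    return results


def router(row):
    if row["amount"] is None:
        return None  # like a bare `output;`: goes to every destination
    return "large" if row["amount"] >= 100 else "small"


rows = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": 500},
    {"id": 3, "amount": None},
]
routed = route_rows(rows, ["small", "large"], router)
print({name: [r["id"] for r in data] for name, data in routed.items()})
```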
STOP Statement#
stop;
Stops the Data Step processing.
KEEP / DROP Statement#
keep <var1> [<var2> ...];
drop <var1> [<var2> ...];
Specifies variables to include in / exclude from the output dataset.
RENAME Statement#
rename <oldname>=<newname> [<oldname>=<newname> ...];
RETAIN Statement#
retain <var> [<initial>] [<var2> [<initial2>] ...];
Retains the value of a variable across iterations. If the initial value is omitted, defaults to null.
ARRAY Statement#
array <name> <var1> [<var2> ...];
Assigns a name to a group of variables. Index access (name[1]) is supported.
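One way to picture an array is as a thin view that maps an index onto the underlying variables of the current row. The sketch below is plain Python, not the limulus implementation, and it assumes SAS-style 1-based subscripts, which the name[1] example suggests but does not confirm. VarArray is a hypothetical class.

```python
class VarArray:
    """Index-based view over named variables in a row (1-based, assumed)."""

    def __init__(self, row, names):
        self.row = row
        self.names = list(names)

    def _name(self, index):
        if not 1 <= index <= len(self.names):
            raise IndexError(f"array subscript {index} out of range")
        return self.names[index - 1]

    def __getitem__(self, index):
        return self.row[self._name(index)]

    def __setitem__(self, index, value):
        # Writing through the array updates the underlying variable.
        self.row[self._name(index)] = value

    def __len__(self):
        return len(self.names)


row = {"q1": 10, "q2": 20, "q3": 30}
scores = VarArray(row, ["q1", "q2", "q3"])
scores[2] = scores[1] + 5
print(row)
```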
SUM Statement (Cumulative Addition)#
<var> + <expression>;
Adds the expression to <var> on every iteration. The running total carries over between rows without an explicit RETAIN.
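The accumulation can be sketched in plain Python. Two details here are assumptions borrowed from SAS rather than statements from this page: the accumulator starts at 0, and missing values are skipped instead of nulling the total. running_total is a hypothetical helper.

```python
def running_total(rows, source, target):
    """Emulate `target + source;` with an accumulator kept across rows."""
    total = 0  # assumed initial value, as in SAS
    out = []
    for row in rows:
        if row[source] is not None:  # assumed: missing values are ignored
            total += row[source]
        out.append({**row, target: total})
    return out


rows = [{"amt": 10}, {"amt": None}, {"amt": 5}]
totals = running_total(rows, "amt", "cum_amt")
print([r["cum_amt"] for r in totals])
```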
ASSIGN Statement (Assignment)#
<varname> = <expression>;
DELETE Statement#
delete;
Discards the current row without writing it and moves on to the next iteration of the PDV (program data vector) loop.
Built-in Functions#
String#
| Function | Description | Example |
|---|---|---|
| | Substring | |
| | Convert to uppercase | |
| | Convert to lowercase | |
| | Capitalize first letter | |
| | Remove trailing spaces | |
| | Remove leading and trailing spaces | |
| | String length | |
| | NULL-safe string length | |
| | Reverse string | |
| | n-th token | |
| | Remove specified characters | |
| | Position of substring | |
| | Position of substring | |
| | Word replacement | |
| | Character translation | |
| | Word count | |
| | Concatenate | |
| | Concatenate with trim | |
| | Concatenate with trailing trim | |
| | Concatenate with delimiter | |
| | Repeat n times | |
Numeric#
| Function | Description | Example |
|---|---|---|
| | Absolute value | |
| | Round (0.5 rounds away from zero) | |
| | Ceiling | |
| | Floor | |
| | Integer part | |
| | Remainder | |
| | Maximum | |
| | Minimum | |
| | Sum | |
| | Mean | |
| | Square root | |
| | Natural logarithm | |
| | Exponent | |
| | Sign | |
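The rounding rule noted in the table differs from Python's built-in round, which sends ties to the nearest even number. The sketch below shows the half-away-from-zero rule with the standard decimal module. round_half_away is a hypothetical helper for comparison and is not the limulus function itself.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_away(value, ndigits=0):
    """Round to ndigits (>= 0) decimal places, ties going away from zero."""
    quantum = Decimal(10) ** -ndigits  # e.g. ndigits=2 -> Decimal('0.01')
    # str() avoids carrying binary floating-point noise into the Decimal.
    return float(Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP))


print(round(2.5), round_half_away(2.5))
print(round(-2.5), round_half_away(-2.5))
```

Despite its name, ROUND_HALF_UP rounds ties away from zero, so negative halves move toward the larger magnitude.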
Missing Values / Dates#
| Function | Description | Example |
|---|---|---|
| | Returns | |
| | Count of missing values | |
| | Count of missing values (mixed types) | |
| | Previous row’s value | |
| | Month/day/year to date | |
| | Year from date | |
| | Date difference | |
Regular Expressions#
| Function | Description | Example |
|---|---|---|
| | Regex match position | |
| | Regex substitution | |
ARRAY Helpers#
| Function | Description | Example |
|---|---|---|
| | Number of elements in array | |
| | Variable name of array element | |
Custom Functions#
| Function | Description | Example |
|---|---|---|
| | Next row’s value | |
| | negative n behaves like | |
| | Apply a function. | |
Note:
apply() is not supported by the Rust backend. When backend="auto" (the default), execution automatically falls back to the Python backend whenever apply() appears in the code. To suppress the fallback and always use the Python backend, set backend="python" on the Session.
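The "previous row's value" and "next row's value" entries in the tables above amount to shifting a column by a number of rows. The plain-Python sketch below shows that idea with two hypothetical helpers, previous_values and next_values. It assumes positions without a neighbour yield None and that the shift is purely positional, which may differ from how limulus evaluates these functions inside conditional code.

```python
def previous_values(values, n=1):
    """For each position, the value n rows earlier, or None if there is none."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [values[i - n] if i - n >= 0 else None for i in range(len(values))]


def next_values(values, n=1):
    """For each position, the value n rows later, or None if there is none."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [values[i + n] if i + n < len(values) else None for i in range(len(values))]


amounts = [10, 20, 30]
print(previous_values(amounts))
print(next_values(amounts))
```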