Beautiful Perl feature: BLOCKs
Source: Dev.to
Beautiful Perl series
This post is part of the beautiful Perl features series; the introductory post gives a general overview of the series.
The BLOCK Construct
Today’s topic is the construct that the Perl documentation calls a BLOCK: a sequence of statements enclosed in curly brackets {}.
The concept and syntax are common to many programming languages – they also appear in C‑heritage languages such as Java, JavaScript, and C++ – but Perl differs in several important ways. Read on to dig into the details.
Where BLOCKs Can Be Used
1. As part of a compound statement – after an initial control‑flow construct like if, while, foreach, etc.
2. As the body of a sub declaration (subroutine or function).
3. Anywhere a single statement is expected – a bare BLOCK can stand in for a statement; enclosing a sequence of statements in a BLOCK creates a new delimited lexical scope, so that the effect of inner declarations ends when control flow exits the BLOCK. (Details about lexical scopes are discussed below.)
4. As part of a do expression – the whole BLOCK becomes a value that can be inserted into a more complex expression. This can improve clarity and avoid a subroutine call when efficiency matters.
All modern programming languages have constructs equivalent to usages 1 and 2, because they are crucial for structuring algorithms and handling complexity. Usage 3 is less common, and usage 4 is quite particular to Perl. The next sections will cover these aspects in more depth.
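The four usages listed above can be seen side by side in a minimal sketch. The names `@log`, `double`, `$temporary` and `$total` are made up for illustration only.

```perl
use strict;
use warnings;

my @log;    # collects messages so the flow can be inspected afterwards

# Usage 1: BLOCK as part of a compound statement
foreach my $n (1 .. 3) {
    push @log, "loop $n";
}

# Usage 2: BLOCK as the body of a sub declaration
sub double {
    my ($value) = @_;
    return 2 * $value;
}

# Usage 3: a bare BLOCK where a single statement is expected
{
    my $temporary = double(21);     # only visible inside this BLOCK
    push @log, "bare block $temporary";
}

# Usage 4: do BLOCK; its last statement yields the value of the expression
my $total = 100 + do {
    my $sum = 0;
    $sum += $_ for 1 .. 4;
    $sum;
};
push @log, "total $total";

print "$_\n" for @log;
```

The last two lines printed are `bare block 42` and `total 110`; neither `$temporary` nor `$sum` exists outside its BLOCK.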
Lexical Scopes Created by BLOCKs
In every usage listed above, a Perl BLOCK always opens a new lexical scope – a portion of code that delimits the effect of inner declarations. Things that can be temporarily declared inside a lexical scope are:
| Kind | Description |
|---|---|
| Lexical variables | Declared with my or state; temporarily bind a name to a memory location on the stack. |
| Lexical pragmata | Imported semantics for the current BLOCK; introduced with use or no. |
| Lexical subroutines | Available since Perl 5.18 (experimental until 5.26); only accessible within the scope; declared with my sub or state sub. |
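As a rough illustration of two rows of this table, the sketch below combines a state variable with a lexical subroutine. It assumes Perl 5.26 or later, where lexical subs need no feature declaration; `next_ticket`, `label` and `@labels` are hypothetical names.

```perl
use v5.26;          # enables strict, say and state; lexical subs are stable from 5.26
use warnings;

sub next_ticket {
    state $counter = 0;         # state variable: initialised once, private to this sub
    return ++$counter;
}

my @labels;
{
    # lexical subroutine: the name "label" exists only inside this BLOCK
    my sub label {
        my ($n) = @_;
        return "ticket-$n";
    }
    push @labels, label(next_ticket()) for 1 .. 3;
}

say for @labels;                        # ticket-1, ticket-2, ticket-3

# A lexical sub never enters the package symbol table
my $leaked = main->can('label');
say $leaked ? "label() leaked" : "label() is not visible as a package sub";
```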
When a BLOCK is used as part of a compound statement (if, foreach, etc.), the initial clause before the BLOCK is already part of the lexical scope, so variables declared in that clause can be used inside the BLOCK:
foreach my $member (@list) {
    work_with($member);    # $member is usable here
}
say $member;               # ERROR: $member is no longer in scope
The same holds when a compound statement has several clauses (e.g., if … elsif … else …). Further examples, together with detailed explanations, can be found in perlsub.
Common Facts About Declarations Inside Lexical Scopes
- Declarations take effect starting from the statement after the declaration. They may appear anywhere in the BLOCK; the common practice is to place them at the beginning, but it is not required.
- They may temporarily shadow declarations in higher scopes.
- Their effect ends when control flow exits the BLOCK, regardless of how the exit occurs (return, next, goto, an exception, etc.).
Declarations in lexical scopes have effects both at compile time (the interpreter temporarily alters its parsing rules) and at run time (the interpreter temporarily allocates or releases resources). For example:
{
    my $db_handle = DBI->connect(@db_connection_args);
    # DBI expects an attributes hashref between the SQL and the bind values
    my @large_array = $db_handle->selectall_array($some_sql, {}, @some_bind_values);
    open my $file_handle, '>', $output_file
        or die "could not open $output_file: $!";
    print $file_handle formatted_output($_) foreach @large_array;
}
- Compile‑time: the interpreter knows that $db_handle, @large_array, and $file_handle are allowed inside the BLOCK but not outside it, enabling static checks for typos or misuse.
- Run‑time: the interpreter dynamically allocates the database handle, the array, and the file handle, and releases them automatically when control flow leaves the BLOCK.
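The run-time side can be observed without a database. The sketch below uses a hypothetical `Resource` class whose destructor records when Perl releases the object; the release happens as soon as the enclosing BLOCK is left, whether through a normal return or through an exception.

```perl
use strict;
use warnings;

my @events;     # records what happens, in order

package Resource {
    sub new {
        my ($class, $name) = @_;
        push @events, "acquire $name";
        return bless { name => $name }, $class;
    }
    sub DESTROY {
        my ($self) = @_;
        push @events, "release $self->{name}";
    }
}

sub process {
    my ($fail) = @_;
    my $resource = Resource->new($fail ? "B" : "A");
    die "something went wrong\n" if $fail;
    push @events, "work done";
    return;
}

process(0);                         # normal exit from the BLOCK
eval { process(1) };                # exit through an exception
chomp(my $error = $@ // '');
push @events, "caught: $error" if $error;

print "$_\n" for @events;
```

The recorded order is: acquire A, work done, release A, acquire B, release B, caught: something went wrong.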
This behaviour is similar to what happens in a statically‑typed language like Java. By contrast, Python – often grouped with Perl because both are dynamically typed – does not treat lexical scopes the same way.
Perl BLOCKs vs. Python suites
In Python there is no generally available construct as versatile as a Perl BLOCK.
Sequences of statements are expressed through indentation, but this is only allowed as part of a function definition or a compound statement. A compound statement must start with a keyword (if, for, while, etc.) that opens the header clause and is followed by a suite:
if is_success():
    summary = summarize_results()
    report_to_user(summary)
cleanup_resources()
A suite in Python is not to be confused with a block:
| Python concept | Definition (official docs) |
|---|---|
| Block | “A piece of Python program text that is executed as a unit” – occurs only within a module, a function body, or a class definition. |
| Suite | “A group of statements controlled by a clause” – occurs whenever a clause in a compound statement expects to be followed by some instructions. |
Both look similar because they are expressed as indented sequences of statements, but the crucial difference is that a block opens a new lexical scope, while a suite does not. Consequently, variables declared in a compound statement remain available after the statement has ended – a behaviour that can be surprising for programmers coming from Perl.
Lexical vs. Dynamic Scoping
Languages from the C tradition (including Perl and Java) confine a loop variable to the loop itself, whereas in Python the loop variable remains accessible after the loop has ended:
for i in [1, 2, 3]:
    pass
print(i)  # → 3
The snippet above is valid Python code and prints 3.
See the excellent explanation at [link] for why Python works differently from many other languages.
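For comparison, here is a small Perl sketch of the same loop. Even when the loop reuses the name of a variable declared earlier, Perl treats it as local to the loop, and the earlier value is visible again afterwards. The names `$i` and `@seen` are illustrative.

```perl
use strict;
use warnings;

my $i = "before";
my @seen;
foreach $i (1, 2, 3) {      # reuses the outer name, but only for the duration of the loop
    push @seen, $i;
}
print "seen: @seen\n";      # seen: 1 2 3
print "after: $i\n";        # after: before
```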
In addition to traditional lexical scoping, Perl also has a construct named dynamic scoping, introduced through the keyword local.
Dynamic scoping is a vestige from Perl 1, but it is still useful in some specific use‑cases; it will be discussed in a future post in this series.
For the moment, let us just say that in all common situations lexical scoping is the most appropriate mechanism for working with variables guaranteed not to interfere with the global state of the program.
Lexical Variables in Perl
Lexical variables are introduced with the my keyword. Several variables—possibly of different types—can be declared and initialized in a single statement:
my @table = ([qw/x y z/], [1, 2, 3], [9, 8, 7]);
my ($nb_rows, $nb_cols) = (scalar @table, scalar $table[0]->@*);
my (@db_connection_args, %select_args);
Declaration Rules
- Lexical variables can only be used starting from the statement after the declaration.
- Therefore the following is illegal in Perl (under use strict) because $x is not yet in scope when it appears on the right‑hand side:
my ($x, $y, $z) = (123, $x + 1, $x + 2); # ❌ illegal
By contrast, languages that evaluate the right‑hand side sequentially (e.g., Java, JavaScript) accept similar code:
int x = 123, y = x + 1, z = x + 2; // Java
let x = 123, y = x + 1, z = x + 2; // JavaScript
Python behaves like Perl: it does not accept x, y = 123, x + 1 because x is undefined on the right‑hand side.
Both Perl and Python have had “destructuring” (list unpacking) from the start; other languages adopted similar features later.
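The Perl rule can be checked without aborting a script by compiling the offending statement at run time with a string eval. This is only a sketch; `$compiled` and `$error` are illustrative names.

```perl
use strict;
use warnings;

# The candidate code is compiled at run time with a string eval,
# so the compile error can be captured instead of aborting this script.
my $compiled = eval q{
    use strict;
    my ($x, $y, $z) = (123, $x + 1, $x + 2);
    1;
};
my $error = $@;

print $compiled ? "compiled\n" : "rejected: $error";
```

Under strict, the captured message complains that the global symbol `$x` requires an explicit package name, which confirms that the new `$x` is not yet in scope inside its own initializer.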
Destructuring in Other Languages
| Language | Syntax | Notes |
|---|---|---|
| JavaScript (ES6, 2015) | let [a, b, c] = [123, 124, 125]; | Works for arrays and objects. |
| Java (Amber project) | Record patterns (records only, not lists yet) | Record patterns were finalized in Java 21. |
// Sequential assignments
let x = 123, y = x + 1, z = x + 2;
// List destructuring
let [a, b, c] = [123, 124, 125];
Common Perl List‑Destructuring Idioms
Extracting items from command‑line arguments
my ($user, $password, @others) = @ARGV;
Extracting items from a subroutine’s argument list
my ($height, $width, $depth) = @_;
Swapping variables
($x, $y) = ($y, $x);
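The three idioms can be combined in one runnable sketch. The `volume` subroutine and the `@arguments` array (a stand-in for `@ARGV`) are assumptions made for the example.

```perl
use strict;
use warnings;

sub volume {
    my ($height, $width, $depth) = @_;      # unpack the argument list
    return $height * $width * $depth;
}

my @arguments = qw(alice s3cret --verbose --dry-run);   # stand-in for @ARGV
my ($user, $password, @others) = @arguments;            # the array absorbs the remainder

my ($x, $y) = (1, 2);
($x, $y) = ($y, $x);                                    # swap without a temporary variable

print "user=$user others=@others\n";                    # user=alice others=--verbose --dry-run
print "x=$x y=$y volume=", volume(2, 3, 4), "\n";       # x=2 y=1 volume=24
```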
Variable Shadowing
A lexical variable in Perl can shadow another variable of the same name in an outer lexical scope. The shadowing effect starts after the declaration statement, so the outer variable can still be used in the initializer:
my $x = 987;
{
    my ($x, $y) = (123, $x + 1);
    say "inner scope, x is $x and y is $y";   # → inner scope, x is 123 and y is 988
}
say "outer scope, x is $x";                    # → outer scope, x is 987
How Other Dynamically‑Typed Languages Handle Shadowing
Python
Python has no explicit variable declarations; any assignment implicitly creates a lexical variable in the current scope.
def foo():
    x = 123   # declares lexical variable x
    y = 456   # declares lexical variable y
    x = 789   # re‑assigns x
Because declaration is implicit, the interpreter cannot help detect typographical errors that would be caught in languages with explicit declarations. For example, the last assignment above might be a typo (perhaps z was intended instead of re‑assigning x).
UnboundLocalError Example
If an assignment appears anywhere in a function, the name is treated as a lexical variable for the entire function body, even before the assignment line. This can surprise newcomers:
x = 10
def foo():
    print(x)  # ← raises UnboundLocalError
    x += 1
foo()
The x += 1 makes x a local variable for the whole function, so the print(x) tries to read an uninitialized local variable.
If the assignment is removed (or commented out), the code works as expected:
x = 10
def foo():
    print(x)  # prints 10
    # x += 1
foo()
global and nonlocal
Python provides the statements global and nonlocal to override the default lexical‑variable creation:
- global name – tells the parser that name refers to the module‑level (global) variable.
- nonlocal name – tells the parser that name refers to a variable in the nearest enclosing (but not global) scope.
These declarations apply to the entire current lexical scope, regardless of where they appear, which makes the rule exclusive: a name must be either
- a lexical variable of the current scope, or
- a variable from an outer (non‑global) scope (nonlocal), or
- a global variable (global).
Thus, unlike Perl or Java where you declare lexical variables, in Python you declare the variables that are not lexical.
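The Perl side of this contrast can be sketched with a closure. Because the only declaration is the explicit `my`, an inner subroutine can update a variable of the enclosing scope without any counterpart of nonlocal. The names `make_counter` and `$count` are illustrative.

```perl
use strict;
use warnings;

sub make_counter {
    my $count = 0;              # lexical variable of the enclosing scope
    return sub {
        $count += 1;            # updates the outer $count; no extra declaration needed
        return $count;
    };
}

my $counter = make_counter();
$counter->() for 1 .. 2;
print "third call returns ", $counter->(), "\n";    # third call returns 3
```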
Variable Shadowing and Lexical Scoping in Different Languages
JavaScript
The historical construct for declaring lexical variables in JavaScript was the var keyword, which is still present in the language.
var behaves similarly to Python’s lexical variables:
- Variables appear to exist before they are declared (called hoisting).
- They are scoped by functions or modules, not by blocks, so they retain their values after exiting a block.
- The interpreter does not complain if a variable is declared twice.
Because of these quirks, var is now generally discouraged; since ES6 (2015) it has been superseded by two keywords:
- const, for variables that are never reassigned after initialization, and
- let, for mutable variables.
These newer constructs introduced more safety:
- A let/const variable cannot be used after exiting its block.
- Redeclarations raise syntax errors.
Temporal Dead Zone
One ambiguity remains: the shadowing effect of a variable declared with let does not start at the location of the declaration, but at the beginning of the enclosing block.
This is no longer called hoisting, but it still means that from the start of the block the new name shadows any variable with the same name in outer scopes.
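In JavaScript literature the region between the start of the block and the declaration is known as the temporal dead zone; reading the variable there throws a ReferenceError.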
Java
Java has no ambiguity with shadowing because it takes a more radical approach: it raises a compile‑time error when a local variable is declared in an inner block with a name already used by a local variable of an enclosing scope in the same method.
public class ScopeDemo {
    public static void main(String[] args) {
        int x = 987;
        {
            int x = 123, y = x + 1, z = x + 2;
            System.out.println("here x is " + x + " and y is " + y);
        }
        System.out.println("here x is " + x);
    }
}
Compilation result
ScopeDemo.java:6: error: variable x is already defined in method main(String[])
int x = 123, y = x + 1, z = x + 2;
^
1 error
error: compilation failed
Perl – Lexical Pragmata
In Perl, lexical scopes are used not only to control the lifetime of lexical variables but also to control lexical pragmata that temporarily alter interpreter behaviour.
Pragmata can add semantics (use) or remove semantics (no).
Example 1 – Suppressing Warnings Locally
use strict;
use warnings;
foreach my $user (get_users_from_database()) {
    no warnings 'uninitialized';
    my $body = "Dear $user->{firstname} $user->{lastname}, bla bla bla";
    ...
}
- warnings is enabled globally (good practice).
- Inside the loop we temporarily disable the uninitialized warning because some database fields may be undef, which is acceptable in that context.
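The lexical extent of the pragma can be made visible by capturing warnings in an array. In this sketch the same expression is evaluated twice, once inside a BLOCK with the warning disabled and once outside; `%user`, `$quiet` and `$noisy` are hypothetical names.

```perl
use strict;
use warnings;

my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };         # capture warnings instead of printing

my %user = (firstname => 'Ada', lastname => undef);     # hypothetical record with a missing field

my $quiet = do {
    no warnings 'uninitialized';                        # only affects this BLOCK
    "Dear $user{firstname} $user{lastname},";
};
my $noisy = "Dear $user{firstname} $user{lastname},";   # same expression, warnings active

print scalar(@warnings), " warning(s) captured\n";      # 1 warning(s) captured
print $warnings[0];
```

Only the second evaluation produces a "Use of uninitialized value" warning, although both strings end up identical.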
Example 2 – Disabling strict 'refs' for Symbolic References
foreach my $method_name (@list_of_names) {
    no strict 'refs';
    *{$method_name} = generate_closure_for($method_name);
}
- strict 'refs' forbids symbolic references, which are exactly what is needed to insert new subroutines into a module’s symbol table by name.
- When generating methods dynamically (e.g., in DBIx::DataModel), we temporarily lift that restriction.
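A self-contained variant of this technique is sketched below. It generates accessor methods for a hypothetical `Point` class; it is not the code used by DBIx::DataModel, only an illustration of lifting strict 'refs' for the smallest possible scope.

```perl
use strict;
use warnings;

package Point {
    sub new {
        my ($class, %fields) = @_;
        return bless {%fields}, $class;
    }

    # Generate one accessor method per field name
    foreach my $field (qw(x y)) {
        no strict 'refs';                           # allow the symbolic glob reference below
        *{__PACKAGE__ . "::$field"} = sub { $_[0]{$field} };
    }
}

my $point = Point->new(x => 3, y => 4);
print "x=", $point->x, " y=", $point->y, "\n";      # x=3 y=4
```

The restriction is lifted only inside the loop body; the rest of the package still runs under full strictures.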
Example 3 – Reinforcing Controls: Autovivification
my $tree; # $tree is undef
$tree->{foo}{bar}[1] = 99; # autovivifies; now $tree is { foo => { bar => [undef, 99] } }
To disable autovivification:
{
    no autovivification qw/fetch store/;
    my $tree;                    # $tree is undef
    $tree->{foo}{bar}[1] = 99;   # ERROR: Can't vivify reference
}
- The autovivification module changes Perl’s default behaviour of creating intermediate references on‑the‑fly.
- Disabling it can make code safer when accidental structure creation would be a bug.
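The kind of accidental structure creation mentioned above can be reproduced with core Perl alone. In this sketch a simple existence test on a nested key silently creates the intermediate hash; the variable names are illustrative.

```perl
use strict;
use warnings;

my $tree;                                       # undef
my $found = exists $tree->{foo}{bar};           # a mere lookup ...
my @keys  = sort keys %$tree;                   # ... has created $tree->{foo} on the way

print "found: ", ($found ? "yes" : "no"), "\n"; # found: no
print "keys now: @keys\n";                      # keys now: foo
```

With the autovivification module installed from CPAN, placing `no autovivification;` at the top of the enclosing BLOCK should, as far as its documentation describes, prevent this side effect for the code in that BLOCK.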
Nesting Lexical Pragmata
Like lexical variables, lexical pragmata can be nested; the innermost use/no temporarily shadows previous declarations of the same pragma.
Other examples of lexical pragmata include:
- bigint – transparently performs all arithmetic with Math::BigInt.
- Regexp::Grammars – adds grammatical parsing features to Perl regexes.
The perlpragma documentation explains how module authors can implement new lexical pragmata.
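As a small sketch of how such a pragma is confined to its BLOCK, the example below evaluates the same expression with native arithmetic and with bigint, which ships with core Perl. The variable names are illustrative.

```perl
use strict;
use warnings;

my $plain = 2**64;              # native arithmetic: no object is involved

my $exact = do {
    use bigint;                 # lexical pragma: active only until the end of this BLOCK
    2**64;
};

my $plain_again = 2**64;        # native arithmetic again

print "inside : $exact\n";      # inside : 18446744073709551616
print "outside: $plain\n";      # typically a floating-point approximation
print "object inside? ", (ref $exact ? "yes" : "no"), "\n";
```

Inside the BLOCK the result is a Math::BigInt object holding the exact integer; before and after the BLOCK the same source text yields an ordinary Perl number.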
do BLOCK – Inserting a Block Anywhere in an Expression
Perl’s do BLOCK construct lets you place a block anywhere in an expression.
The value of the last statement in the block becomes the value of the whole do expression, which can then be used by surrounding operators.
Example 1 – Cheap XML Entity Encoding
my %ENTITY_TABLE = ( '<' => '&lt;', '>' => '&gt;', '&' => '&amp;' );
my $entity_regex = do {
    my $chars = join "", keys %ENTITY_TABLE;
    qr/[$chars]/
};
# later …
$text =~ s/($entity_regex)/$ENTITY_TABLE{$1}/g; # encode entity characters in $text
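For readers who want to run this example end to end, here is a self-contained sketch. The wrapper `encode_entities` is a hypothetical helper name, and the table maps each character to its standard XML entity.

```perl
use strict;
use warnings;

my %ENTITY_TABLE = ('<' => '&lt;', '>' => '&gt;', '&' => '&amp;');

# $chars exists only while the regex is being built
my $entity_regex = do {
    my $chars = join "", keys %ENTITY_TABLE;
    qr/[$chars]/;
};

sub encode_entities {       # hypothetical helper name
    my ($text) = @_;
    $text =~ s/($entity_regex)/$ENTITY_TABLE{$1}/g;
    return $text;
}

print encode_entities("a < b && c > d"), "\n";   # a &lt; b &amp;&amp; c &gt; d
```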
Example 2 – Conditional Record Parsing (from Excel::ValueReader::XLSX)
my $row = $args{want_records}
    ? do { my %r; @r{ @{$args{columns}} } = @$vals; \%r }
    : $vals;
If the caller wants records, the do block performs a hash‑slice assignment into a lexical hash variable to create a new record on the fly. Otherwise the row is simply the array reference $vals.
The do BLOCK form keeps the surrounding expression tidy while allowing a small, self‑contained computation.
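The hash-slice idiom can be tried in isolation with the sketch below. The `make_row` wrapper and its sample data are assumptions made for the example, loosely modelled on the snippet above rather than taken from the module.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the conditional expression
sub make_row {
    my ($vals, %args) = @_;
    return $args{want_records}
        ? do { my %r; @r{ @{$args{columns}} } = @$vals; \%r }   # hashref: column => value
        : $vals;                                                # plain arrayref
}

my @columns = qw(name age);
my $record  = make_row(['Ada', 36], want_records => 1, columns => \@columns);
my $plain   = make_row(['Ada', 36]);

print "record: name=$record->{name} age=$record->{age}\n";     # record: name=Ada age=36
print "plain : @$plain\n";                                      # plain : Ada 36
```

Each call gets a fresh `%r`, because the lexical hash is created anew every time the do BLOCK is entered.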
Thanks to BLOCKs, lexical scoping can be introduced very flexibly almost anywhere in Perl code. The semantics of lexical variables and lexical pragmata are cleanly defined: the lexical effect starts at the next statement after the declaration and ends when the block exits, without the surprises seen in some other languages.
The shadowing effect of lexical variables in inner scopes is easily understandable and consistent across all higher scopes, including the next enclosing lexical scopes and the global module scope.
*What a beautiful language design!*
The next post will be about **dynamic scoping** through the `local` keyword – another, complementary way for temporarily changing the behaviour of the interpreter.
> The picture is an excerpt from the initial movement of Verdi's *Requiem*, at a place where Verdi shadows several characteristics of the movement: for a short while the orchestra stays still, leaving the choir a cappella, with a different speed, different tonality, and different dynamics; then after this parenthesis, all parameters come back to their initial state, *come prima*, as stated in the score.