BoxLang 1.14.0 ships with a new dynamic first-class Set type baked directly into the language. Not a wrapper you reach for manually, not a createObject( "java", "java.util.HashSet" ) incantation pasted from a Stack Overflow answer years ago. A real BoxSet with literal syntax, operator overloads, a full functional pipeline, change listeners, JSON serialization, and deep Java interop.
If you have ever deduplicated an array with a loop, compared two collections element by element, or modeled a permission system on top of a struct -- Sets are the tool you were missing. Let's dig in.
Why Sets? The Problem First
Arrays are ordered, indexed, and allow duplicates. Structs are key/value maps. Both are foundational, but neither models one of the most common real-world shapes: a bag of unique things.
Think about:
- A user's assigned roles: [ "admin", "editor", "admin" ] -- that duplicate is a bug waiting to happen
- Tags on a blog post -- order usually doesn't matter, uniqueness does
- Active feature flags -- membership testing is the only operation you need
- Two datasets that need to be compared -- what's new, what's gone, what's shared?
- Need to know all the incoming arguments in exact order and content?
- URL deduplication in a crawler
- A permission intersection: what can this user do on this resource?
Before BoxSet you'd approximate all of these with arrays (slow arrayContains lookups, manual dedup loops) or structs (keys as values, awkward serialization). Both are workarounds. BoxSet is the real answer.
Meet BoxSet
BoxSet is a first-class BoxLang type that wraps java.util.Set with full language integration. Under the hood it selects among three Java backing implementations depending on what you need:
┌─────────────────┬─────────────────────┬────────────────────────────────┐
│ BoxSet Type │ Java Backing │ Characteristic │
├─────────────────┼─────────────────────┼────────────────────────────────┤
│ default / hash │ HashSet │ No ordering, fastest lookups │
│ linked / ordered│ LinkedHashSet │ Preserves insertion order │
│ sorted / tree │ TreeSet │ Natural ascending order always │
└─────────────────┴─────────────────────┴────────────────────────────────┘
Every variant enforces uniqueness automatically. Every variant supports the full member-function API, operator overloads, and functional pipeline. Java Set objects you get from third-party libraries slot right in -- BoxLang wraps them without copying.
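To make the table concrete, here is a small Java-level sketch of the three backing classes. It only exercises java.util directly; the class name BackingSetDemo and its helper methods are made up for illustration, and it makes no claim about how BoxLang picks or wraps these classes internally.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

class BackingSetDemo {
    // LinkedHashSet: duplicates dropped, first-insertion order kept.
    static List<Integer> linkedOrder(List<Integer> input) {
        return List.copyOf(new LinkedHashSet<>(input));
    }

    // TreeSet: duplicates dropped, natural ascending order.
    static List<Integer> sortedOrder(List<Integer> input) {
        return List.copyOf(new TreeSet<>(input));
    }

    // HashSet: duplicates dropped, iteration order unspecified,
    // so only the size is safe to rely on.
    static int uniqueCount(List<Integer> input) {
        return new HashSet<>(input).size();
    }

    public static void main(String[] args) {
        List<Integer> input = List.of(9, 1, 5, 3, 9, 1);
        System.out.println(linkedOrder(input)); // [9, 1, 5, 3]
        System.out.println(sortedOrder(input)); // [1, 3, 5, 9]
        System.out.println(uniqueCount(input)); // 4
    }
}
```

The practical takeaway: pick linked or sorted whenever a caller will observe iteration order, and treat the default variant's order as unspecified.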
Creating Sets: Every Path
BoxLang gives you several ergonomic creation paths depending on your context.
setNew() -- the workhorse
// Empty default (hash) Set -- fastest, no ordering
s = setNew()
// Seed it on creation
s = setNew( values=[ "alpha", "beta", "gamma", "alpha" ] )
s.size() // 3 -- duplicate dropped automatically
// Linked: preserves insertion order
s = setNew( type="linked", values=[ "c", "a", "b" ] )
s.toArray() // ["c", "a", "b"] -- order preserved
// Sorted: natural ascending order, always
s = setNew( type="sorted", values=[ 9, 1, 5, 3 ] )
s.toArray() // [1, 3, 5, 9]
// Case-sensitive: treat "Hello" and "hello" as distinct values
s = setNew( values=[ "Hello", "hello", "HELLO" ], caseSensitive=true )
s.size() // 3
setOf() -- varargs shorthand
When you know your values up front, setOf() is the cleanest expression:
roles = setOf( "admin", "editor", "viewer" )
primes = setOf( 2, 3, 5, 7, 11, 13 )
Duplicates in the argument list are silently dropped -- no error, no fuss.
Literal syntax: set{ ... }
BoxLang 1.14.0 introduces a set literal syntax that reads like the concept itself:
// Inline literal
permissions = set{ "read", "write", "execute" }
// Empty set
empty = set{}
// Spread an array directly into a set literal
extra = [ 4, 5, 6 ]
merged = set{ 1, 2, 3, ...extra }
// Result: {1, 2, 3, 4, 5, 6}
// Spread another set
defaults = set{ "read", "write" }
full = set{ "execute", ...defaults }
// Spread a range -- this is particularly elegant
digits = set{ ...(0..9) }
// Result: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
The set token is parser-gated so it does not collide with variables named set. Your existing code is safe.
Converting from other types
// Array to Set -- deduplication for free
tags = [ "boxlang", "jvm", "boxlang", "oss" ].toSet()
tags.size() // 3
// Preserve insertion order while deduplicating
orderedTags = [ "c", "a", "b", "a" ].toSet( "linked" )
// Sorted from array
sorted = [ 9, 1, 5, 3 ].toSet( "sorted" )
// Split a delimited string directly into a Set
csv = "admin,editor,viewer,admin".listToSet()
csv.size() // 3
// Custom delimiter
pipe = "read|write|execute|read".listToSet( delimiter="|", type="linked" )
// From a Query -- distinct column values in one shot
q = queryNew( "name,dept", "varchar,varchar", [
[ "Alice", "Engineering" ],
[ "Bob", "Marketing" ],
[ "Carol", "Engineering" ]
] )
depts = q.columnData( "dept" ).toSet()
// Result: {"Engineering", "Marketing"}
The Three Variants in Practice
Choosing the right variant matters for correctness and performance.
Default (HashSet) -- when order doesn't matter
// Permission checking: order is irrelevant, membership speed is everything
userRoles = setOf( "editor", "viewer", "moderator" )
adminRoles = setOf( "admin", "superadmin" )
canAdmin = userRoles.some( r -> adminRoles.contains( r ) )
// false
// Feature flags -- constant-time lookup regardless of flag count
activeFlags = setNew( values=queryGetColumn( flagQuery, "flagName" ) )
if ( activeFlags.contains( "dark_mode_v2" ) ) {
// render dark mode
}
Linked (LinkedHashSet) -- when insertion order matters
// Breadcrumb trail -- visited pages in order, no revisits
trail = setNew( type="linked" )
trail.add( "/home" )
trail.add( "/products" )
trail.add( "/products/123" )
trail.add( "/home" ) // already there, silently ignored
trail.toArray()
// ["/home", "/products", "/products/123"]
// Processing pipeline stages -- ordered, deduplicated
pipeline = setNew( type="linked", values=[ "validate", "enrich", "normalize", "validate" ] )
for ( stage in pipeline ) {
runStage( stage ) // validate only runs once
}
Sorted (TreeSet) -- when natural order is always required
// Version numbers stored as strings -- they sort lexicographically, not numerically
versions = setNew( type="sorted", values=[ "1.14.0", "1.9.0", "2.0.0", "1.10.0" ] )
versions.toArray()
// ["1.10.0", "1.14.0", "1.9.0", "2.0.0"] -- lexicographic, watch your versioning scheme
// Integer ranges -- always sorted
scores = setNew( type="sorted" )
scores.addAll( [ 87, 42, 99, 55, 87, 42 ] )
scores.toArray()
// [42, 55, 87, 99]
// Get min and max from the sorted array (note that toArray() copies every element)
arr = scores.toArray()
writeDump( "Low: #arr[ 1 ]#, High: #arr[ arr.len() ]#" )
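The lexicographic version ordering shown above is the classic sorted-set trap. The article does not say whether setNew( type="sorted" ) accepts a custom comparator, so the following is a Java-level sketch only: a java.util.TreeSet built with a hypothetical BY_VERSION comparator that compares dotted versions segment by segment. It also shows first() and last(), which read the minimum and maximum of a TreeSet without copying it into an array.

```java
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class VersionSortDemo {
    // Compare dotted versions numerically, one segment at a time.
    // Assumes every segment is a plain non-negative integer.
    // Missing segments count as 0, so "1.0" and "1.0.0" are treated as equal
    // and a TreeSet using this comparator keeps only one of them.
    static final Comparator<String> BY_VERSION = (a, b) -> {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    };

    static TreeSet<String> sortVersions(List<String> versions) {
        TreeSet<String> sorted = new TreeSet<>(BY_VERSION);
        sorted.addAll(versions);
        return sorted;
    }

    public static void main(String[] args) {
        TreeSet<String> v = sortVersions(List.of("1.14.0", "1.9.0", "2.0.0", "1.10.0"));
        System.out.println(v);         // [1.9.0, 1.10.0, 1.14.0, 2.0.0]
        System.out.println(v.first()); // 1.9.0
        System.out.println(v.last());  // 2.0.0
    }
}
```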
Membership and Iteration
Testing membership
granted = setOf( "read", "write", "execute" )
granted.contains( "read" ) // true
granted.has( "delete" ) // false (alias for contains)
// Test multiple at once
granted.containsAll( [ "read", "write" ] ) // true
granted.containsAll( [ "read", "sudo" ] ) // false
Iterating
tags = setNew( type="linked", values=[ "boxlang", "jvm", "oss" ] )
// for-in loop
for ( tag in tags ) {
println( tag )
}
// each() -- cleaner in functional pipelines
tags.each( tag => {
processTag( tag )
} )
Set Algebra: The Real Power
This is where BoxSet earns its place. Four algebraic operations, available as both member methods and overloaded operators.
Union -- all unique elements from both sets
backendSkills = setOf( "java", "sql", "boxlang", "redis" )
frontendSkills = setOf( "javascript", "css", "boxlang", "react" )
allSkills = backendSkills.union( frontendSkills )
// {java, sql, boxlang, redis, javascript, css, react}
// Operator syntax: +
allSkills = backendSkills + frontendSkills
Intersection -- only what both sets share
teamA = setOf( "alice", "bob", "carol", "dan" )
teamB = setOf( "bob", "carol", "eve" )
sharedMembers = teamA.intersection( teamB )
// {bob, carol}
// Operator syntax: *
sharedMembers = teamA * teamB
Difference -- what's in A but not in B
allUsers = setOf( "alice", "bob", "carol", "dan", "eve" )
activeUsers = setOf( "alice", "carol", "eve" )
inactiveUsers = allUsers.difference( activeUsers )
// {bob, dan}
// Operator syntax: -
inactiveUsers = allUsers - activeUsers
Symmetric Difference -- what's in either but not both
lastWeekUsers = setOf( "alice", "bob", "carol" )
thisWeekUsers = setOf( "bob", "carol", "dan", "eve" )
// Who joined or left?
changed = lastWeekUsers.symmetricDifference( thisWeekUsers )
// {alice, dan, eve}
// Operator syntax: ^
changed = lastWeekUsers ^ thisWeekUsers
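For readers coming from Java, the four operations map onto plain java.util.Set calls. This is a sketch of that mapping, not BoxLang's implementation: the SetAlgebraDemo class and its method names are invented here. Each helper copies its first operand so neither input is mutated, which matches the article's examples, where the originals stay usable after each operation.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SetAlgebraDemo {
    static <T> Set<T> union(Set<T> a, Set<T> b) {
        Set<T> out = new LinkedHashSet<>(a); // copy, so a is untouched
        out.addAll(b);
        return out;
    }

    static <T> Set<T> intersection(Set<T> a, Set<T> b) {
        Set<T> out = new LinkedHashSet<>(a);
        out.retainAll(b);
        return out;
    }

    static <T> Set<T> difference(Set<T> a, Set<T> b) {
        Set<T> out = new LinkedHashSet<>(a);
        out.removeAll(b);
        return out;
    }

    // Everything in either set, minus what both share.
    static <T> Set<T> symmetricDifference(Set<T> a, Set<T> b) {
        Set<T> out = union(a, b);
        out.removeAll(intersection(a, b));
        return out;
    }

    public static void main(String[] args) {
        Set<String> lastWeek = new LinkedHashSet<>(List.of("alice", "bob", "carol"));
        Set<String> thisWeek = new LinkedHashSet<>(List.of("bob", "carol", "dan", "eve"));
        System.out.println(union(lastWeek, thisWeek));               // [alice, bob, carol, dan, eve]
        System.out.println(intersection(lastWeek, thisWeek));        // [bob, carol]
        System.out.println(difference(lastWeek, thisWeek));          // [alice]
        System.out.println(symmetricDifference(lastWeek, thisWeek)); // [alice, dan, eve]
    }
}
```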
Operators accept "loose" right-hand operands
You don't need to convert everything to a Set first:
base = setOf( 1, 2, 3, 4, 5 )
result = base + [ 6, 7 ] // Set + Array
result = base * "3,4,5,6" // Set * comma-list string
result = base - (1..2) // Set - Range
// Compound assignment operators
base -= set{ 1, 2 }
// base is now {3, 4, 5}
base *= [ 3, 4 ]
// base is now {3, 4}
When neither operand is a Set, operators fall through to standard math: 2 + 3 = 5, 4 * 3 = 12, 2 ^ 3 = 8. Your existing arithmetic code is safe.
Functional Programming Pipeline
BoxSet ships with the same functional vocabulary you know from Arrays. Every operation returns a new Set (or a scalar), keeping the original untouched.
scores = setNew( type="sorted", values=[ 55, 72, 88, 91, 43, 88, 100 ] )
// Stored as: {43, 55, 72, 88, 91, 100}
// map -- transform every element, get a new Set back
bonusScores = scores.map( s -> s + 5 )
// {48, 60, 77, 93, 96, 105}
// filter -- keep matching elements
passing = scores.filter( s -> s >= 60 )
// {72, 88, 91, 100}
// reject -- the inverse of filter
failing = scores.reject( s -> s >= 60 )
// {43, 55}
// reduce -- collapse to a single value
total = scores.reduce( ( acc, s ) => acc + s, 0 )
// 449
average = total / scores.size()
// 74.83...
// every -- do all elements satisfy a predicate?
allPassing = scores.every( s -> s >= 60 )
// false
// some -- does at least one element satisfy a predicate?
hasA = scores.some( s -> s >= 90 )
// true
// none -- do zero elements satisfy a predicate?
noneNegative = scores.none( s -> s < 0 )
// true
// find -- first element matching a predicate
firstHigh = scores.find( s -> s >= 90 )
// 91 -- deterministic here because scores is a sorted Set; on a default (hash) Set the iteration order, and so the first match, is not guaranteed
Chaining works naturally:
result = setOf( 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 )
.filter( n -> n % 2 == 0 ) // evens: {2, 4, 6, 8, 10}
.map( n -> n * n ) // squares: {4, 16, 36, 64, 100}
.reduce( ( acc, n ) => acc + n, 0 )
// 220
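The same chain can be written with Java streams over a java.util.Set, which is a handy mental model for what a set-to-set pipeline does. This is a sketch in plain Java with invented names (SetPipelineDemo, evenSquares, parities), not a description of BoxLang internals. One thing it makes visible: mapping into a Set can shrink it, because results that collide are deduplicated. Since the article says map returns a new Set, that is worth keeping in mind there too.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SetPipelineDemo {
    // filter -> map, collected back into a sorted Set.
    static Set<Integer> evenSquares(Set<Integer> input) {
        return input.stream()
            .filter(n -> n % 2 == 0)
            .map(n -> n * n)
            .collect(Collectors.toCollection(TreeSet::new));
    }

    // Mapping can collapse elements: ten inputs, only two distinct results.
    static Set<Integer> parities(Set<Integer> input) {
        return input.stream()
            .map(n -> n % 2)
            .collect(Collectors.toCollection(TreeSet::new));
    }

    // reduce to a scalar
    static int sum(Set<Integer> input) {
        return input.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        Set<Integer> oneToTen = IntStream.rangeClosed(1, 10).boxed()
            .collect(Collectors.toCollection(TreeSet::new));
        Set<Integer> squares = evenSquares(oneToTen);
        System.out.println(squares);            // [4, 16, 36, 64, 100]
        System.out.println(sum(squares));       // 220
        System.out.println(parities(oneToTen)); // [0, 1]
    }
}
```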
Real-World Scenarios
1. Role-Based Access Control
class PermissionService {
function canAccess( required userRoles, required string resource ) {
resourcePermissions = getResourcePermissions( resource )
// What roles does this user have that grant access?
return !userRoles.intersection( resourcePermissions ).isEmpty()
}
function mergeRoles( required userRoles, required groupRoles ) {
// A user gets the union of their personal roles and group roles
return userRoles.union( groupRoles )
}
function getEffectiveDenials( required userRoles, required deniedRoles ) {
// Roles the user has that are explicitly denied on this resource
return userRoles.intersection( deniedRoles )
}
private function getResourcePermissions( required string resource ) {
return {
"/admin" : setOf( "admin", "superadmin" ),
"/reports" : setOf( "admin", "analyst", "manager" ),
"/api" : setOf( "developer", "admin" )
}[ resource ] ?: setNew()
}
}
svc = new PermissionService()
userRoles = setOf( "editor", "analyst", "viewer" )
groupRoles = setOf( "manager", "viewer" )
effective = svc.mergeRoles( userRoles, groupRoles )
// {editor, analyst, viewer, manager}
canSeeReports = svc.canAccess( effective, "/reports" )
// true (analyst or manager match)
canSeeAdmin = svc.canAccess( effective, "/admin" )
// false
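The access check only asks whether two sets overlap, so it never needs the intersection itself. At the Java level that is what java.util.Collections.disjoint does. Here is a sketch with the same resource table; the class AccessCheckDemo and the RESOURCE_ROLES map are illustrative names, not part of any BoxLang API.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

class AccessCheckDemo {
    // Same resource-to-roles table as the article's example.
    static final Map<String, Set<String>> RESOURCE_ROLES = Map.of(
        "/admin", Set.of("admin", "superadmin"),
        "/reports", Set.of("admin", "analyst", "manager"),
        "/api", Set.of("developer", "admin"));

    // True when the user holds at least one role the resource accepts.
    // Unknown resources map to an empty set, so access is denied.
    static boolean canAccess(Set<String> userRoles, String resource) {
        Set<String> allowed = RESOURCE_ROLES.getOrDefault(resource, Set.of());
        return !Collections.disjoint(userRoles, allowed);
    }

    public static void main(String[] args) {
        Set<String> effective = Set.of("editor", "analyst", "viewer", "manager");
        System.out.println(canAccess(effective, "/reports")); // true
        System.out.println(canAccess(effective, "/admin"));   // false
        System.out.println(canAccess(effective, "/missing")); // false
    }
}
```

On the BoxLang side, the isDisjointFrom member function listed in the reference table expresses the same question directly. Whether it avoids building an intermediate Set is not something the article states.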
2. Tag Deduplication and Taxonomy Intersection
// Content tagging system
post1Tags = setNew( type="linked", values=[ "boxlang", "jvm", "performance", "oss" ] )
post2Tags = setNew( type="linked", values=[ "jvm", "java", "boxlang", "interop" ] )
post3Tags = setNew( type="linked", values=[ "performance", "caching", "redis", "oss" ] )
// Tags that appear in multiple posts
commonTagsP1P2 = post1Tags * post2Tags
// {boxlang, jvm}
// All unique tags across the content library
allTags = post1Tags + post2Tags + post3Tags
// {boxlang, jvm, performance, oss, java, interop, caching, redis}
// Tags unique to post1 -- good for "exclusive" badge
exclusiveToPost1 = post1Tags - post2Tags - post3Tags
// {} -- empty: every post1 tag also appears in post2 or post3
// Find posts that share at least 2 tags with post1 (related posts)
relatedThreshold = 2
isRelated = ( post1Tags * post2Tags ).size() >= relatedThreshold
// true
3. Dataset Change Detection
A common ETL pattern: what changed between two snapshots of data?
function detectChanges( required previousIds, required currentIds ) {
prevSet = previousIds.toSet()
currSet = currentIds.toSet()
return {
"added" : currSet - prevSet, // new since last run
"removed" : prevSet - currSet, // gone since last run
"unchanged" : prevSet * currSet, // in both
"changed" : prevSet ^ currSet // added or removed (the union of the first two)
}
}
yesterdayUsers = queryGetColumn( getYesterdayQuery(), "user_id" )
todayUsers = queryGetColumn( getTodayQuery(), "user_id" )
diff = detectChanges( yesterdayUsers, todayUsers )
writeDump( "New signups today: #diff.added.size()#" )
writeDump( "Churned users: #diff.removed.size()#" )
// Send welcome emails only to genuinely new users
diff.added.each( userId => {
emailService.sendWelcome( userId )
} )
4. URL Deduplication Pipeline with Functional Chaining
class CrawlerPipeline {
property name="visited" type="Set"
property name="queued" type="Set"
property name="blacklist" type="Set"
function init() {
variables.visited = setNew( type="linked" )
variables.queued = setNew( type="linked" )
variables.blacklist = setOf( "login", "logout", "admin" )
return this
}
function enqueue( required array urls ) {
// Deduplicate, reject blacklisted paths, exclude already visited or queued
fresh = urls
.toSet( "linked" ) // deduplicate
.filter( url -> !isBlacklisted( url ) ) // drop blacklisted
.difference( variables.visited ) // drop already seen
.difference( variables.queued ) // drop already queued
variables.queued.addAll( fresh.toArray() )
return fresh.size()
}
function processNext() {
if ( variables.queued.isEmpty() ) return
// A linked Set iterates in insertion order, so the first array element is the oldest queued URL (FIFO)
target = variables.queued.toArray().first()
variables.queued.remove( target )
variables.visited.add( target )
return crawl( target )
}
function stats() {
return {
"visited" : variables.visited.size(),
"queued" : variables.queued.size(),
"overlap" : ( variables.visited * variables.queued ).size() // should always be 0
}
}
private function isBlacklisted( required string target ) {
return variables.blacklist.some( b -> target.findNoCase( b ) > 0 )
}
}
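One cost in processNext() is worth knowing about: converting the whole queue to an array just to read its first element is linear in the queue size. At the Java level, a LinkedHashSet can hand over and remove its oldest element through its iterator instead. The sketch below is plain java.util with an invented helper name (pollFirst). The article does not show whether BoxSet exposes an equivalent, so treat it as background for the backing class and not as BoxLang API.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;

class LinkedQueueDemo {
    // Remove and return the oldest element without copying the set.
    // Assumes elements are non-null (Optional.of rejects null).
    static <T> Optional<T> pollFirst(LinkedHashSet<T> queue) {
        Iterator<T> it = queue.iterator();
        if (!it.hasNext()) {
            return Optional.empty();
        }
        T head = it.next();
        it.remove(); // removes the element just returned by next()
        return Optional.of(head);
    }

    public static void main(String[] args) {
        LinkedHashSet<String> queued =
            new LinkedHashSet<>(List.of("/home", "/products", "/home", "/about"));
        System.out.println(pollFirst(queued)); // Optional[/home]
        System.out.println(pollFirst(queued)); // Optional[/products]
        System.out.println(queued);            // [/about]
    }
}
```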
Case Sensitivity and Numeric Normalization
By default, BoxSet is case-insensitive for strings, matching BoxLang's general dynamic semantics:
s = setNew( values=[ "Hello", "hello", "HELLO", "hElLo" ] )
s.size() // 1
s.contains( "HELLO" ) // true
s.contains( "hElLo" ) // true
Opt in to case sensitivity when you need exact-case uniqueness:
tokens = setNew( caseSensitive=true, values=[ "Bearer", "bearer", "BEARER" ] )
tokens.size() // 3
tokens.contains( "bearer" ) // true
tokens.contains( "Bearer" ) // true
tokens.contains( "beareR" ) // false
Numeric normalization is independent of case sensitivity. 1, 1L, 1.0, and 1.00 are always the same value in a Set:
nums = setNew( values=[ 1, 1.0, 1L, 1.00 ] )
nums.size() // 1
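The article does not describe how BoxSet implements either rule, so the following is only an analogy in plain Java: one way to get case-insensitive uniqueness and scale-insensitive numeric uniqueness out of java.util collections. The class and method names are invented. It also shows why the normalization matters: a plain HashSet of BigDecimal treats 1 and 1.0 as different elements.

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class NormalizedSetDemo {
    // Uniqueness decided by a case-insensitive comparator.
    static Set<String> caseInsensitive(List<String> values) {
        Set<String> out = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        out.addAll(values);
        return out;
    }

    // TreeSet uses BigDecimal.compareTo, which ignores scale,
    // so 1, 1.0 and 1.00 count as the same element.
    static Set<BigDecimal> numericSet(List<String> literals) {
        Set<BigDecimal> out = new TreeSet<>();
        for (String s : literals) {
            out.add(new BigDecimal(s));
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> s = caseInsensitive(List.of("Hello", "hello", "HELLO", "hElLo"));
        System.out.println(s.size());            // 1
        System.out.println(s.contains("HELLO")); // true

        System.out.println(numericSet(List.of("1", "1.0", "1.00")).size()); // 1

        // Contrast: BigDecimal.equals is scale-sensitive, so a HashSet keeps both.
        Set<BigDecimal> raw = new HashSet<>(List.of(new BigDecimal("1"), new BigDecimal("1.0")));
        System.out.println(raw.size()); // 2
    }
}
```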
Java Interop
Because BoxSet wraps java.util.Set, the integration story is clean in both directions.
Wrapping an existing Java Set
import java.util.HashSet
javaSet = createObject( "java", "java.util.HashSet" ).init()
javaSet.add( "a" )
javaSet.add( "b" )
// Wrap it -- no copy, mutations propagate
bxSet = javaSet castAs "Set"
bxSet.add( "c" )
javaSet.contains( "c" ) // true -- same backing object
bxSet.size() // 3
Struct key and value sets
config = {
"host" : "localhost",
"port" : 5432,
"ssl" : true,
"database" : "myapp"
}
// Keys as a Set
keys = config.keySet()
keys.contains( "port" ) // true
// Values as a Set (deduplicated)
values = config.valueSet()
// Useful for checking whether any key overlaps with a forbidden list
forbidden = setOf( "password", "secret", "token", "key" )
hasSensitiveKeys = !keys.isDisjointFrom( forbidden )
// false -- none of our keys are in the forbidden list
Any Java library method returning a java.util.Set -- Spring Security granted authorities, JPA fetch results, Guava ImmutableSet -- works directly with BoxLang Set BIFs and member functions.
Unmodifiable Sets
When you need an immutable contract -- configuration, constants, lookup tables:
ALLOWED_METHODS = setOf( "GET", "POST", "PUT", "DELETE", "PATCH" ).toUnmodifiable()
ALLOWED_METHODS.size() // 5
ALLOWED_METHODS.contains( "GET" ) // true
// Mutation attempts throw UnmodifiableException at runtime
ALLOWED_METHODS.add( "TRACE" ) // throws!
// Thaw to get a fresh mutable copy when you need to extend it
extended = ALLOWED_METHODS.toModifiable()
extended.add( "HEAD" )
extended.size() // 6 -- ALLOWED_METHODS still has 5
This pairs perfectly with constants declared in a BoxLang class:
class HttpConstants {
SAFE_METHODS = setOf( "GET", "HEAD", "OPTIONS" ).toUnmodifiable()
UNSAFE_METHODS = setOf( "POST", "PUT", "DELETE", "PATCH" ).toUnmodifiable()
ALL_METHODS = ( SAFE_METHODS + UNSAFE_METHODS ).toUnmodifiable()
function isSafe( required string method ) {
return SAFE_METHODS.contains( method.uCase() )
}
}
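One question the examples above leave open is whether toUnmodifiable() produces a read-only view of the original Set or an independent snapshot. The article does not say, so here is the distinction at the Java level, using only java.util (the class name UnmodifiableDemo and the helper rejectsMutation are invented). A view keeps reflecting later changes to the backing set; a snapshot does not. Both reject direct mutation.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class UnmodifiableDemo {
    // Returns true if the set refuses add(); only call this on sets
    // you do not mind changing when it turns out to be modifiable.
    static boolean rejectsMutation(Set<String> s) {
        try {
            s.add("TRACE");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Set<String> backing = new HashSet<>(Set.of("GET", "POST"));
        Set<String> view = Collections.unmodifiableSet(backing); // read-only view, no copy
        Set<String> snapshot = Set.copyOf(backing);              // independent immutable copy

        backing.add("PUT"); // change the original afterwards

        System.out.println(view.size());               // 3 -- the view sees the change
        System.out.println(snapshot.size());           // 2 -- the snapshot does not
        System.out.println(rejectsMutation(view));     // true
        System.out.println(rejectsMutation(snapshot)); // true
    }
}
```

If a constant must never change, keeping no other reference to the mutable original (as the setOf( ... ).toUnmodifiable() one-liners above do) sidesteps the question entirely.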
JSON Serialization
Sets serialize to JSON arrays, so any JSON-consuming API can read them. JSON has no set type, so deserializing gives you an array back; call toSet() on it if you need a Set again:
s = setNew( type="linked", values=[ "boxlang", "jvm", "oss" ] )
json = jsonSerialize( s )
// ["boxlang","jvm","oss"]
// Or via member function
json = s.toJSON()
// Nested in a struct
payload = {
"user" : "alice",
"roles" : setOf( "editor", "viewer" ),
"tags" : setNew( type="linked", values=[ "premium", "beta" ] )
}
jsonSerialize( payload )
// {"user":"alice","roles":["editor","viewer"],"tags":["premium","beta"]} -- "roles" is a default (hash) Set, so its element order may differ
Quick BIF Reference
| BIF | Member Function | Purpose |
|---|---|---|
| setNew( [type], [values], [caseSensitive] ) | -- | Create new Set |
| setOf( ...values ) | -- | Create from varargs |
| boxSetAdd( set, value ) | s.add( v ) | Add one element |
| boxSetAddAll( set, collection ) | s.addAll( col ) | Add many elements |
| boxSetRemove( set, value ) | s.remove( v ) | Remove one element |
| boxSetRemoveAll( set, collection ) | s.removeAll( col ) | Remove many elements |
| boxSetRetainAll( set, collection ) | s.retainAll( col ) | Keep only specified |
| boxSetClear( set ) | s.clear() | Remove all |
| boxSetContains( set, value ) | s.contains( v ) | Membership test |
| boxSetContainsAll( set, col ) | s.containsAll( col ) | All-membership test |
| boxSetIsEmpty( set ) | s.isEmpty() | Empty check |
| boxSetEquals( a, b ) | a.equals( b ) | Equality |
| boxSetIsSubsetOf( a, b ) | a.isSubsetOf( b ) | Subset test |
| boxSetIsSupersetOf( a, b ) | a.isSupersetOf( b ) | Superset test |
| boxSetIsDisjointFrom( a, b ) | a.isDisjointFrom( b ) | No-overlap test |
| boxSetUnion( a, b ) | a.union( b ) / a + b | All elements |
| boxSetIntersection( a, b ) | a.intersection( b ) / a * b | Common elements |
| boxSetDifference( a, b ) | a.difference( b ) / a - b | A minus B |
| boxSetSymmetricDifference( a, b ) | a.symmetricDifference( b ) / a ^ b | Either not both |
| boxSetEach( set, cb ) | s.each( cb ) | Iterate |
| boxSetMap( set, cb ) | s.map( cb ) | Transform |
| boxSetFilter( set, cb ) | s.filter( cb ) | Keep matching |
| boxSetReject( set, cb ) | s.reject( cb ) | Remove matching |
| boxSetReduce( set, cb, init ) | s.reduce( cb, init ) | Fold to value |
| boxSetEvery( set, cb ) | s.every( cb ) | All match? |
| boxSetSome( set, cb ) | s.some( cb ) | Any match? |
| boxSetNone( set, cb ) | s.none( cb ) | None match? |
| boxSetFind( set, cb ) | s.find( cb ) | First match |
| boxSetSize( set ) | s.size() / s.len() | Element count |
| boxSetToArray( set ) | s.toArray() | Convert to Array |
| boxSetToList( set, [delim] ) | s.toList( [delim] ) | Convert to list string |
Wrap Up
BoxSet is not a cosmetic feature. It's a fundamental collection type that was missing from the language and is now present everywhere you need it: in literal syntax, in operators, in the functional pipeline, in JSON, in Java interop, and in the type system itself.
The operator overloads alone (+, -, *, ^) turn multi-step collection algebra into single expressions. The three backing variants mean you always get the right performance characteristics for your use case. And the full functional pipeline keeps your code declarative and composable instead of imperative and fragile.
BoxSet ships in BoxLang 1.14.0. Update, try the literal syntax, and reach for a Set the next time you catch yourself deduplicating an array with a loop.
BoxLang is built by Ortus Solutions -- the team behind ColdBox, CommandBox, and 350+ open-source libraries. BoxLang is a modern dynamic JVM language designed to run everywhere: web, Lambda, CLI, desktop, Android, and beyond.