Last Tuesday I was helping a colleague port a churn-scoring notebook from pandas to Polars 1.18. The translation was almost mechanical: pd.read_parquet became pl.read_parquet, df[df.col > x] became df.filter(pl.col("col") > x), and so on. Easy. Then we hit a row of code that read df["tier"] = df.apply(lambda r: "gold" if r.spend > 500 else ("silver" if r.spend > 100 else "bronze"), axis=1). He translated it as df.with_columns(pl.struct(["spend"]).map_elements(...)) and the job got slower. On 50 million rows it went from 41 seconds in pandas to 58 seconds in Polars.
That is the single most common Polars trap I see in production. The whole point of Polars is the lazy, columnar, vectorised expression engine — and map_elements (formerly apply) bypasses every bit of it. The fix almost always involves when/then/otherwise, and when I rewrote that one line natively the same job finished in 0.71 seconds. That is roughly 80x faster on the same hardware, with no Rayon tuning or schema hints.
This article walks through eight real cases where I replaced map_elements with native expressions, the benchmarks I ran on each, and the gotchas that aren't obvious from the docs.
Why .apply() (now map_elements) is so slow in Polars
The Polars team renamed apply to map_elements in 0.19 specifically to discourage people from reaching for it. The official user guide on user-defined functions opens with a warning that you should treat map_elements as a last resort. The reason is mechanical: every element passes through the Python interpreter, the GIL serialises the whole pipeline, and the columnar memory layout gets shredded into PyObjects. You lose SIMD, you lose parallelism across columns, and the query optimiser cannot see into the lambda to push down predicates or fuse projections.
when/then/otherwise, on the other hand, compiles into a single Polars expression node. It runs in Rust, vectorised across the whole column, with zero Python in the hot loop. On my M2 Pro it processes roughly 700 million integer comparisons per second per core. That is the ceiling we're trying to recover.
Here is the original slow code so we have a baseline:
import polars as pl

df = pl.read_parquet("customers.parquet")  # 50M rows

# SLOW: 58 seconds
df = df.with_columns(
    pl.struct(["spend"])
    .map_elements(
        lambda r: "gold" if r["spend"] > 500 else ("silver" if r["spend"] > 100 else "bronze"),
        return_dtype=pl.Utf8,
    )
    .alias("tier")
)
And the native rewrite:
# FAST: 0.71 seconds
df = df.with_columns(
    pl.when(pl.col("spend") > 500).then(pl.lit("gold"))
    .when(pl.col("spend") > 100).then(pl.lit("silver"))
    .otherwise(pl.lit("bronze"))
    .alias("tier")
)
Same output, same dtype, 81x faster. The pattern generalises further than people realise.
Example 1: numeric bucketing
Classic case. Bucket ages into demographic groups. The pandas-style apply with a chain of if/elif blocks ports directly to chained when clauses. For each row the first matching branch wins, just like if/elif. That is a selection rule, not short-circuit evaluation: Polars computes every predicate and every branch expression over the whole column and then picks the result per row, so each branch must be valid on its own. On 10M rows the map_elements version took 24.1 seconds, the chained when took 47 milliseconds.
df.with_columns(
    pl.when(pl.col("age") < 18).then(pl.lit("minor"))
    .when(pl.col("age") < 30).then(pl.lit("young_adult"))
    .when(pl.col("age") < 50).then(pl.lit("adult"))
    .when(pl.col("age") < 65).then(pl.lit("middle"))
    .otherwise(pl.lit("senior"))
    .alias("age_group")
)
Example 2: multi-column conditional
The reason people reach for map_elements is usually that the condition touches more than one column. That is not actually a barrier — Polars expressions compose freely. To flag a row as high-risk when spend is above a threshold AND country is in a watchlist:
df.with_columns(
    pl.when(
        (pl.col("spend") > 1000) & (pl.col("country").is_in(["RU", "IR", "KP"]))
    )
    .then(pl.lit(True))
    .otherwise(pl.lit(False))
    .alias("high_risk")
)
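Use &, | and ~ rather than and/or/not. The bitwise operators are overloaded to build a boolean expression that runs in Rust, while the Python keywords try to coerce the expression to a single truth value, which Polars rejects with a TypeError. Keep the parentheses around each comparison, because & and | bind more tightly than > and ==.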
Example 3: replacing a value based on another column
Often you want to overwrite some values and leave the rest alone. The trick is to put the original column in otherwise instead of a literal:
df.with_columns(
    pl.when(pl.col("status") == "refunded")
    .then(pl.lit(0))
    .otherwise(pl.col("amount"))
    .alias("amount")
)
I see people writing map_elements for this constantly. Don't. It's a one-liner.
Example 4: case-insensitive string matching
Polars has cheap string casing on the expression layer (str.to_lowercase), so you don't need Python for this either:
df.with_columns(
    pl.when(pl.col("email").str.to_lowercase().str.ends_with("@gmail.com"))
    .then(pl.lit("consumer"))
    .otherwise(pl.lit("business"))
    .alias("segment")
)
Example 5: null handling without a lambda
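The fifth pattern that drives people to map_elements is null-aware logic. Polars treats nulls as a third value, so a naive pl.col("x") > 0 returns null for null inputs, and when treats a null condition as not matched, so those rows silently fall through to otherwise. Use is_null as an explicit first branch: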
df.with_columns(
    pl.when(pl.col("score").is_null()).then(pl.lit("unknown"))
    .when(pl.col("score") >= 0.8).then(pl.lit("strong"))
    .when(pl.col("score") >= 0.5).then(pl.lit("ok"))
    .otherwise(pl.lit("weak"))
    .alias("grade")
)
Example 6: arithmetic inside the branches
The then and otherwise clauses accept full expressions, not just literals. Need a different discount rate per tier? Compute it inline:
df.with_columns(
    pl.when(pl.col("tier") == "gold").then(pl.col("price") * 0.80)
    .when(pl.col("tier") == "silver").then(pl.col("price") * 0.90)
    .otherwise(pl.col("price"))
    .alias("discounted_price")
)
Example 7: date-based bucketing
Conditional logic over dates is where I see the biggest absolute speedups, because date parsing inside a Python lambda is brutal. The native version uses dt accessors and stays in Arrow types throughout. I measured a 112x speedup on a 20M-row login-events table:
df.with_columns(
    pl.when(pl.col("last_seen").dt.date() >= pl.date(2026, 1, 1))
    .then(pl.lit("active"))
    .when(pl.col("last_seen").dt.date() >= pl.date(2025, 1, 1))
    .then(pl.lit("dormant"))
    .otherwise(pl.lit("churned"))
    .alias("lifecycle")
)
Example 8: when you actually do need map_elements
There is one legitimate case: when your function calls into a Python library that has no Polars or Arrow equivalent — a tokeniser, a hashing function from hashlib, a regex engine that needs lookbehinds Polars' regex crate doesn't support. Even then, prefer map_batches over map_elements. map_batches receives a whole Series at a time, so you can vectorise inside NumPy and only pay the Python-call overhead once per chunk instead of once per row. The discussion on issue #10353 goes into the cost model in detail.
import hashlib

def sha1_batch(s: pl.Series) -> pl.Series:
    # Called once per batch; assumes user_id is a non-null string column.
    return pl.Series([hashlib.sha1(x.encode()).hexdigest() for x in s])

df.with_columns(
    pl.col("user_id").map_batches(sha1_batch, return_dtype=pl.Utf8).alias("uid_hash")
)
That still beats map_elements by 3-5x in my tests, because Polars hands the function one whole Series per batch instead of calling back into Python once per row.
Benchmark setup and the numbers I trust
All timings above were taken on an M2 Pro (10 cores, 32 GB), Polars 1.18.0, Python 3.13.1, macOS 26. I ran each query five times after a warm-up and took the median. The dataset was a synthetic customer table generated with pl.DataFrame from random NumPy arrays, materialised to Parquet so I could reload between runs and not measure cached results.
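The warm-up-then-median procedure can be sketched in a few lines of standard-library Python. This is not the harness behind the numbers in this article; median_runtime is a hypothetical helper, and the workload in the last line is a stand-in for a real query.

```python
import statistics
import time
from typing import Callable


def median_runtime(fn: Callable[[], object], runs: int = 5, warmups: int = 1) -> float:
    """Return the median wall-clock time of fn in seconds over `runs` calls."""
    # Warm-up calls are executed but not timed.
    for _ in range(warmups):
        fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Stand-in workload; replace the lambda with the query you want to time.
elapsed = median_runtime(lambda: sum(range(100_000)))
print(f"median: {elapsed:.6f}s")
```

To time a Polars query, pass a zero-argument callable that reloads the Parquet file and runs the expression, so that each run starts from the same state.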
The headline numbers (row counts differ per test and are noted inline):
- Three-tier string bucket: map_elements 58.0s on 50M, when/then 0.71s (81x)
- Age group (5 buckets): map_elements 24.1s on 10M, when/then 0.047s (513x)
- Date lifecycle: map_elements 38.4s on 20M, when/then 0.34s (112x)
- Multi-column risk flag: map_elements 33.7s on 30M, when/then 0.18s (187x)
The ratio depends on how heavy your lambda is, but I have never seen when/then/otherwise lose. If you want to see this on your own data, the Polars vs pandas migration checklist includes the timing harness I used.
Caveats I wish I'd known earlier
A few things bite people who are coming from pandas:
- Wrap literals in pl.lit(). Polars will infer numbers and booleans as literals, but a bare string in then or otherwise is parsed as a column name, so then("gold") looks for a column called gold. Mixing dtypes in branches (e.g. pl.lit("a") in one arm and a bare None in another) can also produce a confusing SchemaError. Be explicit.
- Branch order matters. The first when that matches wins. If you write when(spend > 100).then(pl.lit("silver")).when(spend > 500).then(pl.lit("gold")), nobody is ever gold. Always order from most specific to most general.
- Use otherwise, not a final when. If you omit otherwise, unmatched rows become null. That is sometimes what you want, but it bites you on group-by aggregations later.
- Don't mix lazy and eager. If you are inside a LazyFrame chain, keep when/then/otherwise in expressions. The optimiser fuses adjacent with_columns blocks, so you get even more speedup beyond what the per-expression benchmark suggests. See the query optimisation page for which fusions actually fire.
- For more than ~10 branches, consider replace or a join. A long when chain is still fast, but readability suffers. pl.col("country_code").replace({"US": "North America", ...}) or a small lookup frame joined on the key is often cleaner.
If you're refactoring a larger pipeline, the LazyFrame streaming guide covers how to combine these expressions with collect(streaming=True) for datasets that don't fit in memory.
FAQ
Is map_elements ever faster than when/then/otherwise?
In four years of using Polars I have not seen it happen. The only scenario where the gap closes is when your data is tiny (under ~1000 rows) and the Python call overhead amortises against the constant cost of expression planning. At that scale the absolute difference is microseconds and irrelevant.
What about np.where — can I just use that?
You can, but you'll lose laziness. np.where(df["a"] > 0, "pos", "neg") forces the column out of Arrow into NumPy and back, which costs a copy and breaks predicate pushdown. pl.when stays inside the expression graph, so it composes with filter, group_by, and the lazy optimiser. Stick with native.
Does this work in Polars LazyFrame too?
Yes, identically. when/then/otherwise is an expression, and every expression works in both eager and lazy mode. In lazy mode you get an extra bonus: the optimiser can sometimes prove that one branch is unreachable and prune it.
What changed between Polars 0.20 and 1.x for this API?
The signatures are stable. apply was renamed to map_elements in 0.19, map became map_batches at the same time, and 1.0 froze both. when/then/otherwise has been the same since 0.13. If you're on anything from 0.20 forward the code in this article runs unchanged.
How do I find the remaining map_elements calls in my codebase?
Ripgrep does the job: rg -n "map_elements|\.apply\(". Every hit is a candidate for a 50-500x speedup. Prioritise the ones inside hot loops or scheduled jobs — those are where the wall-clock wins land.
Once you internalise the when/then/otherwise pattern, you'll find that 90% of the row-wise logic you wrote in pandas has a native Polars expression. The remaining 10% — true UDFs that need Python libraries — belong in map_batches, never in map_elements. Make that one substitution and your Polars pipelines will finally run at the speed the marketing material promised.