PostHole
Compose Login
You are browsing eu.zone1 in read-only mode. Log in to participate.
rss-bridge 2026-02-25T13:45:02+00:00

Python Type Checker Comparison: Empty Container Inference

Comments


Python Type Checker Comparison: Empty Container Inference

February 25, 2026 · 9 min read

Danny Yang

Jia Chen

Empty containers like [] and {} are everywhere in Python. It's super common to see functions start by creating an empty container, filling it up, and then returning the result.

Take this, for example:

def my_func(ys: dict[str, int]):
x = {}
for k, v in ys.items():
if some_condition(k):
x.setdefault("group0", []).append((k, v))
else:
x.setdefault("group1", []).append((k, v))
return x

This seemingly innocent coding pattern poses an interesting challenge for Python type checkers. Normally, when a type checker sees x = y without a type hint, it can just look at y to figure out x's type. The problem is, when y is an empty container (like x = {} above), the checker knows it's a list or a dict, but has no clue what's going inside.

The big question is: How is the type checker supposed to analyze the rest of the function without knowing x's type?

Different type checkers implement distinct strategies to answer this question. This post will examine these different approaches, weighing their pros and cons, and which type checkers implement each approach. The information presented should be useful as you evaluate and select a type checker.

Strategy 1: Infer Any type for container elements​

The simplest approach is just to use Any type for the items in the container. E.g. if the developer writes x = [] then the type checker would infer the type of x to be list[Any]. This is what Pyre, Ty, and Pyright1 mostly behave like at the time of writing.

Since the analysis does not require looking at any surrounding context at all, this approach is probably the easiest to understand and at the same time the most efficient for a type checker to implement.

Among the inference strategies we discuss today, inferring list[Any] produces the least amount of type errors: developers can insert anything into the list, and items read from the list will also be Any.

On the other hand, by inferring Any we are effectively giving up type safety. The type checker won't have any false-positive errors, but it also won't catch any bugs. In our experience using Pyre in Instagram, this lets expensive runtime crashes slip into production.

To illustrate the pitfalls, let’s look at code snippet simplified from a real-world example:

from dataclasses import dataclass

@dataclass
class MenuItem:
title: str | None
details: list[str]

def first_three_lines(menu_item: MenuItem) -> list[str]:
lines = []  # `lines` is inferred as `list[Any]`
if menu_item.title is not None:
lines.append(menu_item.title)
# bug: `list.append()` should be `list.extend()`
lines.append(menu_item.details)
return lines[:2]

This example has a bug where we accidentally call append instead of extend, creating a nested list instead of extending the list of strings. Using this strategy, the type checker infers the type of lines as list[Any] and does not warn the developer.

To improve type safety in these situations, type checkers that infer Any for empty containers can choose to generate extra type errors that warn the user about the insertion of an Any type. While this can reduce false negatives, it burdens developers by forcing them to explicitly annotate every empty container in order to silence the warnings.

Strategy 2: Infer the container type from all usages​

To infer a more precise type for an empty container, a type checker can look ahead and see how it is used.

def my_func(some_condition):
x = []
if some_condition:
x.append(1)
return x # `x` is inferred as `list[int]`

In this example, we see that after the initial definition of x, there’s a subsequent usage of x that attempts to append 1 to it. Based on that usage, it’s highly likely that the intended type for x would be list[int], and the type checker could infer that as the type for x.

But what happens if x has multiple uses later and each of them assumes different types? For example:

def my_func2(some_condition):
x = []
if some_condition:
x.append(1)
else:
x.append("foo")
return x # `x` is inferred as `list[int | str]`

One solution is to look at all usages of x, assume all of them are valid, and infer the element type of the container to be a union of them all. In our example above, the type checker would infer list[int | str]. This is the approach taken by the Pytype type checker.

This approach is very permissive for code that tries to insert into the containers: no type errors will be reported at insertion time at all, similar to the infer-Any strategies mentioned before. However, it provides better type safety over the infer-Any strategy when elements are read out of the container: any operations applied to the element must be valid for all inferred types in the union.

For instance:

def my_func3(some_condition):
x = []
if some_condition:
x.append(1)
else:
x.append("foo")
return x[0] + 1  # ERROR! `+` is not applicable to `int | str` and `int`

If we call my_func3(False), we will get a runtime crash that happens exactly where the type error is reported, so this inference strategy is also the one that models the Python runtime most closely.

However, the location of a runtime crash is not necessarily the same as the location of the bug that caused the crash.

While it’s helpful that the type checker gives an error when we try to use a value from a list that was modified incorrectly, what’s arguably more helpful is to know where we put the wrong thing into the list in the first place.

from dataclasses import dataclass

@dataclass
class MenuItem:
title: str | None
details: list[str]

def first_three_lines(menu_item: MenuItem) -> list[str]:
lines = []
if menu_item.title is not None:
lines.append(menu_item.title)
# bug: `list.append()` should be `list.extend()`
lines.append(menu_item.details)
return lines[:2] # error: `list[str | list[str]]` is not assignable to `list[str]`

In the example above, the error would be raised on the return statement (saying that list[str | list[str]] can’t be assigned to list[str]), even though the issue we actually need to fix is on the previous line. While this example is only off by a single line, in the real world the location of the error and the location of the bug could be separated by hundreds of lines of code.

Implementing this empty container inference strategy in a type checker faces several practical engineering hurdles.

  • Firstly, finding all usages of a variable (especially those in non-local scopes like global variables or class attributes) can be computationally expensive, so a type checker may need to compromise on exhaustiveness to maintain good performance.
  • Secondly, if a container has many distinct usages, the resulting inferred element type can be a lengthy and complex union, and this complexity can negatively impact the readability of subsequent error messages, necessitating the use of additional heuristics to improve error message quality.

Strategy 3: Infer the container type from only the first usage​

This is similar to strategy 2, but we infer the type based only on the first time the list is used, where “first” is defined in terms of syntactical appearance.

def my_func2(some_condition):
x = []
if some_condition:
x.append(1) # `x` is inferred as `list[int]`
else:
x.append("foo") # error: cannot append `str` to `list[int]`
return x

For example, in my_func2() above, the type checker infers the type for x based on its first usage x.append(1), which results in list[int]. When we do x.append("foo") on the following lines, the type checker emits an error saying we cannot insert str into list[int].

[...]


Original source

Reply