HN.zip

Comparing Python Type Checkers: Typing Spec Conformance

70 points by ocamoss - 22 comments
extr [3 hidden]5 mins ago
Wow, quite surprising results. I have been working on a personal project with the astral stack (uv, ruff, ty) that's using extremely strict lint/type checking settings, you could call it an experiment in setting up a python codebase to work well with AI. I was not aware that ty's gaps were significant. I just tried with zuban + pyright. Both catch a half dozen issues that ty is ignoring. Zuban has one FP and one FN, pyright is 100% correct.

Looks like I will be converting to pyright. No disrespect to the astral team, I think they have been pretty careful to note that ty is still in early days. I'm sure I will return to it at some point - uv and ruff are excellent.

persedes [3 hidden]5 mins ago
Article is a nice write up of https://htmlpreview.github.io/?https://github.com/python/typ...

(glad they include ty now)

pgwalsh [3 hidden]5 mins ago
Using VSCodium I was having issues with python type checkers for quite a while. I did the basedpyright thing for a while but that was painful. It's a bit too based for me, and I'm not sure I'd call it based. Right now I have uv, ruff, and ty and I'm happy with it. It's super easy to update and super fast. I didn't realize the coverage wasn't as good as some others but I still like it. I may have to try pyrefly. Never heard of it until this post, so thank you.
martinky24 [3 hidden]5 mins ago
I've been using ty on some previously untyped codebases at work. It does a good job of being fast and easy to use while catching many issues without being overly draconian.

My teammates who were writing untyped Python previously don't seem to mind it. It's a good addition to the ecosystem!

tfrancisl [3 hidden]5 mins ago
And it makes it infinitely easier for them to get with the times and start typing their code!
rirze [3 hidden]5 mins ago
I am worried about the false negative/positive rate, however. Hope it improves.
notatallshaw [3 hidden]5 mins ago
My understanding is that Astral's focus for ty has been on making a good experience for common issues; they plan for very high compliance, but difficult or rare edge cases aren't prioritized.

Compliance suite numbers are biased towards edge cases and not the common path because that's where a lot of the tests need to be added.

My advice is to see how each type checker runs against your own codebase and whether the output/performance is something you are happy with.
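
One rough way to run that comparison, as a sketch: the script below times each checker that happens to be installed and records its exit code. The `CHECKERS` table and the `run_checkers` helper are my own illustration, and the command lines are assumptions to verify against each tool's documentation.

```python
import shutil
import subprocess
import time

# Assumed invocations; check each tool's docs before relying on them.
CHECKERS = {
    "mypy": ["mypy", "."],
    "pyright": ["pyright"],
    "ty": ["ty", "check"],
    "pyrefly": ["pyrefly", "check"],
}


def run_checkers(commands=CHECKERS, cwd="."):
    """Run each installed checker; return {name: (exit_code, seconds) or None}."""
    results = {}
    for name, argv in commands.items():
        if shutil.which(argv[0]) is None:
            results[name] = None  # not installed, so skip it
            continue
        start = time.perf_counter()
        proc = subprocess.run(argv, cwd=cwd, capture_output=True, text=True)
        results[name] = (proc.returncode, time.perf_counter() - start)
    return results
```

Calling `run_checkers()` from the repository root gives a quick side-by-side of exit codes and wall-clock times. It says nothing about which diagnostics are right, so the actual output of each tool still has to be read.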

dcreager [3 hidden]5 mins ago
> My understanding is that Astral's focus for ty has been on making a good experience for common issues; they plan for very high compliance, but difficult or rare edge cases aren't prioritized.

I would say that's true in terms of prioritization (there's a lot to do!), but not in terms of the final user experience that we are aiming for. We're not planning on punting on anything in the conformance suite, for instance.

ddxv [3 hidden]5 mins ago
I've used mypy forever and never even tried these others. Looking at them though it looks like it's worth trying out Zuban or Pyright? Is there a noticeable benefit when switching between different checkers?
rirze [3 hidden]5 mins ago
If you care about correctness, unless you pick pyright, don't bother at the moment. If you're creating a new project and looking for the promise of better, faster type checking, then pick one of Zuban, Pyrefly, or ty.
x187463 [3 hidden]5 mins ago
Speed, especially in larger codebases.
Scene_Cast2 [3 hidden]5 mins ago
Are there any good static (i.e. not runtime) type checkers for arrays and tensors? E.g. "16x64x256 fp16" in numpy, pytorch, jax, cupy, or whatever framework. Would be pretty useful for ML work.
ocamoss [3 hidden]5 mins ago
We're working on statically checking Jaxtyping annotations in Pyrefly, but it's incomplete and not ready to use yet :)
dcreager [3 hidden]5 mins ago
There have been some early proposals to add something like that, but none of them have made it very far yet. As you might imagine, it's a hard problem!
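
To give a feel for what "shapes in the type system" could look like with today's tools, here is a toy sketch that encodes dimensions as `Literal` type arguments. The `Matrix` class and `matmul` function are hypothetical, this is not how numpy, PyTorch, JAX, or jaxtyping work, and nothing checks that the data actually matches the declared shape.

```python
from typing import Generic, Literal, TypeVar

M = TypeVar("M", bound=int)
K = TypeVar("K", bound=int)
N = TypeVar("N", bound=int)


class Matrix(Generic[M, N]):
    """Hypothetical toy matrix whose type parameters stand for rows and columns."""

    def __init__(self, rows: list[list[float]]) -> None:
        self.rows = rows


def matmul(a: Matrix[M, K], b: Matrix[K, N]) -> Matrix[M, N]:
    # The shared type variable K ties a's column count to b's row count.
    product = [
        [sum(x * y for x, y in zip(row, col)) for col in zip(*b.rows)]
        for row in a.rows
    ]
    return Matrix(product)


a = Matrix[Literal[2], Literal[3]]([[1, 2, 3], [4, 5, 6]])
b = Matrix[Literal[3], Literal[1]]([[1], [1], [1]])
c = matmul(a, b)  # intended static type: Matrix[Literal[2], Literal[1]]
print(c.rows)
```

Whether a given checker rejects a call where the two `K` arguments disagree is something to test rather than assume. Arithmetic on dimensions (reshape, concatenation, broadcasting) is not expressible this way at all, which is part of why it's a hard problem.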
westurner [3 hidden]5 mins ago
- /?hnlog pycontract icontract https://westurner.github.io/hnlog/ :

From https://news.ycombinator.com/item?id=14246095 (2017) :

> PyContracts supports runtime type-checking and value constraints/assertions (as @contract decorators, annotations, and docstrings).

> Unfortunately, there's yet no unifying syntax between PyContracts and the newer python type annotations which MyPy checks at compile-time.

Or beartype.

Pycontracts has: https://andreacensi.github.io/contracts/ :

  from contracts import contract

  @contract
  def my_function(a : 'int,>0', b : 'list[N],N>0') -> 'list[N]':
      ...

  @contract(image='array[HxWx3](uint8),H>10,W>10')
  def recolor(image):
      ...
For icontract, there's icontract-hypothesis.

parquery/icontract: https://github.com/Parquery/icontract :

> There exist a couple of contract libraries. However, at the time of this writing (September 2018), they all required the programmer either to learn a new syntax (PyContracts) or to write redundant condition descriptions ( e.g., contracts, covenant, deal, dpcontracts, pyadbc and pcd).

  import icontract
  @icontract.require(lambda x: x > 3, "x must not be small")
  def some_func(x: int, y: int = 5) -> None:
      ...
icontract with numpy array types:

  import icontract
  import numpy as np

  @icontract.require(lambda arr: isinstance(arr, np.ndarray))
  @icontract.require(lambda arr: arr.shape == (3, 3))
  @icontract.require(lambda arr: np.all(arr >= 0), "All elements must be non-negative")
  def process_matrix(arr: np.ndarray):
      return np.sum(arr)

  invalid_matrix = np.array([[1, -2, 3], [4, 5, 6], [7, 8, 9]])
  process_matrix(invalid_matrix)
  # Raises icontract.ViolationError because of the -2
Pay08 [3 hidden]5 mins ago
I still can't get over the utter idiocy in Python's type hints being decorative. In what world does x: int = "thing" not give someone in the standardisation process pause?
dcreager [3 hidden]5 mins ago
Can you elaborate what you mean by decorative?

If you run a type checker like ty or pyright, they're not decorative: you'll get clear diagnostics for that particular example [1], and for any other type errors you might have. You can set up CI so that, e.g., a type error blocks PRs from being merged, just like any other test failure.

If you mean types not being checked at runtime, the consensus is that most users don't want to pay the cost of the checks every time the program is run. It's more cost-effective to do those checks at development/test/CI time using a type checker, as described above. But if you _do_ want that, you can opt in to that using something like beartype [2].

[1] https://play.ty.dev/905db656-e271-4a3a-b27d-18a4dd45f5da

[2] https://github.com/beartype/beartype/
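
To make the runtime side concrete, here is a stdlib-only sketch. The first two lines show that the interpreter stores the annotation without enforcing it. The `enforce_types` decorator is a hypothetical toy of my own, not beartype's API, and it only understands plain classes such as `int` or `str`.

```python
import functools
import inspect
from typing import get_type_hints

x: int = "thing"  # a static checker flags this; the interpreter does not
print(type(x).__name__)  # the annotation changed nothing at runtime


def enforce_types(func):
    """Hypothetical decorator: isinstance-check plain class annotations per call."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # This sketch only handles simple classes, not generics or unions.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce_types
def double(n: int) -> int:
    return n * 2
```

With this in place `double(21)` returns 42 and `double("21")` raises `TypeError`, at the price of an extra check on every call. That per-call cost is the trade-off described above.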

badlibrarian [3 hidden]5 mins ago
It's a community that delayed progress for a decade while they waited for everyone to put parentheses on the print statement. Give 'em enough time and they'll figure out best practices.
Spivak [3 hidden]5 mins ago
In C-ish languages the statement

    int x = "thing"
is perfectly valid. It means reserve a spot for a 32-bit int and then shove the pointer to the string "thing" at the address of x. It will do the wrong thing and also overflow memory, but you could generate code for it. The type checker is what stops you. It's the same in Python: if you make type checking a build breaker, then the annotations mean something. Types aren't checked at runtime, but C doesn't check them either.
lefra [3 hidden]5 mins ago
In C, int may be as small as 16 bits. You may get 32 bits (or more), but it's not guaranteed. I don't see how you get a memory overflow, though?

I'd be surprised if a compiler with -Wall -Werror agreed to compile this.

Trying to cast back the int to a char* might work if the pointers are the same size as int on the target platform, but it's actually Undefined Behaviour IIRC.

Daishiman [3 hidden]5 mins ago
It's the complete opposite. The objective of type hints is that they're optional, precisely because type hints narrow the functionality of the language. And as evidenced by the fact that different type checkers have different heuristics for determining what is a valid typed program and what isn't, it seems that the decision is correct.

No type system will allow for the dynamism that Python supports. It's not a question of how you annotate types, it's about how you resolve types.
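
A small illustration of the kind of dynamism I mean, as a sketch with a hypothetical `Settings` class: the attributes only exist once the loop has run, so the program is fine at runtime while a static checker will typically report them as unknown.

```python
class Settings:
    """Hypothetical class with no statically declared attributes."""


# Attributes attached at runtime, e.g. from a config file or a plugin.
for key, value in {"timeout": 30, "retries": 3}.items():
    setattr(Settings, key, value)

settings = Settings()
# Valid at runtime; statically, Settings declares neither attribute.
print(settings.timeout + settings.retries)
```

Annotating this means either declaring the attributes up front, which gives up the dynamism, or silencing the checker, which gives up the checking.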

hrmtst93837 [3 hidden]5 mins ago
Optional on paper, sure. Once you publish shared libs or keep a nontrivial repo usable across teams, type hints stop feeling optional fast, because the minute mypy, pyright, and Pyre disagree on metaprogramming or runtime patching you get three incompatible stories about the same program and a pile of contradictions instead of signal. Python can stay dynamic, yet this setup mostly buys busywork for CI and false confidence for humans.