HN.zip

The first year of free-threaded Python

236 points by rbanffy - 220 comments
sgarland [3 hidden]5 mins ago
> Instead, many reach for multiprocessing, but spawning processes is expensive

Agreed.

> and communicating across processes often requires making expensive copies of data

SharedMemory [0] exists. Never understood why this isn’t used more frequently. There’s even a ShareableList which does exactly what it sounds like, and is awesome.

[0]: https://docs.python.org/3/library/multiprocessing.shared_mem...
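
E.g., a minimal sketch of the pattern (create in one process, attach by name in another; nothing here is copied between processes):

    from multiprocessing import Process
    from multiprocessing.shared_memory import ShareableList

    def worker(name):
        sl = ShareableList(name=name)  # attach to the existing block, no copy
        sl[0] = sl[0] + 1              # mutate shared state in place
        sl.shm.close()

    if __name__ == '__main__':
        sl = ShareableList([0, 'hello', 3.14])
        p = Process(target=worker, args=(sl.shm.name,))
        p.start(); p.join()
        print(sl[0])                   # 1: the child's write is visible here
        sl.shm.close()
        sl.shm.unlink()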

chubot [3 hidden]5 mins ago
Spawning processes generally takes much less than 1 ms on Unix

Spawning a PYTHON interpreter process might take 30 ms to 300 ms before you get to main(), depending on the number of imports

It's 1 to 2 orders of magnitude difference, so it's worth being precise

This is the fallacy with, say, CGI. A CGI program in C, Rust, or Go works perfectly well.

e.g. sqlite.org runs with a process PER REQUEST - https://news.ycombinator.com/item?id=3036124

kragen [3 hidden]5 mins ago
To be concrete about this, http://canonical.org/~kragen/sw/dev3/forkovh.c took 670μs to fork, exit, and wait on the first laptop I tried it on when built with glibc, but only 130μs built with dietlibc instead; with glibc on a 2.3 GHz E5-2697 Xeon, it took 130μs as well.

httpdito http://canonical.org/~kragen/sw/dev3/server.s (which launches a process per request) seems to take only about 50μs because it's not linked with any C library and therefore only maps 5 pages. Also, that doesn't include the time for exit() because it runs multiple concurrent child processes.

On this laptop, a Ryzen 5 3500U running at 2.9GHz, forkovh takes about 330μs built with glibc and about 130–140μs built with dietlibc, and `time python3 -c True` takes about 30000–50000μs. I wrote a Python version of forkovh http://canonical.org/~kragen/sw/dev3/forkovh.py and it takes about 1200μs to fork(), _exit(), and wait().

If anyone else wants to clone that repo and test their own machines, I'm interested to hear the results, especially if they aren't in Linux. `make forkovh` will compile the C version.

1200μs is pretty expensive in some contexts but not others. Certainly it's cheaper than spawning a new Python interpreter by more than an order of magnitude.
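
If you just want a quick inline approximation of that measurement without cloning the repo, something like this (a sketch, Unix only, not the actual forkovh.py):

    import os, time

    N = 1000
    start = time.perf_counter()
    for _ in range(N):
        pid = os.fork()
        if pid == 0:
            os._exit(0)     # child: exit immediately, skipping interpreter cleanup
        os.waitpid(pid, 0)  # parent: reap the child
    print(f"{(time.perf_counter() - start) / N * 1e6:.0f}μs per fork/_exit/wait")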

ori_b [3 hidden]5 mins ago
As another example: I run https://shithub.us with shell scripts, serving a terabyte or so of data monthly (mostly due to AI crawlers that I can't be arsed to block).

I'm launching between 15 and 3000 processes per request. While Plan 9 is about 10x faster at spawning processes than Linux, it's telling that launching 3000 C processes from a shell is about as fast as starting one Python interpreter.

kstrauser [3 hidden]5 mins ago
The interpreter itself is pretty quick:

  ᐅ time echo "print('hi'); exit()" | python
  hi
  
  ________________________________________________________
  Executed in   21.48 millis    fish           external
     usr time   16.35 millis  146.00 micros   16.20 millis
     sys time    4.49 millis  593.00 micros    3.89 millis
charleshn [3 hidden]5 mins ago
> Spawning processes generally takes much less than 1 ms on Unix

It depends on whether one uses clone, fork, posix_spawn etc.

Fork can take a while depending on the size of the address space, number of VMAs etc.

crackez [3 hidden]5 mins ago
Fork on Linux should use copy-on-write VM pages now, so if you fork inside Python it should be cheap. If you launch a new Python process from, say, the shell, and it's already in the buffer cache, then you should only have to pay the startup CPU cost of the interpreter, since the IO should be satisfied from the buffer cache...
charleshn [3 hidden]5 mins ago
> Fork on Linux should use copy-on-write vmpages now, so if you fork inside python it should be cheap.

No, that's exactly the point I'm making: copying PTEs is not cheap on a large address space with many VMAs.

You can run a simple Python script allocating a large list and see how it affects fork time.
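
E.g., a quick sketch of that experiment (Unix only):

    import os, time

    def time_fork():
        start = time.perf_counter()
        pid = os.fork()
        if pid == 0:
            os._exit(0)
        os.waitpid(pid, 0)
        return (time.perf_counter() - start) * 1e6

    print(f"small heap: {time_fork():.0f}μs")
    big = list(range(10_000_000))  # a few hundred MB: more pages mapped, more PTEs to copy
    print(f"large heap: {time_fork():.0f}μs")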

knome [3 hidden]5 mins ago
For glibc and Linux, fork just calls clone, as does posix_spawn (using the flag CLONE_VFORK).
morningsam [3 hidden]5 mins ago
>Spawning a PYTHON interpreter process might take 30 ms to 300 ms

Which is why, at least on Linux, Python's multiprocessing doesn't do that but fork()s the interpreter, which takes low-single-digit ms as well.

zahlman [3 hidden]5 mins ago
Even when the 'spawn' strategy is used (default on Windows, and can be chosen explicitly on Linux), the overhead can largely be avoided. (Why choose it on Linux? Apparently forking can cause problems if you also use threads.) Python imports can be deferred (`import` is a statement, not a compiler or pre-processor directive), and child processes (regardless of the creation strategy) name the main module as `__mp_main__` rather than `__main__`, allowing the programmer to distinguish. (Being able to distinguish is of course necessary here, to avoid making a fork bomb - since the top-level code runs automatically and `if __name__ == '__main__':` is normally top-level code.)
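
For concreteness, a minimal sketch of the spawn-safe pattern described above:

    import multiprocessing as mp

    def work(x):
        return x * x

    if __name__ == '__main__':            # children see __name__ == '__mp_main__'
        ctx = mp.get_context('spawn')     # choose the spawn strategy explicitly
        with ctx.Pool(4) as pool:
            print(pool.map(work, range(8)))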

But also keep in mind that cleanup for a Python process also takes time, which is harder to trace.

Refs:

https://docs.python.org/3/library/multiprocessing.html#conte...

https://stackoverflow.com/questions/72497140

kstrauser [3 hidden]5 mins ago
I really wish Python had a way to annotate things you don't care about cleaning up. I don't know what the API would look like, but I imagine something like:

  l = list(cleanup=False)
  for i in range(1_000_000_000): l.append(i)
telling the runtime that we don't need to individually GC each of those tiny objects and just let the OS's process model free the whole thing at once.

Sure, close TCP connections before you kill the whole thing. I couldn't care less about most objects, though.
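
There's no such API today; the closest blunt approximation I know of is to clean up what matters by hand and then skip interpreter teardown entirely (a sketch):

    import os, sys

    l = list(range(100_000_000))  # millions of tiny objects we never want to GC one by one

    # ... do the real work, close sockets/files explicitly ...

    sys.stdout.flush()  # os._exit skips buffered-output flushing and atexit handlers
    os._exit(0)         # OS reclaims the whole address space at once, no per-object refcounting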

Sharlin [3 hidden]5 mins ago
Unix is not the only platform, though (and is process creation fast on all Unices, or just Linux?). The point about interpreter init overhead is, of course, apt.
btilly [3 hidden]5 mins ago
Process creation should be fast on all Unices. If it isn't, then the lowly shell script (heavily used in Unix) is going to perform very poorly.
kragen [3 hidden]5 mins ago
While I think you've been using Unix longer than I have, shell scripts are known for performing very poorly, and on PDP-11 Unix (where perhaps shell scripts were most heavily used, since Perl didn't exist yet) fork() couldn't even do copy-on-write; it had to literally copy the process's entire data segment, which in most cases also contained a copy of its code. Moving to paged machines like the VAX and especially the 68000 family made it possible to use copy-on-write, but historically speaking, Linux has often been an order of magnitude faster than most other Unices at fork(). However, I think people mostly don't use those Unices anymore. I imagine the BSDs have pretty much caught up by now.

https://news.ycombinator.com/item?id=44009754 gives some concrete details on fork() speed on current Linux: 50μs for a small process, 700μs for a regular process, 1300μs for a venti Python interpreter process, 30000–50000μs for Python interpreter creation. This is on a CPU of about 10 billion instructions per second per core, so forking costs on the order of ½–10 million instructions.

fredoralive [3 hidden]5 mins ago
Python runs on other operating systems, like NT, where AIUI processes are rather more heavyweight.

Not all use cases of Python and Windows intersect (how much web server stuff is a Windows / IIS / SQL Server / Python stack? Probably not many, although WISP is a nice acronym), but you’ve still got to bear it in mind for people doing heavy numpy stuff on their work laptop or whatever.

LPisGood [3 hidden]5 mins ago
My understanding is that spawning a thread takes just a few microseconds, so whether you're talking about a process or a Python interpreter process, there are still orders of magnitude to be gained.
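
A rough sketch to see those orders of magnitude on your own machine (numbers vary a lot by platform and start method):

    import multiprocessing, threading, time

    def noop():
        pass

    if __name__ == '__main__':
        start = time.perf_counter()
        for _ in range(100):
            t = threading.Thread(target=noop)
            t.start(); t.join()
        print(f"thread:  {(time.perf_counter() - start) / 100 * 1e6:.0f}μs each")

        start = time.perf_counter()
        for _ in range(10):
            p = multiprocessing.Process(target=noop)
            p.start(); p.join()
        print(f"process: {(time.perf_counter() - start) / 10 * 1e6:.0f}μs each")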
jaoane [3 hidden]5 mins ago
>Spawning a PYTHON interpreter process might take 30 ms to 300 ms before you get to main(), depending on the number of imports

That's lucky. On constrained systems launching a new interpreter can very well take 10 seconds. Python is ssssslllloooowwwww.

ogrisel [3 hidden]5 mins ago
You cannot share arbitrarily structured objects in the `ShareableList`, only atomic scalars and bytes / strings.

If you want to share structured Python objects between instances, you have to pay the cost of `pickle.dumps`/`pickle.loads` (CPU overhead for interprocess communication) + the memory cost of replicated objects in the processes.

sgarland [3 hidden]5 mins ago
So don’t do that? Send data to workers as primitives, and have a separate process that reads the results and serializes it into whatever form you want.
notpushkin [3 hidden]5 mins ago
We need a dataclass-like interface on top of a ShareableList.
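
Something like this, maybe (a hypothetical sketch, not an existing API; `SharedPoint` just maps fixed field names onto ShareableList slots):

    from multiprocessing.shared_memory import ShareableList

    class SharedPoint:
        _fields = ('x', 'y')

        def __init__(self, sl):
            object.__setattr__(self, '_sl', sl)

        def __getattr__(self, name):
            return self._sl[self._fields.index(name)]

        def __setattr__(self, name, value):
            self._sl[self._fields.index(name)] = value

    if __name__ == '__main__':
        sl = ShareableList([0.0, 0.0])  # one slot per field
        p = SharedPoint(sl)
        p.x = 3.5
        print(p.x, p.y)                 # 3.5 0.0
        sl.shm.close(); sl.shm.unlink()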
tomrod [3 hidden]5 mins ago
I can fit a lot of json into bytes/strings though?
frollogaston [3 hidden]5 mins ago
If all your state is already json-serializable, yeah. But that's just as expensive as copying if not more, hence what cjbgkagh said about flatbuffers.
frollogaston [3 hidden]5 mins ago
oh nvm, that doesn't solve this either
cjbgkagh [3 hidden]5 mins ago
Perhaps flatbuffers would be better?
tomrod [3 hidden]5 mins ago
I love learning from folks on HN -- thanks! Will check it out.
notpushkin [3 hidden]5 mins ago
Take a look at https://capnproto.org/ as well, while at it.

Neither solve the copying problem, though.

frollogaston [3 hidden]5 mins ago
Ah, I forgot capnproto doesn't let you edit a serialized proto in-memory, it's read-only. In theory this should be possible as long as you're not changing the length of anything, but I'm not surprised such trickery is unsupported.

So this doesn't seem like a versatile solution for sharing data structs between two Python processes. You're gonna have to reserialize the whole thing if one side wants to edit, which is basically copying.

tinix [3 hidden]5 mins ago
let me introduce you to quickle.
reliabilityguy [3 hidden]5 mins ago
What’s the point? The whole idea is to share an object, not to serialize it, whether it’s JSON, pickle, or whatever.
tomrod [3 hidden]5 mins ago
I mean, the answer to this is pretty straightforward -- because we can, not because we should :)
vlovich123 [3 hidden]5 mins ago
That’s even worse than pickle.
tomrod [3 hidden]5 mins ago
pickle pickles to pickle binary, yeah? So can stream that too with an io Buffer :D
modeless [3 hidden]5 mins ago
Yeah I've had great success sharing numpy arrays this way. Explicit sharing is not a huge burden, especially when compared with the difficulty of debugging problems that occur when you accidentally share things between threads. People vastly overstate the benefit of threads over multiprocessing and I don't look forward to all the random segfaults I'm going to have to debug after people start routinely disabling the GIL in a library ecosystem that isn't ready.

I wonder why people never complained so much about JavaScript not having shared-everything threading. Maybe because JavaScript is so much faster that you don't have to reach for it as much. I wish more effort was put into baseline performance for Python.
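
For reference, the numpy-over-SharedMemory pattern looks roughly like this (close to the example in the multiprocessing.shared_memory docs):

    import numpy as np
    from multiprocessing import shared_memory

    a = np.arange(1_000_000, dtype=np.float64)
    shm = shared_memory.SharedMemory(create=True, size=a.nbytes)
    b = np.ndarray(a.shape, dtype=a.dtype, buffer=shm.buf)
    b[:] = a  # one copy into shared memory; workers then attach by name

    # in a worker process: attach without copying
    existing = shared_memory.SharedMemory(name=shm.name)
    c = np.ndarray(a.shape, dtype=a.dtype, buffer=existing.buf)
    print(c[:5])

    existing.close()
    shm.close(); shm.unlink()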

zahlman [3 hidden]5 mins ago
> I wish more effort was put into baseline performance for Python.

There has been. That's why the bytecode is incompatible between minor versions. It was a major selling(?) point for 3.11 and 3.12 in particular.

But the "Faster CPython" team at Microsoft was apparently just laid off (https://www.linkedin.com/posts/mdboom_its-been-a-tough-coupl...), and all of the optimization work has to my understanding been based around fairly traditional techniques. The C part of the codebase has decades of legacy to it, after all.

Alternative implementations like PyPy often post impressive results, and are worth checking out if you need to worry about native Python performance. Not to mention the benefits of shifting the work onto compiled code like NumPy, as you already do.

frollogaston [3 hidden]5 mins ago
"I wonder why people never complained so much about JavaScript not having shared-everything threading"

Mainly cause Python is often used for data pipelines in ways that JS isn't, causing situations where you do want to use multiple CPU cores with some shared memory. If you want to use multiple CPU cores in NodeJS, usually it's just a load-balancing webserver without IPC and you just use throng, or maybe you've got microservices.

Also, JS parallelism simply excelled from the start at waiting on tons of IO, there was no confusion about it. Python later got asyncio for this, and by now regular threads have too much momentum. Threads are the worst of both worlds in Py, cause you get the overhead of an OS thread and the possibility of race conditions without the full parallelism it's supposed to buy you. And all this stuff is confusing to users.

com2kid [3 hidden]5 mins ago
> I wonder why people never complained so much about JavaScript not having shared-everything threading. Maybe because JavaScript is so much faster that you don't have to reach for it as much. I wish more effort was put into baseline performance for Python.

Nobody sane tries to do math in JS. Backend JS is recommended for situations where processing is minimal and it is mostly lots of tiny IO requests that need to be shunted around.

I'm a huge JS/Node proponent and if someone says they need to write a backend service that crunches a lot of numbers, I'll recommend choosing a different technology!

For some reason Python peeps keep trying to do actual computations in Python...

frollogaston [3 hidden]5 mins ago
Python peeps tend to do heavy numbers calc in numpy, but sometimes you're doing expensive things with dictionaries/lists.
dhruvrajvanshi [3 hidden]5 mins ago
> I wonder why people never complained so much about JavaScript not having shared-everything threading. Maybe because JavaScript is so much faster that you don't have to reach for it as much. I wish more effort was put into baseline performance for Python.

This is a fair observation.

I think a part of the problem is that the things that make GIL-less Python hard are also the things that make faster baseline performance hard, i.e. an over-reliance of the ecosystem on the shape of the CPython data structures.

What makes python different is that a large percentage of python code isn't python, but C code targeting the CPython api. This isn't true for a lot of other interpreted languages.

isignal [3 hidden]5 mins ago
Processes can die independently, so the state of a concurrent shared-memory data structure can be difficult to manage when a process dies while modifying it under a lock. Postgres, which uses shared-memory data structures, can sometimes need to kill all its backend processes because it cannot fully recover from such a state.

In contrast, no one thinks about what happens if a thread dies independently because the failure mode is joint.

wongarsu [3 hidden]5 mins ago
> In contrast, no one thinks about what happens if a thread dies independently because the failure mode is joint.

In Rust if a thread holding a mutex dies the mutex becomes poisoned, and trying to acquire it leads to an error that has to be handled. As a consequence every rust developer that touches a mutex has to think about that failure mode. Even if in 95% of cases the best answer is "let's exit when that happens".

The operating system tends to treat your whole process as one unit and shut down everything or nothing. But a thread can still crash on its own due to unhandled OOM, assertion failures, or any number of other issues.

jcalvinowens [3 hidden]5 mins ago
> But a thread can still crash in its own due to unhandled oom, assertion failures or any number of other issues

That's not really true on POSIX. Unless you're doing nutty things with clone(), or you actually have explicit code that calls pthread_exit() or gettid()/pthread_kill(), the whole process is always going to die at the same time.

POSIX signal dispositions are process-wide, the only way e.g. SIGSEGV kills a single thread is if you write an explicit handler which actually does that by hand. Unhandled exceptions usually SIGABRT, which works the same way.

** Just to expand a bit: there is a subtlety in that, while dispositions are process-wide, one individual thread does indeed take the signal. If the signal is handled, only that thread sees -EINTR from a blocking syscall; but if the signal is not handled, the default disposition affects all threads in the process simultaneously no matter which thread is actually signalled.

wahern [3 hidden]5 mins ago
It would be nice if someday we got per-thread signal handlers to complement per-thread signal masking and per-thread alternate signal stacks.
jcalvinowens [3 hidden]5 mins ago
This is a solvable problem though, the literature is overflowing with lock-free implementations of common data structures. The real question is how much performance you have to sacrifice for the guarantee...
tinix [3 hidden]5 mins ago
shared memory only works on dedicated hardware.

if you're running in something like AWS Fargate, there is no shared memory. You have to use the network and file system, which adds a lot of latency, way more than spawning a process.

copying processes through fork is a whole different problem.

green threads and an actor model will get you much further in my experience.

bradleybuda [3 hidden]5 mins ago
Fargate is just a container runtime. You can fork processes and share memory like you can in any other Linux environment. You may not want to (because you are running many cheap / small containers) but if your Fargate containers are running 0.25 vCPUs then you probably don't want traditional multiprocessing or multithreading...
tinix [3 hidden]5 mins ago
Go try it and report back.

Fargate isn't just ECS and plain containers.

You cannot use shared memory in fargate, there is literally no /dev/shm.

See "sharedMemorySize" here: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/...

> If you're using tasks that use the Fargate launch type, the sharedMemorySize parameter isn't supported.

sgarland [3 hidden]5 mins ago
Well don’t use Fargate, there’s your problem. Run programs on actual servers, not magical serverless bullshit.
pansa2 [3 hidden]5 mins ago
Does removal of the GIL have any other effects on multi-threaded Python code (other than allowing it to run in parallel)?

My understanding is that the GIL has lasted this long not because multi-threaded Python depends on it, but because removing it:

- Complicates the implementation of the interpreter

- Complicates C extensions, and

- Causes single-threaded code to run slower

Multi-threaded Python code already has to assume that it can be pre-empted on the boundary between any two bytecode instructions. Does free-threaded Python provide the same guarantees, or does it require multi-threaded Python to be written differently, e.g. to use additional locks?
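
For concreteness, the kind of code in question (relies on individual operations on builtin containers being safe; as I understand PEP 703, free-threaded builds preserve this via per-object locks, while compound check-then-act sequences need explicit locks under either model):

    import threading

    results = []

    def worker(n):
        results.append(n * n)  # a single operation on a builtin list: safe either way

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(sorted(results))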

rfoo [3 hidden]5 mins ago
> Does free-threaded Python provide the same guarantees

Mostly. Some of the "can be pre-empted on the boundary between any two bytecode instructions" bugs are really hard to hit without free-threading, though. And without free-threading people don't use as much threading stuff. So by nature it exposes more bugs.

Now, my rants:

> have any other effects on multi-threaded Python code

It stops people from using multi-process workarounds. Hence, it simplifies user-code. IMO totally worth it to make the interpreter more complex.

> Complicates C extensions

The alternative (sub-interpreters) complicates C extensions more than free-threading does, and the single most important C extension in the entire ecosystem, numpy, has stated that it can't and doesn't want to support sub-interpreters. On the contrary, it already supports free-threading today and is actively sorting out remaining bugs.

> Causes single-threaded code to run slower

That's the trade-off. Personally I think a single digit percentage slow-down of single-threaded code worth it.

celeritascelery [3 hidden]5 mins ago
> That's the trade-off. Personally I think a single digit percentage slow-down of single-threaded code worth it.

Maybe. I would expect that 99% of python code going forward will still be single threaded. You just don’t need that extra complexity for most code. So I would expect that python code as a whole will have worse performance, even though a handful of applications will get faster.

rfoo [3 hidden]5 mins ago
That's the mindset that leads to the funny result that `uv pip` is like 10x faster than `pip`.

Is it because Rust is just fast? Nope. For anything after resolving dependency versions, raw CPU performance doesn't matter at all. It's that writing concurrent plus parallel code in Rust is easier: no need to spawn a few processes and wait for the interpreter to start in each, no need to constantly serialize whatever you want to run. So, someone did it!

Yet, there's a pip maintainer who actively sabotages free-threading work. Nice.

notpushkin [3 hidden]5 mins ago
> Yet, there's a pip maintainer who actively sabotages free-threading work.

Wow. Could you elaborate?

foresto [3 hidden]5 mins ago
As I recall, CPython has also been getting speed-ups lately, which ought to make up for the minor single-threaded performance loss introduced by free threading. With that in mind, the recent changes seem like an overall win to me.
pphysch [3 hidden]5 mins ago
But the bar to parallelizing code gets much lower, in theory. Your serial code got 5% slower but has a direct path to being 50% faster.

And if there's a good free-threaded HTTP server implementation, the RPS of "Python code as a whole" could increase dramatically.

fjasdfas [3 hidden]5 mins ago
You can do multiple processes with SO_REUSEPORT.

free-threaded makes sense if you need shared state.
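
E.g., a minimal sketch of that pattern (Linux; run several copies of the process and the kernel load-balances incoming connections across them):

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)  # several processes can bind the same port
    s.bind(('0.0.0.0', 8080))
    s.listen()
    while True:
        conn, _ = s.accept()
        conn.sendall(b'HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok')
        conn.close()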

pphysch [3 hidden]5 mins ago
Any webserver that wants to cache and reuse content cares about shared state, but usually has to outsource that to a shared in-memory database because the language can't support it.
weakfish [3 hidden]5 mins ago
Is there any news from FastAPI folks and/or Gunicorn on their support?
rocqua [3 hidden]5 mins ago
Note that there is an entire order of magnitude range for a 'single digit'.

A 1% slowdown seems totally fine. A 9% slowdown is pretty bad.

jacob019 [3 hidden]5 mins ago
Your understanding is correct. You can use all the cores but it's much slower per thread and existing libraries may need to be reworked. I tried it with PyTorch, it used 10x more CPU to do half the work. I expect these issues to improve, still great to see after 20 years wishing for it.
btilly [3 hidden]5 mins ago
It makes race conditions easier to hit, and that will require multi-threaded Python to be written with more care to achieve the same level of reliability.
pjmlp [3 hidden]5 mins ago
In other news, Microsoft dumped the whole Faster CPython team; apparently the 2025 earnings weren't enough to keep it around.

https://www.linkedin.com/posts/mdboom_its-been-a-tough-coupl...

Let's see what performance improvements still land on CPython, unless another company sponsors the work.

I guess Facebook (no need to correct me on the name) is still sponsoring part of it.

bgwalter [3 hidden]5 mins ago
They were quite a bit behind the schedule that was promised five years ago.

Additionally, at this stage the severe political and governance problems cannot have escaped Microsoft. I imagine that no competent Microsoft employee wants to give his expertise to CPython, only later to suffer group defamation from a couple of elected mediocre people.

CPython is an organization that overpromises, allocates jobs to the obedient and faithful while weeding out competent dissenters.

It wasn't always like that. The issues are entirely self-inflicted.

make3 [3 hidden]5 mins ago
Microsoft also fired a whole lot of other open source people unrelated to Python in this current layoff
pjmlp [3 hidden]5 mins ago
Notably MAUI, ASP.NET, TypeScript and AI frameworks.
biorach [3 hidden]5 mins ago
> CPython is an organization that overpromises, allocates jobs to the obedient and faithful while weeding out competent dissenters.

This stinks of BS

wisty [3 hidden]5 mins ago
It sounds like an oblique reference to that time they temporarily suspended one of the most valuable members of the community, apparently for having the audacity to suggest that their powers to suspend members of the community seemed a little arbitrary and open to abuse.
biorach [3 hidden]5 mins ago
Well they could just say that instead of wasting people's time with oblique references
robertlagrant [3 hidden]5 mins ago
Saying "This stinks of BS" is going to mean you have little standing to criticise other people for wasting time.
vlovich123 [3 hidden]5 mins ago
That’s unfortunate but I called it when people were claiming that Microsoft had committed to this effort for the long term.
mtzaldo [3 hidden]5 mins ago
Could we do a crowdfunding campaign so we can keep paying them? The whole world is/will benefit from their work.
morkalork [3 hidden]5 mins ago
Didn't Google lay off their entire Python development team in the last year as well? I wonder if there is some impetus behind both.
make3 [3 hidden]5 mins ago
doesn't print money right away = cut by executive #3442
rich_sasha [3 hidden]5 mins ago
Ah that's very, very sad. I guess they have embraced and extended, there's only one thing left to do.
stusmall [3 hidden]5 mins ago
That shows a misunderstanding of what EEE was. This team was sending changes upstream, which is the exact opposite of the "extend" step of the strategy. The idea of "extend" was to add proprietary extensions on top of an open standard/project, locking customers into the MSFT implementation.
jerrygenser [3 hidden]5 mins ago
Ok so a better example of what you describe might be vscode.
nothrabannosir [3 hidden]5 mins ago
What existing open standard did VSCode embrace? I thought Microsoft created v0 themselves.

A classic example is ActiveX.

biorach [3 hidden]5 mins ago
> A classic example is ActiveX.

Nah, even that was based on earlier MS technologies - OLE and COM

A good starter list of EEE plays is on the wikipedia page: https://en.wikipedia.org/wiki/Embrace,_extend,_and_extinguis...

nothrabannosir [3 hidden]5 mins ago
Funny you linked that page because that’s where I got activex from :D

> Examples by Microsoft

> Browser incompatibilities

> The plaintiffs in an antitrust case claimed Microsoft had added support for ActiveX controls in the Internet Explorer Web browser to break compatibility with Netscape Navigator, which used components based on Java and Netscape's own plugin system.

biorach [3 hidden]5 mins ago
ah ok, sorry. I thought you were saying that they tried an EEE play on ActiveX.

You meant they used ActiveX in an EEE play in the browser wars.

nothrabannosir [3 hidden]5 mins ago
Honestly I kept it vague because I didn't actually know so your call-out was totally valid. I know it better now than without your clarification so thanks :+1:
JacobHenner [3 hidden]5 mins ago
VSCode displaced Atom, pre-GitHub acquisition, by building on top of Atom's rendering engine Electron.
biorach [3 hidden]5 mins ago
At this stage the cliched and clueless comments about embrace/extend/extinguish are tiresome and inevitable whenever Microsoft is mentioned.

A few decades ago MS did indeed have a playbook which they used to undermine open standards. Laying off some members of the Python team bears no resemblance whatsoever to that. At worst it will delay the improvement of free-threaded Python. That's all.

Your comment is lazy and unfounded.

kstrauser [3 hidden]5 mins ago
cough Bullshit cough

* VSCode got popular and they started preventing forks from installing its extensions.

* They extended the free-software pyright language server into the proprietary Pylance. They don’t even sell it. It’s just there to make the FOSS version less useful.

* They bought GitHub and started rate limiting it for logged-out visitors.

Every time Microsoft touches a thing, they end up locking it down. They can’t help it. It’s their nature. And if you’re the frog carrying that scorpion across the pond and it stings you, well, you can only blame it so much. You knew this when they offered the deal.

Every time. It hasn’t changed substantially since they declared that Linux is cancer, except to be more subtle in their attacks.

biorach [3 hidden]5 mins ago
None of those were independent projects or open standards. VScode and pyright are both MS projects from the get-go.

Sabotaging forks is scummy, but the forks were extending MS functionality, not the other way around.

GitHub was a private company before it was bought by MS. Rate limiting is.... not great, but certainly not an extinguish play.

EEE refers to the subversion of open standards or independent free software projects. It does not apply to any of the above.

MS are still scummy but at least attack them on their own demerits, and don't parrot some schtick from decades ago.

kstrauser [3 hidden]5 mins ago
It’s not just EEE, though. They have a history of getting devs all in on a thing and then killing it with corporate-grade ADHD. They bought Visual FoxPro, got bored with it, and told everyone to rewrite into Visual Basic (which they then killed). Then the future was Silverlight, until it wasn’t. There are a thousand of these things that weren’t deliberately evil in the EEE sense, but they defined the word rugpull before we called it that.

So even without EEE, I think it’s supremely risky to hitch your wagon to their tech or services (unless you’re writing primarily for Windows, which is what they’d love to help you migrate to). And I can’t be convinced the GitHub acquisition wasn’t some combination of these dark patterns.

Step 1: Get a plurality of the world’s FOSS into one place.

Step 2: Feed it into a LLM and then embed it in a popular free editor so that everyone can use GPL code without actually having to abide the license.

Step 3: Make it increasingly hard to use for FOSS development by starting to add barriers a little at a time. <= we are here

As a developer, they’ve done nothing substantial to earn my trust. I think a lot of Microsoft employees are good people who don’t subscribe to all this and who want to do the right thing, but corporate culture just won’t let that be.

biorach [3 hidden]5 mins ago
> I think it’s supremely risky to hitch your wagon to their tech or services

OK, finally, yes, this is very true, for specific parts of their tech.

But banging on about EEE just distracts from this, more important message.

> Make it increasingly hard to use for FOSS development by starting to add barriers a little at a time. <= we are here

....and now you've lost me again

kstrauser [3 hidden]5 mins ago
Note I wasn’t the one who said EEE upstream. I was just replying to the thread.

Hanlon’s razor is a thing, and I generally follow it. It’s just that I’ve seen Microsoft make so many “oops, our bad!” mistakes over the years that purely coincidentally gave them an edge up over their competition, that I tend to distrust such claims from them.

I don’t feel that way about all corps. Oracle doesn’t make little mistakes that accidentally harm the competition while helping themselves. No, they’ll look you in the eye and explain that they’re mugging you while they take your wallet. It’s kind of refreshingly honest in its own way.

dhruvrajvanshi [3 hidden]5 mins ago
> Oracle doesn’t make little mistakes that accidentally harm the competition while helping themselves. No, they’ll look you in the eye and explain that they’re mugging you while they take your wallet. It’s kind of refreshingly honest in its own way.

Fucking hell bud :D

kstrauser [3 hidden]5 mins ago
Tell me I'm wrong! :D
oblio [3 hidden]5 mins ago
I actually hate this trope more because of what it says about the poster. Which I guess would be that they're someone wearing horse blinders.

There's a part of me that wants to scream at them:

"Look around you!!! It's not 1999 anymore!!! These days we have Google, Amazon, Apple, Facebook, etc, which are just as bad if not worse!!! Cut it out with the 20+ year old bad jokes!!!"

Yes, Microsoft is bad. The reason Micr$oft was the enemy back in the day is because they... won. They were bigger than anyone else in the fields that mattered (except for server-side, where they almost won). Now they're just one in a gang of evils. There's nothing special about them anymore. I'm more scared of Apple and Google.

kstrauser [3 hidden]5 mins ago
That’s only reasonable if you believe you can only distrust one company at a time. I distrust every one you mentioned there, for different reasons, in different ways. I don’t think that Apple is trying to exclusively own the field of programming tools to their own profit, nor do I think that Facebook is. I don’t think Apple is trying to own all data about every human. I don’t think Microsoft is trying to force all vendors to sell through their app store.

But the thing is that Microsoft hasn’t seemed to fundamentally change since 1999. They appear kinder and friendlier but they keep running the same EEE playbook everywhere they can. Lots of us give them a free pass because they let us run a nifty free-for-now programming editor. That doesn’t change the leopard’s spots, though.

mixmastamyk [3 hidden]5 mins ago
All these posts and no one mentioned their numerous, recent, abusive deeds around Windows or negligent security posture, all the while having captured Uncle Sam and other governments.

MS has continued to metastasize and is in some ways worse than the old days, even if they’ve finally accepted the utility of open source as a loss leader.

They have the only BigTech products I’ve been forced to use if I want to eat.

oblio [3 hidden]5 mins ago
Yet I only ever see these tired EEE memes for Microsoft when Chrome is basically the web, for example.
kstrauser [3 hidden]5 mins ago
I don’t know what to tell you, except that you obviously haven’t read a lot of my stuff on that topic. (Not that I would expect anyone to have, mind you. I’m nobody.) I agree with you. I only use Chrome when I must, like when I’m updating a Meshtastic radio and the flasher app doesn’t run on Firefox or Safari.

I’m not anti-MS as much as anti their behavior, whoever is acting that way. This thread is directly related to MS so I’m expressing my opinion on MS here. I’ll be more than happy to share my thoughts on Chrome in a Google thread.

falcor84 [3 hidden]5 mins ago
It wouldn't have bothered me if you just said "Facebook" - I probably wouldn't have even noticed it. But I'm really curious why you chose to write "Facebook", then apparently noticed the issue, and instead of replacing it with "Meta" decided to add the much longer "(no need to correct me on the name)". What axe are you grinding?
pjmlp [3 hidden]5 mins ago
Yes, because I am quite certain someone without anything better to do would correct me on that.

For me Facebook will always be Facebook, and Twitter will always be Twitter.

rbanffy [3 hidden]5 mins ago
> Twitter will always be Twitter.

If Elon can deadname his daughter, then we can deadname his company.

kstrauser [3 hidden]5 mins ago
That’s the rationale I’ve been using.
falcor84 [3 hidden]5 mins ago
> Yes, because I am quite certain someone without anything better to do would correct me on that.

Well, you sure managed to avoid that by setting up camp on that hill. Kudos on so much time saved.

> For me Facebook will always be Facebook, and Twitter will always be Twitter.

Well, for me the product will always be "Thefacebook", but that's since I haven't used it since. But I do respect that there's a company running it now that does more stuff and contributes to open source projects.

biorach [3 hidden]5 mins ago
> Well, you sure managed to avoid that by setting up camp on that hill. Kudos on so much time saved.

Why are you picking a fight about this?

falcor84 [3 hidden]5 mins ago
I think I'm taking it personally because I had previously changed my name and had people repeatedly call me by my old name just to annoy/hurt me.

Obviously I know that companies aren't people and don't have feelings, but I can't understand why you would intentionally avoid using their chosen name, even when it's more effort to you.

kstrauser [3 hidden]5 mins ago
I wouldn’t do that to a person. I’m not worried about hurting Twitter’s feelings, though.
Flamentono2 [3 hidden]5 mins ago
With money which destroyed our society.
AlexanderDhoore [3 hidden]5 mins ago
Am I the only one who sort of fears the day when Python loses the GIL? I don't think Python developers know what they’re asking for. I don't really trust complex multithreaded code in any language. Python, with its dynamic nature, I trust least of all.
jillesvangurp [3 hidden]5 mins ago
You are not the only one who is afraid of changes and a bit change-resistant. I think the issue here is that the reasons for this fear are not very rational, and the interest of the wider community is to deal with technical debt. The GIL is pure technical debt: defensible 30 years ago, a bit awkward 20 years ago, and downright annoying and embarrassing now that world + dog has been doing all their AI data processing with Python at scale for the last 10. It had to go in the interest of future-proofing the platform.

What changes for you? Nothing, unless you start using threads. You probably weren't using threads anyway, because there was little point to using them in Python. Most Python code bases completely ignore the threading module and instead use non-blocking IO, async, or similar things. The GIL thing only kicks in if you actually use threads.

If you don't use threads, removing the GIL changes nothing. There's no code that will break. All those C libraries that aren't thread safe are still single threaded, etc. Only if you now start using threads do you need to pay attention.

There's some threaded Python code, of course, that people may have written somewhat naively in the hope that it would make things faster, which is constantly hitting the GIL and is effectively single-threaded. That code now might run a little faster. And probably with more bugs, because naive threaded code tends to have those.

But a simple solution to address your fears: simply don't use threads. You'll be fine.

Or learn how to use threads. Because now you finally can and it isn't that hard if you have the right abstractions. I'm sure those will follow in future releases. Structured concurrency is probably high on the agenda of some people in the community.

dkarl [3 hidden]5 mins ago
> What changes for you? Nothing unless you start using threads

Coming from the Java world, you don't know what you're missing. Looking inside an application and seeing a bunch of threadpools managed by competing frameworks, debugging timeouts and discovering that tasks are waiting more than a second to get scheduled on the wrong threadpool, tearing your hair out because someone split a tiny sub-10μs bit of computation into two tasks and scheduling the second takes a hundred times longer than the actual work done, adding a library for a trivial bit of functionality and discovering that it spins up yet another threadpool when you initialize it.

(I'm mostly being tongue in cheek here because I know it's nice to have threading when you need it.)

rbanffy [3 hidden]5 mins ago
> There's some threaded python code of course

A fairly common pattern for me is to start a terminal UI updating thread that redraws the UI every second or so while one or more background threads do their thing. Sometimes, it’s easier to express something with threads and we do it not to make the process faster (we kind of accept it will be a bit slower).

The real enemy is state that can be mutated from more than one place. As long as you know who can change what, threads are not that scary.

HDThoreaun [3 hidden]5 mins ago
> But a simple solution to address your fears: simply don't use threads. You'll be fine.

I'm not worried about new code. I'm worried about stuff written 15 years ago by a monkey who had no idea how threads work and just read something on Stack Overflow that said to use threading. This code will likely break when run post-GIL. I suspect there is actually quite a bit of it.

bayindirh [3 hidden]5 mins ago
Software rots, software tools evolve. When Intel released performance primitives libraries which required recompilation to analyze multi-threaded libraries, we were amazed. Now, these tools are built into processors as performance counters and we have way more advanced tools to analyze how systems behave.

Older code will break, but code breaks all the time. A language changes how something behaves in a new revision, and suddenly 20-year-old bedrock tools are getting massively patched to accommodate both new and old behavior.

Is it painful, ugly, unpleasant? Yes, yes and yes. However change is inevitable, because some of the behavior was rooted in inability to do some things with current technology, and as hurdles are cleared, we change how things work.

My father's friend told me that the length of a variable's name used to affect compile/link times. Now we can test whether we have memory leaks in Rust, something that was impossible 15 years ago due to the performance of the processors.

zahlman [3 hidden]5 mins ago
> A language changes how something behaves in a new revision, suddenly 20 year old bedrock tools are getting massively patched to accommodate both new and old behavior.

In my estimation, the only "20 year old bedrock tools" in Python are in the standard library - which currently holds itself free to deprecate entire modules in any minor version, and remove them two minor versions later - note that this is a pseudo-calver created by a coincidentally annual release cadence. (A bunch of stuff that old was taken out recently, but it can't really be considered "bedrock" - see https://peps.python.org/pep-0594/).

Unless you include NumPy's predecessors when dating it (https://en.wikipedia.org/wiki/NumPy#History). And the latest versions of NumPy don't even support Python 3.9 which is still not EOL.

Requests turns 15 next February (https://pypi.org/project/requests/#history).

Pip isn't 20 years old yet (https://pypi.org/project/pip/#history) even counting the version 0.1 "pyinstall" prototype (not shown).

Setuptools (which generally supports only the Python versions supported by CPython, hasn't supported Python 2.x since version 45 and is currently on version 80) only appears to go back to 2006, although I can't find release dates for versions before what's on PyPI (their own changelog goes back to 0.3a1, but without dates).

spookie [3 hidden]5 mins ago
The other day I compiled a 1989 C program and it did the job.

I wish more things were like that. Tired of building things on shaky grounds.

rbanffy [3 hidden]5 mins ago
If you go into mainframes, you'll compile code that was written 50 years ago without issue. In fact, you'll run code that was compiled 50 years ago and all that'll happen is that it'll finish much sooner than it did on the old 360 it originally ran on.
cestith [3 hidden]5 mins ago
My only concern is that this kind of change in semantics for existing syntax is more worthy of a major revision than a point release.
rbanffy [3 hidden]5 mins ago
It's opt-in at the moment. It won't be the default behavior for a couple releases.

Maybe we'll get Python 4 with no GIL.

/me ducks

delusional [3 hidden]5 mins ago
> Software rots

No it does not. I hate that analogy so much because it leads to such bad behavior. Software is a digital artifact that does not degrade. With the right attitude, you'd be able to execute the same binary on new machines for as long as you desired. That is not true of organic matter that actually rots.

The only reason we need to change software is that we trade that off against something else. Instructions are reworked, because chasing the universal Turing machine takes a few sacrifices. If all software has to run on the same hardware, those two artifacts have to have a dialogue about what they need from each other.

If we didn't want the universal machine to do anything new, and we had a valuable product, we could just keep making the machine that executes that product. It never rots.

zahlman [3 hidden]5 mins ago
>execute the same binary

Only if you statically compile or don't upgrade your dependencies. Or don't allow your dependencies to innovate.

dahcryn [3 hidden]5 mins ago
Yes, it does.

If software is implicitly built on wrong understanding or undefined behaviour, I consider it rotting when it starts to fall apart as those undefined behaviours get defined. We do not need to sacrifice a stable future because of a few 15-year-old programs. Let the people who care about the value those programs bring manage the update cycle and fix them.

eblume [3 hidden]5 mins ago
Software is written with a context, and the context degrades. It must be renewed. It rots, sorry.
igouy [3 hidden]5 mins ago
You said it's the context that rots.
bayindirh [3 hidden]5 mins ago
It's a matter of perspective, I guess...

When you look from the program's perspective, the context changes and becomes unrecognizable, IOW, it rots.

When you look from the context's perspective, the program changes by not evolving and keeping up with the context, IOW, it rots.

Maybe we anthropomorphize both and say "they grow apart". :)

igouy [3 hidden]5 mins ago
We say the context has breaking changes.

We say the context is not backwards compatible.

indymike [3 hidden]5 mins ago
> > Software rots

> No it does not.

I'm thankful that it does, or I would have been out of work long ago. It's not that the files change (literal rot), it is that hardware, OSes, libraries, and everything else changes. I'm also thankful that we have not stopped innovating on all of the things the software I write depends on. You know, another thing changes - what we are using the software for. The accounting software I wrote in the late 80s... would produce financial reports that were what was expected then, but would not meet modern GAAP requirements.

rocqua [3 hidden]5 mins ago
Fair point, but there is an interesting question posed.

Software doesn't rot, it remains constant. But the context around it changes, which means it loses usefulness slowly as time passes.

What is the name for this? You could say 'software becomes anachronistic'. But is there a good verb for that? It certainly seems like something that a lot more than just software experiences. Plenty of real world things that have been perfectly preserved are now much less useful because the context changed. Consider an Oxen-yoke, typewriters, horse-drawn carriages, envelopes, phone switchboards, etc.

It really feels like this concept should have a verb.

igouy [3 hidden]5 mins ago
obsolescence
kstrauser [3 hidden]5 mins ago
That’s not what the phrase implies. If you have a C program from 1982, you can still compile it on a 1982 operating system and toolchain and it’ll work just as before.

But if you tried to compile it on today’s libc, making today’s syscalls… good luck with that.

Software “rots” in the sense that it has to be updated to run on today’s systems. They’re a moving target. You can still run HyperCard on an emulator, but good luck running it unmodded on a Mac you buy today.

zahlman [3 hidden]5 mins ago
> You can still run HyperCard on an emulator, but good luck running it unmodded on a Mac you buy today.

I grew up with HyperCard, so I had a moment of sadness here.

bgwalter [3 hidden]5 mins ago
If it is C-API code: Implicit protection of global variables by the GIL is a documented feature, which makes writing extensions much easier.

Most C extensions that will break are not written by monkeys, but by conscientious developers that followed best practices.

actinium226 [3 hidden]5 mins ago
If code has been unmaintained for more than a few years, it's usually such a hassle to get it working again that 99% of the time I'll just write my own solution, and that's without threads.

I feel some trepidation about threads, but at least for debugging purposes there's only one process to attach to.

zahlman [3 hidden]5 mins ago
>Im worried about stuff written 15 years ago

Please don't - it isn't relevant.

15 years ago, new Python code was still dominantly for 2.x. Even code written back then with an eye towards 3.x compatibility (or, more realistically, lazily run through `2to3` or `six`) will have quite little chance of running acceptably on 3.14 regardless. There have been considerable removals from the standard library, `async` is no longer a valid identifier name (you laugh, but that broke Tensorflow once). The attitude taken towards """strings""" in a lot of 2.x code results in constructs that can be automatically made into valid syntax that appears to preserve the original intent, but which are not at all automatically fixed.

Also, the modern expectation is of a lock-step release cadence. CPython only supports up to the last 5 versions, released annually; and whenever anyone publishes a new version of a package, generally they'll see no point in supporting unsupported Python versions. Nor is anyone who released a package in the 3.8 era going to patch it if it breaks in 3.14 - because support for 3.14 was never advertised anyway. In fact, in most cases, support for 3.9 wasn't originally advertised, and you can't update the metadata for an existing package upload (you have to make a new one, even if it's just a "post-release") even if you test it and it does work.

Practically speaking, pure-Python packages usually do work in the next version, and in the next several versions, perhaps beyond the support window. But you can really never predict what's going to break. You can only offer a new version when you find out that it's going to break - and a lot of developers are going to just roll that fix into the feature development they were doing anyway, because life's too short to backport everything for everyone. (If there's no longer active development and only maintenance, well, good luck to everyone involved.)

If 5 years isn't long enough for your purposes, practically speaking you need to maintain an environment with an outdated interpreter, and find a third party (RedHat seems to be a popular choice here) to maintain it.

dhruvrajvanshi [3 hidden]5 mins ago
> Im not worried about new code. Im worried about stuff written 15 years ago by a monkey who had no idea how threads work and just read something on stack overflow that said to use threading. This code will likely break when run post-GIL. I suspect there is actually quite a bit of it.

I was with OP's point but then you lost me. You'll always have to deal with that coworker's shitty code, GIL or not.

Could they make a worse mess with multi threading? Sure. Is their single threaded code as bad anyway because at the end of the day, you can't even begin understand it? Absolutely.

But yeah, I think Python people don't know what they're asking for. They think GIL-less Python is gonna give everyone free puppies.

bayindirh [3 hidden]5 mins ago
More realistically, as happened in the ML/AI scene, the knowledgeable people will write the complex libraries and hand them down to scientists and other less experienced, or risk-averse, developers (which is not a bad thing).

With the critical mass Python acquired over the years, the GIL becomes a very sore bottleneck in some cases. This is why I decided to learn Go, for example: a properly threaded (and green-threaded) language, higher level than C/C++ but lower than Python, which allows me to do things I can't do with Python. Compilation was another reason, but it was secondary to threading.

quectophoton [3 hidden]5 mins ago
I don't want to add more to your fears, but also remember that LLMs have been trained on decades worth of Python code that assumes the presence of the GIL.
rocqua [3 hidden]5 mins ago
This could, indeed, be quite catastrophic.

I wonder if companies will start adding this to their system prompts.

zahlman [3 hidden]5 mins ago
Suppose they do. How is the LLM supposed to build a model of what will or won't break without a GIL purely from a textual analysis?

Especially when they've already been force-fed with ungodly amounts of buggy threaded code that has been mistakenly advertised as bug-free simply because nobody managed to catch the problem with a fuzzer yet (and which is more likely to expose its faults in a no-GIL environment, even though it's still fundamentally broken with a GIL)?

miohtama [3 hidden]5 mins ago
GIL or no-GIL concerns only people who want to run multicore workloads. If you are not already spending time threading or multiprocessing your code, there is practically no change. Most race condition issues you need to think about are there regardless of the GIL.
fulafel [3 hidden]5 mins ago
A lot of Python usage is leveraging libraries with parallel kernels inside written in other languages. A subset of those is bottlenecked on Python side speed. A sub-subset of those are people who want to try no-GIL to address the bottleneck. But if non-GIL becomes pervasive, it could mean Python becomes less safe for the "just parallel kernels" users.
kccqzy [3 hidden]5 mins ago
Yes, sure. Thought experiment: what happens when these parallel kernels suddenly need to call back into Python? Let's say you have a multithreaded sorting library. If you are sorting numbers, then fine, nothing changes. But if you are sorting objects, you need to use a single thread, because you need to call PyObject_RichCompare. These new parallel kernels will then try to call PyObject_RichCompare from multiple threads.
immibis [3 hidden]5 mins ago
With the GIL, multithreaded Python gives concurrent I/O without worrying about data structure concurrency (unless you do I/O in the middle of it). It's a lot like async in this way: data structure manipulation is atomic between "await" expressions, except the "await" is implicit, and you might have written one without realizing it, in which case you have a bug. Meanwhile you still get to use threads to handle several concurrent I/O operations. I bet a lot of Python code is written this way and will start randomly crashing if the data manipulation becomes non-atomic.
OskarS [3 hidden]5 mins ago
That doesn't match with my understanding of free-threaded Python. The GIL is being replaced with fine-grained locking on the objects themselves, so sharing data-structures between threads is still going to work just fine. If you're talking about concurrency issues like this causing out-of-bounds errors:

    if len(my_list) > 5:
        print(my_list[5])
(i.e. because a different thread can pop from the list in-between the check and the print), that could just as easily happen today. The GIL makes sure that only one python interpreter runs at once, but it's entirely possible that the GIL is released and switches to a different thread after the check but before the print, so there's no extra thread-safety issue in free-threaded mode.

The problems (as I understand it, happy to be corrected) are mostly two-fold: performance and ecosystem. Using fine-grained locking is potentially much less efficient than using the GIL in the single-threaded case (you have to take and release many more locks, and reference count updates have to be atomic), and many, many C extensions are written under the assumption that the GIL exists.
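
The fix for that class of bug is the same with or without the GIL: hold a lock across the whole compound operation. A sketch (`safe_peek` is just an illustrative name):

    import threading

    lock = threading.Lock()

    def safe_peek(my_list):
        with lock:  # check and index under one lock, so no pop can intervene
            if len(my_list) > 5:
                return my_list[5]
        return None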

rowanG077 [3 hidden]5 mins ago
AFAIK the only guarantee there is, is that a bytecode instruction is atomic. Built-in data structures are mostly safe, I think, on a per-operation level. But combining them is not. I think by default, every few milliseconds the interpreter checks for other threads to run, even if there is no IO or async action. See `sys.getswitchinterval()`.
ynik [3 hidden]5 mins ago
Bytecode instructions have never been atomic in Python's past. It was always possible for the GIL to be temporarily released, then reacquired, in the middle of operations implemented in C. This happens because C code is often manipulating the reference count of Python objects, e.g. via the `Py_DECREF` macro. But when a reference count reaches 0, this might run a `__del__` function implemented in Python, which means the "between bytecode instructions" thread switch can happen inside that reference-counting-operation. That's a lot of possible places!

Even more fun: allocating memory could trigger Python's garbage collector, which would also run `__del__` functions. So every allocation was also a possible (but rare) thread switch.

The GIL was only ever intended to protect Python's internal state (esp. the reference counts themselves); any extension modules assuming that their own state would also be protected were likely already mistaken.

rowanG077 [3 hidden]5 mins ago
Well, I didn't think of this myself. It's literally what the official Python docs say:

> A global interpreter lock (GIL) is used internally to ensure that only one thread runs in the Python VM at a time. In general, Python offers to switch among threads only between bytecode instructions; how frequently it switches can be set via sys.setswitchinterval(). Each bytecode instruction and therefore all the C implementation code reached from each instruction is therefore atomic from the point of view of a Python program.

https://docs.python.org/3/faq/library.html#what-kinds-of-glo...

If this is not the case, please let the Python team know their documentation is wrong. It does indeed state that if Py_DECREF is invoked, all bets are off. But a ton of operations never do that.

hamandcheese [3 hidden]5 mins ago
This is the nugget of information I was hoping for. So indeed even GIL threaded code today can suffer from concurrency bugs (more so than many people here seem to think).
imtringued [3 hidden]5 mins ago
You start talking about GIL and then you talk about non-atomic data manipulation, which happen to be completely different things.

The only code that is going to break because of "no GIL" is C extensions, and for a very obvious reason: C code can now be entered from multiple threads at once, which wasn't possible before. Python code could always be called from multiple Python threads, even in the presence of the GIL.

tialaramex [3 hidden]5 mins ago
You're not the only one. David Baron's note certainly applies: https://bholley.net/blog/2015/must-be-this-tall-to-write-mul...

Even in a language conceived for this kind of work, it's not as easy as you'd like. In most languages you're going to write nonsense which has no coherent meaning whatsoever. Experiments show that humans can't successfully understand non-trivial programs unless they exhibit Sequential Consistency - that is, unless they can be understood as if all the things which happen do happen in some particular order. That's not the reality of how the machine works, for subtle reasons, but without it mere human programmers are like "Eh, no idea, I guess everything is computer?". In most of these languages it's really easy to write concurrent programs which do not satisfy this requirement; you just can't debug them or reason about what they do - a disaster.

As I understand it, Python without the GIL will enable more programs that lose SC.

frollogaston [3 hidden]5 mins ago
What reliance did you have in mind? All sorts of calls in Python can release the GIL, so you already need locking, and there are race conditions just like in most languages. It's not like JS where your code is guaranteed to run in order until you "await" something.

I don't fully understand the challenge with removing it, but thought it was something about C extensions, not something most users have to directly worry about.

bratao [3 hidden]5 mins ago
This is a common misconception and very badly communicated. The GIL does not make Python code thread-safe. It only protects internal CPython state. Multi-threaded Python code is not thread-safe today.
amelius [3 hidden]5 mins ago
Well, I think you can manipulate a dict from two different threads in Python, today, without any risk of segfaults.
pansa2 [3 hidden]5 mins ago
You can do so in free-threaded Python too, right? The dict is still protected by a lock, but one that’s much more fine-grained than the GIL.
amelius [3 hidden]5 mins ago
Sounds good, yes.
porridgeraisin [3 hidden]5 mins ago
Internal CPython state also includes, say, a dictionary's internal state, so for practical purposes each individual operation is safe. Of course, TOCTOU (time-of-check to time-of-use) bugs, stale reads, and various race conditions are not (and can never be) protected by the GIL.
kevingadd [3 hidden]5 mins ago
This should not have been downvoted. It's true that the GIL does not make Python code implicitly thread-safe: you have to either construct your code carefully to be atomic (based on knowledge of how the GIL works) or make use of mutexes, semaphores, etc. It's just memory-safe, and can still have races etc.
qznc [3 hidden]5 mins ago
Worst case is probably that it turns out like a "Python 4": things break when people try to update to no-GIL, so they'd rather stay with the old version for decades.
freeone3000 [3 hidden]5 mins ago
I'm sure you'll be happy using the last language that has to fork() in order to thread. We've only had consumer-level multicore processors for 20 years, after all.
im3w1l [3 hidden]5 mins ago
You have to understand that people come at Python from very different angles. Some people write web servers in Python, where speed equals money saved. Other people write little UI apps where speed is a complete non-issue. Yet others write AI/ML code that spends most of its time in GPU code, but then they want to do just a little data massaging in Python, which can easily bottleneck the whole thing. And some people write scripts that don't use a .env but rather OS libraries.
odiroot [3 hidden]5 mins ago
It's called job security. We'll be rewriting decades of code that's broken by that transition.
NortySpock [3 hidden]5 mins ago
I hope at least the option remains to enable the GIL, because I don't trust me to write thread-safe code on the first few attempts.
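For what it's worth, the option does remain: on the free-threaded builds the GIL can be switched back on per process, and there's an introspection hook (a sketch; the "t" suffix denotes the free-threaded build):

    # Run as: PYTHON_GIL=1 python3.13t app.py
    #     or: python3.13t -X gil=1 app.py
    import sys

    print(sys._is_gil_enabled())   # True when the GIL is active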
zem [3 hidden]5 mins ago
this looks extremely promising https://microsoft.github.io/verona/pyrona.html
dotancohen [3 hidden]5 mins ago
As a Python dabbler, what should I be reading to ensure my multi-threaded code in Python is in fact safe?
cess11 [3 hidden]5 mins ago
The literature on distributed systems is huge, and what you ought to do depends a lot on your use case. If you're lucky you can avoid shared state, as in no race conditions at either end of your executions.

https://www.youtube.com/watch?v=_9B__0S21y8 is fairly concise and gives some recommendations for literature and techniques. It obviously makes an effort to promote PlusCal/TLA+ along the way, but it showcases how even apparently simple algorithms can be problematic, as well as how deep analysis has to go to get you a guarantee that the execution will be bug-free.

dotancohen [3 hidden]5 mins ago
My current concern is a CRUD interface that transcribes audio in the background. The transcription is triggered by user action. I need the "transcription" field disabled until the transcript is complete and stored in the database, and then to allow the user to edit the transcription in the UI.

Of course, while the transcription is running, the rest of the UI (Qt via PySide) should remain usable. And multiple transcription requests should be supported; I'm thinking of a pool of transcription threads, but I'm uncertain how many to allocate. Half the number of CPUs? All the CPUs under 50% load?

Advice welcome!

realreality [3 hidden]5 mins ago
Use `concurrent.futures.ThreadPoolExecutor` to submit jobs, and `Future.add_done_callback` to flip the transcription field when the job completes.
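A minimal sketch of that pattern, with the worker and UI-hook names made up (note the caveat in the reply below about which thread the callback runs on):

    from concurrent.futures import ThreadPoolExecutor

    executor = ThreadPoolExecutor(max_workers=4)

    def transcribe(path):            # hypothetical: the slow, blocking work
        ...
        return "transcript text"

    def on_done(future):
        # NB: this runs on a worker thread, not the UI thread
        set_transcription_field(future.result())   # hypothetical UI hook

    job = executor.submit(transcribe, "clip.wav")
    job.add_done_callback(on_done)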
ptx [3 hidden]5 mins ago
Although keep in mind that the callback will be "called in a thread belonging to the process" (say the docs), presumably some thread that is not the UI thread. So the callback needs to post an event to the UI thread's event queue, where it can be picked up by the UI thread's event loop and only then perform the UI updates.

I don't know how that's done in Pyside, though. I couldn't find a clear example. You might have to use a QThread instead to handle it.

dotancohen [3 hidden]5 mins ago
Thank you. Perhaps I should trigger the transcription thread from the UI thread, then? It is a UI button that initiates it after all.
ptx [3 hidden]5 mins ago
The tricky part is coming back onto the UI thread when the background work finishes. Your transcription thread has to somehow trigger the UI work to be done on the UI thread.

It seems the way to do it in Qt is with signals and slots, emitting a signal from your QThread and binding it to a slot in the UI thread, making sure to specify a "queued connection" [1]. There's also a lower-level postEvent method [2] but people disagree [3] on whether that's OK to call from a regular Python thread or has to be called from a QThread.

So I would try doing it with Qt's thread classes, not with concurrent.futures.
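A minimal sketch of that approach with PySide6 (the class, slot, `window`, and `run_transcription` names are all made up):

    from PySide6.QtCore import QThread, Signal, Qt

    class TranscribeWorker(QThread):
        done = Signal(str)

        def run(self):                     # executes on the new thread
            text = run_transcription()     # hypothetical: the slow part
            self.done.emit(text)

    worker = TranscribeWorker()
    # window.on_transcript lives on the UI thread; the queued connection
    # tells Qt to deliver the signal through the UI thread's event loop.
    worker.done.connect(window.on_transcript,
                        Qt.ConnectionType.QueuedConnection)
    worker.start()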

[1] https://doc.qt.io/qt-5/threads-synchronizing.html#high-level...

[2] https://doc.qt.io/qt-6/qcoreapplication.html#postEvent

[3] https://www.mail-archive.com/pyqt@riverbankcomputing.com/msg...

dotancohen [3 hidden]5 mins ago
Thank you.
sgarland [3 hidden]5 mins ago
Just use multiprocessing. If each job is independent and you aren’t trying to spread it out over multiple workers, it seems much easier and less risky to spawn a worker for each job.

Use SharedMemory to pass the data back and forth.
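A minimal sketch of the SharedMemory handoff (buffer size and payload are placeholders):

    from multiprocessing import Process
    from multiprocessing.shared_memory import SharedMemory

    def worker(name):
        shm = SharedMemory(name=name)   # attach to the parent's block
        shm.buf[0] = 42                 # write the result in place, no copy
        shm.close()

    if __name__ == "__main__":
        shm = SharedMemory(create=True, size=1024)
        p = Process(target=worker, args=(shm.name,))
        p.start()
        p.join()
        print(shm.buf[0])               # 42, written by the child
        shm.close()
        shm.unlink()                    # release the block when done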

HDThoreaun [3 hidden]5 mins ago
Honestly, unless you're willing to devote a solid 4+ hours to learning about multithreading, stick with asyncio.
dotancohen [3 hidden]5 mins ago
I'm willing to invest an afternoon learning. That's been the premise of my entire career!
txdv [3 hidden]5 mins ago
How does the language being dynamic negatively affect the complexity of multithreading?
jerf [3 hidden]5 mins ago
I have a hypothesis that being dynamic has no particular effect on the complexity of multithreading. I think the apparent effect is a combination of two things: 1. All our dynamic scripting languages in modern use date from the 1990s before this degree of threading was a concern for the languages and 2. It is really hard to retrofit code written for not being threaded to work in a threaded context, and the "deeper" the code in the system the harder it is. Something like CPython is about as "deep" as you can go, so it's really, really hard.

I think if someone set out to write a new dynamic scripting language today, from scratch, multithreading it would not pose any particular challenge. Beyond the fact that it's naturally a difficult problem, I mean, but nothing special compared to the many other languages that have implemented threading. The problem is all that code from before the threading era, not the threading itself. And Python has a loooot of that code.

rocqua [3 hidden]5 mins ago
Dynamic(ally typed) languages, by virtue of not requiring strict typing, often lead to more complicated function signatures. Such functions are generally harder to reason about, because they tend to require inspecting the function body to see what is really going on.

Multithreaded code is incredibly hard to reason about, and reasoning about it becomes a lot easier if you have certain guarantees (e.g. this argument / return value always has this type, so I can always do this to it). Code written in dynamic languages will more often lack such guarantees, because of the complicated signatures. This makes multithreaded code even harder to reason about, increasing the risk it poses.

nottorp [3 hidden]5 mins ago
Is there so much legacy python multithreaded code anyway?

Considering everyone knew about the GIL, I'm thinking most people just wouldn't bother.

toxik [3 hidden]5 mins ago
There is, and what's worse, it assumes a global lock will keep things synchronized.
rowanG077 [3 hidden]5 mins ago
Does it? The GIL only ensured each interpreter instruction is atomic. But any group of instructions is not protected. This makes it very hard to rely on the GIL for synchronization unless you really know what you are doing.
immibis [3 hidden]5 mins ago
AFAIK a group of instructions is only non-protected if one of the instructions does I/O. Explicit I/O - page faults don't count.
kfrane [3 hidden]5 mins ago
If I understand that correctly, it would mean that running a function like this on two threads, f(1) and f(2), would produce a list of 1s and 2s without interleaving.

  l = []        # shared list
  N = 1000      # stays grouped at this size; see below for larger N

  def f(x):
      for _ in range(N):
          l.append(x)
I've tried it out and they start interleaving when N is set to 1000000.
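For reference, the driver I'd assume for that experiment:

    import threading

    threads = [threading.Thread(target=f, args=(x,)) for x in (1, 2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # With small N a thread can finish within one switch interval, so the
    # list comes out grouped; at large N the appends interleave.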
breadwinner [3 hidden]5 mins ago
When the language is dynamic there is less rigor. Statically checked code is more likely to be correct. When you add threads to "fast and loose" code things get really bad.
jaoane [3 hidden]5 mins ago
Unless your claim is that the same error can happen more times per minute because threading can execute more code in the same timespan, this makes no sense.
breadwinner [3 hidden]5 mins ago
Some statically checked languages and tools can catch potential data races at compile time. Example: Rust's ownership and borrowing system enforces thread safety at compile time. Statically typed functional languages like Haskell or OCaml encourage immutability, which reduces shared mutable state — a common source of concurrency bugs. Statically typed code can enforce usage of thread-safe constructs via types (e.g., Sync/Send in Rust or ConcurrentHashMap in Java).
almostgotcaught [3 hidden]5 mins ago
Do you understand what you're implying?

"Python programmers are so incompetent that Python succeeds as a language only because it lacks features they wouldn't know to use"

Even if it's circumstantially true, that doesn't mean it's the right guiding principle for the design of the language.

DHolzer [3 hidden]5 mins ago
I was thinking that too. I am really not a professional developer though.

Of course it would be nice to just write Python and have everything be 12x accelerated, but I don't see how there wouldn't be drawbacks that interfere with what makes Python so approachable.

heybrendan [3 hidden]5 mins ago
I am a Python user, but far from an expert. Occasionally, I've used 'concurrent.futures' to kick off some very simple functions at the same time.

How are 'concurrent.futures' users impacted? What will I need to change moving forward?

rednafi [3 hidden]5 mins ago
It's going to get faster, since threads won't be serialized by the GIL. If you're locking shared objects correctly, or not using them at all, then you should be good.
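Concretely, nothing about the API changes. The difference is that a CPU-bound pool like this toy sketch can finally occupy multiple cores on a free-threaded build, instead of time-slicing on one:

    from concurrent.futures import ThreadPoolExecutor

    def cpu_heavy(n):                    # pure-Python, CPU-bound
        return sum(i * i for i in range(n))

    with ThreadPoolExecutor() as pool:
        results = list(pool.map(cpu_heavy, [10**6] * 8))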
YouWhy [3 hidden]5 mins ago
Hey, I've been developing professionally with Python for 20 years, so wanted to weigh in:

Decent threading is awesome news, but it only affects a small minority of use cases. Threads are only strictly necessary when it's prohibitive to message-pass, and the Python ecosystem these days includes a playbook solution for literally any such case. Considering the multiple major pitfalls of threads (e.g., locking), they are likely to remain useful only in specific libraries/domains and not as a general-purpose tool.

Additionally, with all my love for vanilla Python, anyone who needs to squeeze the juice out of their CPU (which is actually memory bandwidth) has plenty of other tools: off-the-shelf libraries written in native code (honorable mention to PyPy, Numba, and such).

Finally, the one dramatic performance innovation in Python has been async programming - I warmly encourage everyone not familiar with it to consider taking a look.

kstrauser [3 hidden]5 mins ago
I haven’t been using it that much longer than you, and I agree with most of what you’re saying, but I’d characterize it differently.

Python has a lot of solid workarounds for avoiding threading, because until now Python threading has absolutely sucked. I had naively tried to use it to make a CPU-bound workload twice as fast and soon realized the implications of the GIL, so I threw all that code away and made it multiprocessing instead. That sucked in its own way because I had to serialize lots of large data structures to pass around, so 2x the cores got me about 1.5x the speed and a warmer server room.

I would love to have good threading support in Python. It’s not always the right solution, but there are a lot of circumstances where it’d be absolutely peachy, and today we’re faking our way around its absence with whole playbooks of alternative approaches to avoid the elephant in the room.

But yes, use async when it makes sense. It’s a thing of beauty. (Yes, Glyph, we hear the “I told you so!” You were right.)

zahlman [3 hidden]5 mins ago
> That sucked in its own way because I had to serialize lots of large data structures to pass around, so 2x the cores got me about 1.5x the speed and a warmer server room.

In many cases you can't reasonably expect better than that (https://en.wikipedia.org/wiki/Amdahl's_law). If your algorithm involves sharing "large data structures" in the first place, that's a bad sign.
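A back-of-the-envelope check, assuming the workload above was roughly two-thirds parallelizable:

    # Amdahl's law: speedup = 1 / ((1 - p) + p / n)
    p, n = 2 / 3, 2               # parallel fraction, core count
    print(1 / ((1 - p) + p / n))  # 1.5 -- matches "2x the cores, 1.5x the speed"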

0x000xca0xfe [3 hidden]5 mins ago
I know it's just an AI image... but a snake with two tails? C'mon!
vpribish [3 hidden]5 mins ago
shh. don't complain too loudly or we'll lose an important tell. python articles using snake illustrations can usually be ignored because they are not clueful.

-- python, monty

brookst [3 hidden]5 mins ago
Confusoborus
amelius [3 hidden]5 mins ago
The snake in the header image appears to have two tail-ends ...
cestith [3 hidden]5 mins ago
I guess it’s spawned a second thread in the same process.
aitchnyu [3 hidden]5 mins ago
What's currently stopping me (apart from library support) from running a single command that starts up WSGI workers and Celery workers in a single process?
gchamonlive [3 hidden]5 mins ago
Nothing, it's just that these aren't first-class features of the language. Also, someone already explained that the GIL is mostly about technical debt in the CPython interpreter, so there are reasons other than full parallelism to get rid of the GIL.
pawanjswal [3 hidden]5 mins ago
This is some serious groundwork for the next era of performance!
make3 [3 hidden]5 mins ago
I hate how these threads always devolve into insane discussions about why not using threads is better, while most people who have actually tried to speed up real-world Python code realize how amazing it would be to have proper threads with shared memory, instead of processes with all their limitations: being forced to pickle objects back and forth, fork so often just not working in cloud settings, and spawn being so slow in a lot of applications. Processes are just much heavier and less straightforward.
p0w3n3d [3 hidden]5 mins ago
Look behind! A free-threaded Python!
GarrickDrgn [3 hidden]5 mins ago
[flagged]
ilija139 [3 hidden]5 mins ago
Perhaps it's intentional, to allude to the multithreaded nature of free-threaded Python.
DonHopkins [3 hidden]5 mins ago
Maybe it's an aroused male snake. Looks like he's really into that bobbin!
rowanG077 [3 hidden]5 mins ago
I thought it was a play on multiple threads. Besides, why would it matter? This is the same as someone saying, back when digital drawing came up, that they couldn't take an article seriously because some fluffy drawing was digitally drawn instead of hand-drawn on paper.
sylware [3 hidden]5 mins ago
Got myself a shiny Python 3.13.3 (the ssl module is still unable to compile against LibreSSL) replacing 3.12.2, and it feels clearly slower.

What's wrong?

jdsleppy [3 hidden]5 mins ago
Did you compile Python yourself? If so, you may need to add optimization flags: https://devguide.python.org/getting-started/setup-building/i...
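If so, the usual incantation per that guide is roughly:

    ./configure --enable-optimizations --with-lto
    make -j"$(nproc)"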
ipsum2 [3 hidden]5 mins ago
Python 3.13 doesn't ship with free-threading compiled in by default, AFAIK; it's a separate build.
sylware [3 hidden]5 mins ago
You mean it is not default anymore?
ipsum2 [3 hidden]5 mins ago
It's never been the default.
sylware [3 hidden]5 mins ago
Huh... then why does it feel significantly slower, since I didn't touch the build conf?
EGreg [3 hidden]5 mins ago
I thought this was mostly a solved problem.

  Fibers
  Green threads
  Coroutines
  Actors
  Queues (eg GCD)
  …
Basically you need to reason about what your thing will do.

Separate concerns. Each thing is a server (microservice?) with its own backpressure.

They schedule jobs on a queue.

The jobs come with some context; I don't care if it's a closure on the heap or a fiber with a stack or whatever. JavaScript, being single-threaded with promises, wastefully unwinds the entire stack for each tick instead of saving context. With callbacks you can save context in closures. But even that is pretty fast.

Anyway, then you can just load-balance the context across machines. The easiest approach is just to have server affinity for each job. The servers just contain a cache of the data, so if the servers fail, their replacements can grab the job from an indexed database. The insertion and the lookup are O(log n) each. And jobs are deleted when done (maybe leaving behind a small log that is compacted), so there are no memory leaks.

Oh yeah, and whatever you store durably should be sharded and indexed properly, so practically unlimited amounts can be stored. Availability in a given shard is a function of replicating the data, and the economics of it is that the client should pay with credits every time they access it. You can even replicate on demand (like BitTorrent re-seeding) to handle spikes.

This is the general framework whether you use Erlang, Go, Python or PHP or whatever. It scales within a company and even across companies (as long as you sign/encrypt payloads cryptographically).

It doesn’t matter so much whether you use php-fpm with threads, or swoole, or the new kid on the block, FrankenPHP. Well, I should say I prefer the shared-nothing architecture of PHP and APC. But in Python, it is the same thing with eg Twisted vs just some SAPI.

You’re welcome.

kccqzy [3 hidden]5 mins ago
It's only a mostly solved problem for concurrent I/O-heavy workloads. It's not solved in the Python world for parallel CPU-bound workloads.
hello_computer [3 hidden]5 mins ago
Opting to enable low-level parallelism for user code in an imperative, dynamically typed scripting language seems like a regression. It's less bad for LISP because of the pure-functional nature. It's less bad for BEAM languages and Clojure due to immutability. It is less bad for C/C++/Rust because you have a stronger type system, allowing for deeper static analysis. For Python, this is "high priests of a low cult" shitting things up for corporate agendas and/or street cred.
bgwalter [3 hidden]5 mins ago
This is just an advertisement for the company. Fact is, free-threading is still up to 50% slower, the tail call interpreter isn't much faster at all, and free-threading is still flaky.

Things they won't tell you at PyCon.

lenerdenator [3 hidden]5 mins ago
I don't see how any of that's a problem given that it's not the default for how people run Python.

It's a big project that's going to take lots of time by lots of people to finish. Keep it behind opt-in, keep accepting pull requests after rigorous testing, and it's fine.

tomrod [3 hidden]5 mins ago
Quansight isn't a formal company though; it's a skunkworks/OSS research group run by Travis Oliphant.
henry700 [3 hidden]5 mins ago
I find it peculiar how, in a language so riddled with simple architectural concurrency issues, the approach is to painstakingly fix every library after fixing the runtime, instead of just using some better language. Why does the community insist on such a bad language when literally even fucking JavaScript has a saner execution model?
dash2 [3 hidden]5 mins ago
I think the opposite. Every language has flaws. What's impressive about Python is their ongoing commitment to work on theirs, even the deepest-rooted. It makes me optimistic that this is a language to stick with for the long run.
rednafi [3 hidden]5 mins ago
I agree about using other languages that have better concurrency support if concurrency is your bottleneck.

But changing the language in a brownfield project is hard. I love Go, and these days I don’t bother with Python if I know the backend needs to scale.

But Python’s ecosystem is huge, and for data work, there’s little alternative to it.

With all that said, JavaScript ain’t got shit on any language. The only good thing about it is Google’s runtime, and that has nothing to do with the language. JS doesn’t have true concurrency and is a mess of a language in general. Python is slow, riddled with concurrency problems, but at least it’s a real language created by a guy who knew what he was doing.

mylons [3 hidden]5 mins ago
I find it peculiar how tribal people are about languages. Python is fantastic. You're not winning anyone over with comments like this. Just go write your JavaScript and be happy, bud.
forrestthewoods [3 hidden]5 mins ago
> instead of just using some better language

Python the language is pretty bad. Python the ecosystem of libraries and tools has no equal, unfortunately.

Switching a language is easy. Switching a billion lines of library less so.

And the tragic part is that many of the top “python libraries” are just Python interfaces to a C library! But if you want to switch to a “better language” that fact isn’t helpful.

kubb [3 hidden]5 mins ago
I wonder if we'll get automatic LLM translation of codebases from language to language soon - this could close the library gap and diminish the language lock-in factor.