> MongoDB still leads on single-record writes, while SurrealDB is ~1.3× faster on reads
I greatly appreciate when a vendor is willing to run the test and publish unfavorable information, even if it's only in one benchmark category.
pgaddict [3 hidden]5 mins ago
I really wish they clearly documented the parameters used by each of the databases (or are we expected to dig those out of the .rs sources somehow?), and actual versions used (saying "Postgres" is ambiguous, it could be 14 or 18 - presumably it's 18, but who knows).
2-16GB of WAL is not a lot, but I have no idea how large the data set is.
itsezc [3 hidden]5 mins ago
We definitely should make that clearer in the docs (thanks for highlighting this). The Postgres image used is Postgres 17.
On the parameters, the relational tests use 5 million records per test. The exceptions are the key-value category, which uses 15 million records, and the embedded category, which uses 1 million records. The same dataset shape, workload, harness, and hardware are used across the engines being compared.
For WAL, the 2 to 16 GB range is not intended to be a limit based on the dataset size. For the published runs, the dataset is small enough that this should not be a bottleneck. The persistent runs are also full-durability runs, with Postgres using fsync and synchronous_commit.
We will update the benchmarks page so the versions, dataset sizes, and tuning details are easier to find without digging through the Rust source.
rancar2 [3 hidden]5 mins ago
Can you push the results in a benchmark repo under the GitHub account linked from the parent Readme of the project?
Full transparency would help show where these strengths come from, which at a glance looks like multi-threaded in-memory processing.
llimllib [3 hidden]5 mins ago
the surreal docs should not say "surreal is open source", it's source-available under the BSL
redwood [3 hidden]5 mins ago
I'd love to see the OSI mature its license taxonomy for 2026
limagnolia [3 hidden]5 mins ago
The OSI definition is just fine for software licenses. It doesn't need to "mature". The BSL is not compatible with the principles and concept of open source software.
redwood [3 hidden]5 mins ago
I disagree. Current posture makes it nearly impossible for European companies to build software centric open source businesses.
limagnolia [3 hidden]5 mins ago
Why do you think the OSI should be concerned with European companies and their business models? Why would European companies have a more difficult time building software centric open source businesses than companies in other countries?
esafak [3 hidden]5 mins ago
I'm pleased to see SurrealDB making progress. I look forward to the distributed comparisons. Can you also report TPC benchmarks? Also, do you have any write-ups on your correctness and regression testing approach?
arpinum [3 hidden]5 mins ago
Benchmark dataset fits in memory. Not a good test.
PunchyHamster [3 hidden]5 mins ago
So, just use PostgreSQL? 50% faster writes at the cost of 25% slower reads (which are usually the prevailing workload) don't warrant moving to a far smaller ecosystem
marcyb5st [3 hidden]5 mins ago
Not in any way affiliated with the SurrealDB folks.
I think it is not so clear cut. I mean, the multi-model nature is pretty neat. Yes, you can use pgvector on PostgreSQL, but here you also have native graph support. If you want to have both you also need to add something like Apache AGE, but arguably that is also a small ecosystem (at least IMHO, as I had never heard of it until I actually started looking for Neo4j alternatives). Also, pgvector has a hard limit on embedding size, while SurrealDB does not. For cases where you have fewer than 1M elements and retrieval performance matters, SurrealDB already has an advantage.
In my personal opinion it is a great overall product. Probably not the best at anything, but close enough without having to fiddle with PostgreSQL extensions or add another piece of machinery to support graph workloads.
The only thing I don't like is that they didn't use either pure SQL or Cypher for the query language. They rolled their own blend, meaning that you will likely need more work to move into the ecosystem and you can't fully use the muscle memory of users who have worked with other DBs before.
stevefan1999 [3 hidden]5 mins ago
Existing projects that already use Postgres should stay on Postgres.
It is worth a try for startups, if you don't mind. Try to vibe code around it and give the data model a fresh look. I have a prototype project that takes a tree-sitter AST and converts it to JSON; since SurrealDB accepts JSON as native input, I get graph lookups on the control flow for free and could easily do ancestry analysis and find which functions potentially call into a given segment. All of it is done with SurrealDB nested graph queries and the performance is alright, but it is abysmal in Postgres JSONB since JSONB does not linearize the JSON data structure.
ps: I'm building a K8S operator for deploying SurrealDB with TiKV operator integration too.
hilariously [3 hidden]5 mins ago
Unless you want to build a startup specifically around that new hot database to do something very specific that's hard with other systems, do not build your startup around a hot new database.
The innovation points you spend on this should generally be spent in other areas, not seeing if someone's unproven db is your breadwinner.
stevefan1999 [3 hidden]5 mins ago
Well, I'm still in a very early phase, but I am indeed combining Restate and SurrealDB for a project I'm building: I persist temporary state in Restate and permanent state in SurrealDB. Since both use JSON as their lingua franca, it is pretty easy to serialize data between them, much more so than with MongoDB and BSON, which many people would naturally think of as the better alternative to SurrealDB.
Oh, and that's the reason the SurrealDB operator exists in the first place: I need the full K8S lifecycle to maintain the database state, such as backups, which is not really doable with Helm.
itsezc [3 hidden]5 mins ago
Your reasoning is very solid, and something I'd also consider before picking a DB.
No one should pick us because we're the new hot thing (at least I'd hope not). But at SurrealDB, we've got real enterprises in production at scale. For a lot of startups building today, having LLM/vector features, graph, auth, and the database in one place can really help you ship faster without stitching a bunch of tools together.
hilariously [3 hidden]5 mins ago
A good point, though I think most people take the stitch-it-together approach because they either have existing processes or proven tech (so startups make sense). However, picking an all-in-one generally means that the base cases (like getting started) are awesome, but the edge cases are a razor's edge (no familiarity with your product).
As a former DBA I got to see the general purpose databases bolt on a lot of shitty addons, and a lot of upstarts build just enough to get the sale done (or targeting bigger customers than I) - I hope y'all can get enough polish and reliability done and grow into something I want to use in five years :)
bluGill [3 hidden]5 mins ago
What is correct depends on your workload. There is never a case for comparing the performance of Postgres vs Redis. They are intended for very different uses and so should never substitute for each other; a feature analysis will reveal which you really need.
Though to be honest most people won't scale enough for DB performance to matter in the first place. Most people don't even need a database; your language has built-in containers that will do everything you need.
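To make the "built-in containers" point concrete, here is a minimal sketch using only the Rust standard library. `User`, `UserStore`, and the method names are hypothetical and chosen for illustration; nothing here comes from the benchmark or from SurrealDB.

```rust
use std::collections::HashMap;

/// Hypothetical record type, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct User {
    pub name: String,
    pub age: u32,
}

/// Hypothetical in-memory "table" keyed by a numeric id.
#[derive(Default)]
pub struct UserStore {
    users: HashMap<u64, User>,
}

impl UserStore {
    pub fn new() -> Self {
        Self::default()
    }

    /// Insert or overwrite a record; returns the previous value, if any.
    pub fn upsert(&mut self, id: u64, user: User) -> Option<User> {
        self.users.insert(id, user)
    }

    /// Point lookup by key.
    pub fn get(&self, id: u64) -> Option<&User> {
        self.users.get(&id)
    }

    /// A "query" is just a linear scan with a filter, sorted by name
    /// so the result order is deterministic.
    pub fn older_than(&self, min_age: u32) -> Vec<&User> {
        let mut hits: Vec<&User> = self
            .users
            .values()
            .filter(|u| u.age > min_age)
            .collect();
        hits.sort_by(|a, b| a.name.cmp(&b.name));
        hits
    }
}
```

What this leaves out is the usual reason to reach for a database: there is no durability, no cross-process sharing, and no index beyond the primary key.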
itsezc [3 hidden]5 mins ago
Postgres is definitely one of the strongest databases out there, and we are not trying to hand-wave that away with benchmarks. The point is more that SurrealDB v3 is getting much closer on raw performance while offering a multi-model database, which feels especially relevant today.
On the ecosystem side, we have also grown a lot over the last few years across the community, integrations, cloud offering, and customers. Still work to do, but we are not as far off as people might assume.
redwood [3 hidden]5 mins ago
Benchmarks are so misleading, not only because all the tunable parameters make these hardly an apples-to-apples comparison, but more fundamentally because the way you would optimize for different engines involves different trade-offs, different data models with different indexes...
For example, I see they do this for Postgres:
`let max_wal_gb = (shared_buffers_gb).clamp(2, 16);`
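For readers who don't write Rust, here is a small sketch of what that line computes. The function wrapper and the `u64` type are assumptions for illustration; only the `clamp(2, 16)` expression comes from the quoted source, and which Postgres setting the value ends up in is not shown there.

```rust
/// Hypothetical wrapper around the quoted expression: derive a WAL size
/// (in GB) from shared_buffers (in GB), bounded to the range 2..=16.
pub fn max_wal_gb(shared_buffers_gb: u64) -> u64 {
    // Ord::clamp returns the lower bound when the value is below it,
    // the upper bound when the value is above it, and the value otherwise.
    shared_buffers_gb.clamp(2, 16)
}
```

So 1 GB of shared buffers gives 2, 8 GB gives 8, and 64 GB gives 16: the WAL size tracks shared_buffers but never leaves the 2 to 16 GB window.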