The world of PostgreSQL wire compatibility
A wire protocol is the format for interactions between a database server and its clients. It encompasses authentication, sending queries, receiving responses, and so on. It is a description of the exact bytes sent and received by servers and clients. It does NOT encompass the actual query language itself, let alone database semantics.
Every server-client database has a wire protocol. (SQLite is an embedded database, not a server-client database, so it does not have a wire protocol.) Wire protocols are important to both database developers and database driver developers.
Proprietary databases like Oracle and IBM Db2 find value in developing their own drivers. They don't tend to publish their wire protocol. Among other reasons, it's just not necessary since they develop their own drivers. Proprietary vendors tend to prefer a monolithic driver like ODBC/JDBC rather than native drivers since it's easier to develop one driver for all languages than one for each language. If someone wants to develop a native driver for a database without a published wire protocol they'll need to reverse-engineer the wire protocol. An example of this is the go-ora Oracle client for Go.
Worth noting, among proprietary vendors, SQL Server actually does the opposite. Their wire protocol has been publicly documented for over a decade. And more native drivers exist for SQL Server than for other proprietary vendors with their own wire protocol.
In contrast, the wire protocols for open-source databases tend to be documented since it's just another part of an open-source system. Due to their popularity and age, PostgreSQL and MySQL wire protocols are particularly well documented. Native drivers exist for both databases in every major language.
For this reason, a large number of newer database vendors just reuse the PostgreSQL or MySQL wire protocol. This means that every existing driver (and SQL GUI like DataStation) will work with them without any modification to driver code. This doesn't mean that any PostgreSQL or MySQL query will work with them since, as mentioned previously, query language and database semantics are independent of the wire protocol.
In Practice
Let's write a simple program that can handle query text and read data from disk. If it implements the PostgreSQL wire protocol protocol, it would work with any existing PostgreSQL client. That would be neat!
Someone
already wrote
a small library implementing the PostgreSQL server protocol in
Python2. I ported
it to Python3 for the sake of this tutorial. So we can just grab
that Python3 fork's code and edit the query
method with
some code that handles the query text and reads data from disk, and
we'd be done!
To reiterate, the wire protocol doesn't specify anything about the
query language. So let's support a single made-up
query: SELECT FROM x
where x
can be a
valid table name:
$ diff original.py pgserver.py
0a1
> import json
189,190c190,202
< fields = [IntField('a'), IntField('b')]
< rows = [[1, 2], [3, 4], [5, 6]]
---
> sql = sql.decode().upper()
> prefix = 'SELECT FROM '
> if not sql.startswith(prefix):
> raise Exception('Bad query.')
>
> table = sql.strip()[len(prefix):-2].strip().lower()
> try:
> with open('data/'+table+'.json') as f:
> d = json.load(f)
> fields = [IntField(k) for k in d[0].keys()]
> rows = [[row[field.name] for field in fields] for row in d]
> except Exception as e:
> raise Exception('Invalid table.') from e
Now let's create a data
directory where
all the table JSON files will be stored. And let's create a few
tables:
$ cat data/users.json
[
{"id": 0, "age": 100},
{"id": 1, "age": 45},
{"id": 2, "age": 52}
]
$ cat data/users_organization.json
[
{"user_id": 0, "organization_id": 0},
{"user_id": 1, "organization_id": 1},
{"user_id": 2, "organization_id": 0}
]
Run it:
$ python pgserver.py
server running, try: $ psql -h localhost -p 55432
And in a separate terminal, open psql as suggested (enter any text when prompted for a password, this code doesn't do anything with it), and run some queries:
$ psql -h localhost -p 55432
Password for user phil:
psql (13.4, server 0.0.0)
Type "help" for help.
phil=> select from users;
id | age
----+-----
0 | 100
1 | 45
2 | 52
(3 rows)
phil=> select from users_organization;
user_id | organization_id
---------+-----------------
0 | 0
1 | 1
2 | 0
(3 rows)
That is really neat! Incidentally a similar technique can be used to help in testing a la pgmock.
Vendors
But let's get back to the real world, dig into some of the databases that implement the PostgreSQL wire protocol, and see where else they diverge.
And although this table mentions compatibility, don't take that to be a judgement call. Just because a database implements the PostgreSQL wire protocol does not mean it intends to be a drop-in replacement for PostgreSQL.