Postgres: Difference between revisions
(Created page with "=== NOSQL via JSON === GORGEOUS POSTGRES JSON QUERY! inject blobs of JSON, then you can crack open the json inside a cell! CREATE TABLE stocks ( id integer NOT NULL...") |
(→Config) |
||
(65 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
=== | === Usage === | ||
==== PGAdmin ==== | |||
PGAdmin is running on bitpost and can connect to any db on the LAN. | |||
==== Make a copy of a table ==== | |||
CREATE TABLE new_table AS SELECT * FROM existing_table (WHERE ...); | |||
==== Readable UNIX epoch ==== | |||
select to_timestamp((json->>'timestamp')::int) at time zone 'utc', * from stockquotes where (json->>'quote')::float < 0.0; | |||
Yes it's that ridiculous. Yes it's [https://dba.stackexchange.com/a/80114 really even worse than that]. | |||
==== Table sizes ==== | |||
More ridiculous postgres overengineered obfuscated NONSENSE: | |||
SELECT nspname || '.' || relname AS "relation", | |||
pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size" | |||
FROM pg_class C | |||
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) | |||
WHERE nspname NOT IN ('pg_catalog', 'information_schema') | |||
AND C.relkind <> 'i' | |||
AND nspname !~ '^pg_toast' | |||
ORDER BY pg_total_relation_size(C.oid) DESC; | |||
Optionally add LIMIT 5 or whatever. | |||
=== Indexing === | |||
My discoveries: | |||
* use single arrow not double on root of any ::bigint or other :: conversion! to avoid double-convert | |||
* use :: on any where/order/join field! | |||
* Always carefully precisely match index fields and query fields. Single-arrow that shit! [https://dba.stackexchange.com/questions/297734/postgresql-multi-field-index-on-jsonb-data-does-full-tablescan-when-field-is-not Me being dumb in public] | |||
Things are working pretty well. | |||
* NEXT: EXPLORE INDEXING further! | |||
https://blog.codeship.com/unleash-the-power-of-storing-json-in-postgres/ | |||
=== JSONB Usage === | |||
Read [https://www.postgresql.org/docs/current/functions-json.html this] carefully! | |||
* Add columns of type 'jsonb' (JSON binary). | |||
* Inject cells of any valid JSON format (big data!). | |||
* Then using special syntax, you can crack open the json inside a cell to query, create, update, delete: | |||
CREATE TABLE stocks ( | CREATE TABLE stocks ( | ||
id integer NOT NULL, | id integer NOT NULL, | ||
data jsonb | data jsonb | ||
); | ); | ||
INSERT INTO stocks VALUES ( | INSERT INTO stocks VALUES (1, '{"symbol": "GOOG", "tags": ["hot", "trending"], "owned": true}, "quantity": 12'); | ||
# returns count of data that HAS the field ingredients | # returns count of data that HAS the field ingredients | ||
SELECT count(*) FROM stocks WHERE data ? ' | SELECT count(*) FROM stocks WHERE data ? 'tags'; | ||
# ->> will return the property as integer or text (the parsed form of the attribute). | # ->> will return the property as integer or text (the parsed form of the attribute). | ||
SELECT * FROM stocks WHERE data->>'owned' = 'true'; # if data.owned == true | SELECT * FROM stocks WHERE data->>'owned' = 'true'; # if data.owned == true | ||
# you may need to cooerce the type, sigh... | |||
SELECT * FROM stocks WHERE (data->>'quantity')::int > 5; | |||
# -> will return the attribute as a JSON object | # -> will return the attribute as a JSON object | ||
Line 23: | Line 66: | ||
(2 rows) | (2 rows) | ||
==== Field(s) update ==== | |||
So easy with v12 syntax: | |||
UPDATE mytable SET json = json || '{"order_id": ""}'; | |||
update prefint set json = json || '{"value":1}' where json->>'name' = 'LoggingLevel'; | |||
==== Field(s) delete ==== | |||
UPDATE mytable SET json = json - 'custom_order_id'; | |||
==== Upserts ==== | |||
Upsert example, [https://stackoverflow.com/questions/42567972/postgres-upsert-json-props from SO]: | |||
INSERT INTO accounts(id, json) VALUES ('1', '{##somejson##}'); | |||
INSERT INTO accounts(id, json) | |||
VALUES ('1', '{##somejson##}') | |||
ON CONFLICT (id) DO | |||
UPDATE SET json = '{##somejson##}' | |||
RETURNING *; | |||
==== Fix null values ==== | |||
update autotradeparametersets set json = json || '{"analysis_percent_change": 0.0}' where (json->'analysis_percent_change')::text = 'null'; | |||
=== Install === | |||
* You can either install the system postgres server: | |||
mh-install-postgres | |||
Or install the client so you can develop code that connects to a remote server: | |||
mh-install-postgres-client | |||
* The config files are here (created and stowed by the script). You should read and update them to adjust the performance settings to match the hardware. | |||
/var/lib/postgresql/13/main/[postgresql.conf,pg_hba.conf] | |||
# symlinked to... | |||
/home/m/development/config/ubuntu/<hostname>/var/lib/postgresql/13/main/[postgresql.conf,pg_hba.conf] | |||
Always change the postgres user's default password: | |||
sudo su postgres | |||
psql | |||
\password postgres | |||
(Enter new password... add it to private) | |||
\q | |||
==== Start server ==== | |||
Server should be a standard service that starts on boot. | |||
sudo service postgresql [status|etc..] | |||
Use mh-start-postgres to start postgres (if the server is not already running): | |||
mh-start-postgres | |||
Once a new server is started, verify you can connect to it via our LAN PGAdmin. | |||
==== ABT postgres tools ==== | |||
* Set up databases - Note that [at setup] will take care of this for you... | |||
mh-add-postgres-db at_live | |||
mh-add-postgres-db at_test | |||
* To open a shell: | |||
at dpg | |||
at ppg | |||
=== Config === | |||
==== Configure remote access ==== | |||
postgresql.conf: | |||
# MDM We must also LISTEN if we want to allow any remote connections. | |||
# Also see pg_hba.conf for remote permission details. | |||
listen_addresses = '*' | |||
pg_hba.conf: | |||
# MDM my lan connection | |||
host all all abtdev1.bitpost.lan md5 | |||
host all all cast.bitpost.lan md5 | |||
==== Add a user + db ==== | |||
Use mh-add-postgres-db to add a user (aka role) and matching database. The user only has access to that db (best practice). This will prompt for pw. | |||
mh-add-postgres-db [name] | |||
# Add name+pw to private | |||
==== Backup ==== | |||
Look at [at import] and [at export], they do the work. | |||
=== Troubleshooting === | |||
==== Finding bad huge data ==== | |||
* I had bad data (huge string written to a json value) and found it by looking at the json length not the actual json (which was huge). | |||
select id, length(json::text) from cycles; | |||
Then I could clean up the rogue field with: | |||
update cycles set json = json || '{"order_id": ""}'; | |||
==== Updating old passwords to use latest cipher ==== | |||
Cipher recommendation changed from md5 to scram-sha-256. Add this line to upgrade from md5 to scram: | |||
sudo emacs -nw /etc/postgresql/12/main/postgresql.conf | |||
# MDM upgrade from md5 to scram | |||
password_encryption = scram-sha-256 | |||
After upgrading and restarting postgres, old md5 passwords will start to fail. To update an old password, just retype them: | |||
sudo su postgres | |||
psql | |||
\password ##username## | |||
==== Old notes that may help ==== | |||
Postgres is a NASTY BEAST with SHITTY DEFAULT CONFIGS... | |||
* set up a linux user named "postgres" | |||
# On gentoo: | |||
sudo useradd postgres | |||
# On ubuntu: | |||
sudo adduser postgres | |||
# Put the password in private | |||
* set up a data folder where postgres will put all its data and its log | |||
# Standardize on [/home/m/development/postgres], but always consider symlinking to a large/redundant location. | |||
# Here, we are choosing to use a larger RAID drive. | |||
# WARNING NTFS will not let you chown a folder (you'd have to mount the whole thing as the target owner). | |||
# So on viper, I had to use the tiny primary drive directly, no symlink to ~m/bigdata. Fuck off. | |||
mkdir /spiceflow/softraid/development/postgres | |||
cd ~/development && ln -s /spiceflow/softraid/development/postgres postgres | |||
sudo chown -R postgres:postgres postgres /spiceflow/softraid/development/postgres | |||
sudo chmod g+x /spiceflow/softraid/development # and any other parent that is missing this bit, so postgres user can cd to its folder | |||
# We must init the data folder as a postgres "database cluster" | |||
su postgres -c '/usr/local/bin/initdb -D /home/m/development/postgres/data' | |||
'''Backup''' | |||
To dump a database called mydb into a SQL-script file: | |||
$ pg_dump mydb > db.sql | |||
To reload such a script into a (freshly created) database named newdb: | |||
$ psql -d newdb -f db.sql | |||
See [https://www.postgresql.org/docs/11/app-pgdump.html here] for more. | |||
''' Disabling multiple installs ''' | |||
There may be one or more system postgres'es already installed. That's fine, just make sure they don't start. | |||
sudo emacs -nw /etc/postgresql/(tab-tab) | |||
sudo emacs -nw /etc/postgresql/10/main/start.conf # change auto > manual | |||
sudo emacs -nw /etc/postgresql/9.6/main/start.conf # change auto > manual | |||
sudo service postgres(tab-tab to determine if there are more than one) stop | |||
On Ubuntu, watch out for multiple installs causing the postgres port to change from the default. If you have two previous postgres versions, version 12 will use port 5432+1+1 = 5434. This is very confusing. You can see what postgres versions are running on what port in Ubuntu with this: | |||
pg_lsclusters | |||
''' Configure ''' | |||
You need to create two config files, postgresql.conf and pg_hba.conf. | |||
There is a great configuration tool [https://pgtune.leopard.in.ua/#/ here] to generate postgresql.conf. Example result: | |||
<pre> | |||
# DB Version: 10 | |||
# OS Type: linux | |||
# DB Type: web | |||
# Total Memory (RAM): 16 GB | |||
# CPUs num: 4 | |||
# Connections num: 1000 | |||
# Data Storage: hdd | |||
max_connections = 1000 | |||
shared_buffers = 4GB | |||
effective_cache_size = 12GB | |||
maintenance_work_mem = 1GB | |||
checkpoint_completion_target = 0.7 | |||
wal_buffers = 16MB | |||
default_statistics_target = 100 | |||
random_page_cost = 4 | |||
effective_io_concurrency = 2 | |||
work_mem = 2097kB | |||
min_wal_size = 1GB | |||
max_wal_size = 2GB | |||
max_worker_processes = 4 | |||
max_parallel_workers_per_gather = 2 | |||
max_parallel_workers = 4 | |||
</pre> | |||
You must provide the pg_hba.conf file to specify access control. Copy it from other installs. Make sure to add any host-to-host access you need. | |||
Here is the location to put the config files when installed from source with [mh-install-postgres]: | |||
/home/m/development/postgres/data/[postgresql.conf,pg_hba.conf] |
Latest revision as of 12:14, 12 September 2023
Usage
PGAdmin
PGAdmin is running on bitpost and can connect to any db on the LAN.
Make a copy of a table
CREATE TABLE new_table AS SELECT * FROM existing_table (WHERE ...);
Readable UNIX epoch
select to_timestamp((json->>'timestamp')::int) at time zone 'utc', * from stockquotes where (json->>'quote')::float < 0.0;
Yes it's that ridiculous. Yes it's really even worse than that.
Table sizes
More ridiculous postgres overengineered obfuscated NONSENSE:
SELECT nspname || '.' || relname AS "relation", pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size" FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) WHERE nspname NOT IN ('pg_catalog', 'information_schema') AND C.relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY pg_total_relation_size(C.oid) DESC;
Optionally add LIMIT 5 or whatever.
Indexing
My discoveries:
- use single arrow not double on root of any ::bigint or other :: conversion! to avoid double-convert
- use :: on any where/order/join field!
- Always carefully precisely match index fields and query fields. Single-arrow that shit! Me being dumb in public
Things are working pretty well.
- NEXT: EXPLORE INDEXING further!
https://blog.codeship.com/unleash-the-power-of-storing-json-in-postgres/
JSONB Usage
Read this carefully!
- Add columns of type 'jsonb' (JSON binary).
- Inject cells of any valid JSON format (big data!).
- Then using special syntax, you can crack open the json inside a cell to query, create, update, delete:
CREATE TABLE stocks ( id integer NOT NULL, data jsonb ); INSERT INTO stocks VALUES (1, '{"symbol": "GOOG", "tags": ["hot", "trending"], "owned": true}, "quantity": 12');
# returns count of data that HAS the field ingredients SELECT count(*) FROM stocks WHERE data ? 'tags';
# ->> will return the property as integer or text (the parsed form of the attribute). SELECT * FROM stocks WHERE data->>'owned' = 'true'; # if data.owned == true
# you may need to cooerce the type, sigh... SELECT * FROM stocks WHERE (data->>'quantity')::int > 5;
# -> will return the attribute as a JSON object SELECT jsonb_array_elements_text(data->'tags') as tag FROM stocks WHERE id = 1; # returns the actual tag values, eg: tag -------------- Improvements Office (2 rows)
Field(s) update
So easy with v12 syntax:
UPDATE mytable SET json = json || '{"order_id": ""}'; update prefint set json = json || '{"value":1}' where json->>'name' = 'LoggingLevel';
Field(s) delete
UPDATE mytable SET json = json - 'custom_order_id';
Upserts
Upsert example, from SO:
INSERT INTO accounts(id, json) VALUES ('1', '{##somejson##}'); INSERT INTO accounts(id, json) VALUES ('1', '{##somejson##}') ON CONFLICT (id) DO UPDATE SET json = '{##somejson##}' RETURNING *;
Fix null values
update autotradeparametersets set json = json || '{"analysis_percent_change": 0.0}' where (json->'analysis_percent_change')::text = 'null';
Install
- You can either install the system postgres server:
mh-install-postgres
Or install the client so you can develop code that connects to a remote server:
mh-install-postgres-client
- The config files are here (created and stowed by the script). You should read and update them to adjust the performance settings to match the hardware.
/var/lib/postgresql/13/main/[postgresql.conf,pg_hba.conf] # symlinked to... /home/m/development/config/ubuntu/<hostname>/var/lib/postgresql/13/main/[postgresql.conf,pg_hba.conf]
Always change the postgres user's default password:
sudo su postgres psql \password postgres (Enter new password... add it to private) \q
Start server
Server should be a standard service that starts on boot.
sudo service postgresql [status|etc..]
Use mh-start-postgres to start postgres (if the server is not already running):
mh-start-postgres
Once a new server is started, verify you can connect to it via our LAN PGAdmin.
ABT postgres tools
- Set up databases - Note that [at setup] will take care of this for you...
mh-add-postgres-db at_live mh-add-postgres-db at_test
- To open a shell:
at dpg at ppg
Config
Configure remote access
postgresql.conf:
# MDM We must also LISTEN if we want to allow any remote connections. # Also see pg_hba.conf for remote permission details. listen_addresses = '*'
pg_hba.conf:
# MDM my lan connection host all all abtdev1.bitpost.lan md5 host all all cast.bitpost.lan md5
Add a user + db
Use mh-add-postgres-db to add a user (aka role) and matching database. The user only has access to that db (best practice). This will prompt for pw.
mh-add-postgres-db [name] # Add name+pw to private
Backup
Look at [at import] and [at export], they do the work.
Troubleshooting
Finding bad huge data
- I had bad data (huge string written to a json value) and found it by looking at the json length not the actual json (which was huge).
select id, length(json::text) from cycles;
Then I could clean up the rogue field with:
update cycles set json = json || '{"order_id": ""}';
Updating old passwords to use latest cipher
Cipher recommendation changed from md5 to scram-sha-256. Add this line to upgrade from md5 to scram:
sudo emacs -nw /etc/postgresql/12/main/postgresql.conf # MDM upgrade from md5 to scram password_encryption = scram-sha-256
After upgrading and restarting postgres, old md5 passwords will start to fail. To update an old password, just retype them:
sudo su postgres psql \password ##username##
Old notes that may help
Postgres is a NASTY BEAST with SHITTY DEFAULT CONFIGS...
- set up a linux user named "postgres"
# On gentoo: sudo useradd postgres # On ubuntu: sudo adduser postgres # Put the password in private
- set up a data folder where postgres will put all its data and its log
# Standardize on [/home/m/development/postgres], but always consider symlinking to a large/redundant location. # Here, we are choosing to use a larger RAID drive. # WARNING NTFS will not let you chown a folder (you'd have to mount the whole thing as the target owner). # So on viper, I had to use the tiny primary drive directly, no symlink to ~m/bigdata. Fuck off. mkdir /spiceflow/softraid/development/postgres cd ~/development && ln -s /spiceflow/softraid/development/postgres postgres sudo chown -R postgres:postgres postgres /spiceflow/softraid/development/postgres sudo chmod g+x /spiceflow/softraid/development # and any other parent that is missing this bit, so postgres user can cd to its folder # We must init the data folder as a postgres "database cluster" su postgres -c '/usr/local/bin/initdb -D /home/m/development/postgres/data'
Backup
To dump a database called mydb into a SQL-script file:
$ pg_dump mydb > db.sql
To reload such a script into a (freshly created) database named newdb:
$ psql -d newdb -f db.sql
See here for more.
Disabling multiple installs
There may be one or more system postgres'es already installed. That's fine, just make sure they don't start.
sudo emacs -nw /etc/postgresql/(tab-tab) sudo emacs -nw /etc/postgresql/10/main/start.conf # change auto > manual sudo emacs -nw /etc/postgresql/9.6/main/start.conf # change auto > manual sudo service postgres(tab-tab to determine if there are more than one) stop
On Ubuntu, watch out for multiple installs causing the postgres port to change from the default. If you have two previous postgres versions, version 12 will use port 5432+1+1 = 5434. This is very confusing. You can see what postgres versions are running on what port in Ubuntu with this:
pg_lsclusters
Configure
You need to create two config files, postgresql.conf and pg_hba.conf.
There is a great configuration tool here to generate postgresql.conf. Example result:
# DB Version: 10 # OS Type: linux # DB Type: web # Total Memory (RAM): 16 GB # CPUs num: 4 # Connections num: 1000 # Data Storage: hdd max_connections = 1000 shared_buffers = 4GB effective_cache_size = 12GB maintenance_work_mem = 1GB checkpoint_completion_target = 0.7 wal_buffers = 16MB default_statistics_target = 100 random_page_cost = 4 effective_io_concurrency = 2 work_mem = 2097kB min_wal_size = 1GB max_wal_size = 2GB max_worker_processes = 4 max_parallel_workers_per_gather = 2 max_parallel_workers = 4
You must provide the pg_hba.conf file to specify access control. Copy it from other installs. Make sure to add any host-to-host access you need.
Here is the location to put the config files when installed from source with [mh-install-postgres]:
/home/m/development/postgres/data/[postgresql.conf,pg_hba.conf]