Skip to content

PostgreSQL

Postgresql is open source object-relational database system with over 30 years of active development that has earned it a strong reputation for reliability, feature robustness, and performance.

PostgreSQL features ACID (Atomicity, Consistent, Isolation and Durability) properties. It has indexes (primary/unique), updatable views, triggers, foreign keys (FKs) and even stored procedures (SPs)

Value propositions

  • Oldest open source RDBMS and advanced one, and is object-relational database
  • Large dataset
  • A lot of data types and support unstructured data too.
  • Adopted in microservice world.
  • Run in multiple process so better for horizontal scaling
  • Suited for applications with high volume of both reads and writes
  • Consider PostgreSQL for any application that might grow to enterprise scope, with complex queries and frequent write operations.
  • Includes pgbench to be used to create data and also to stress the database.

Architecture for HA

The PostgreSQL stack is comprised of a primary and replica services. WAL records are exchanged between primary and replicas.

Streaming replication is pushing changes from a primary PostgreSQL instance to its replicas.

My Projects using postgresql

  • Vaccine order mgr
  • In this project there is a copy of Quarkus - panache - postgresql quickstart with settings to access remote postgresql on IBM Cloud and kubernetes template for a secret to get URL, user and password to access the DB.
  • Autonomous Car Ride

Create Postgres databases

Run Postgres locally

Use docker command:

docker run --ulimit memlock=-1:-1 -it --rm=true --memory-swappiness=0 --name pgdb -e POSTGRES_USER=pguser -e POSTGRES_PASSWORD=passw0rd -e POSTGRES_DB=bettertodo -p 5432:5432 postgres:10.5

Or set the environment variables POSTGRESQL_USER,POSTGRESQL_HOST, POSTGRESQL_PWD in the .env script and then use the command: source .env.

Under the postgresql folder:

  • start the docker image for the database: ./startPostgresqlLocal.sh
  • Start bash in a postgres image to access psql: ./startPsql.sh LOCAL

another way is to use docker compose:

 postgresql:
    container_name: postgres
    hostname: postgres
    image: postgres
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: pgpwd
      POSTGRES_DB: ordersdb
    ports:
      - "5432:5432"
    volumes:
      - ./database:/var/lib/postgresql/data/

If you need to add table creation script add the following lines in the volumes

      # copy the sql script to create tables
      - ./sql/create_tables.sql:/docker-entrypoint-initdb.d/create_tables.sql

The official PostgreSQL Docker image https://hub.docker.com/_/postgres/ allows us to place SQL files in the /docker-entrypoint-initb.d folder, and the first time the service starts, it will import and execute those SQL files.

pgAdmin

The docker compose includes the pgAdmin UI at address http://localhost:5050. See user and password in docker compose. Register a server by using postgres as hostname, user and password.

  • We can create a new database (dvdrental for example) and load definition and data from a backup which is a tar file.

  • Use the Query Tool to execute SQL command.

We can load csv file in a table. Example from cab_rides for flink study:

  1. Be sure the csv file is mounted inside the docker container
  2. Create a table matching the columns of the csv with the expected types
CREATE TABLE cab_rides(
cab_id VARCHAR(20), 
cab_plate VARCHAR(20), 
cab_type VARCHAR(20), 
driver_name VARCHAR(100), 
ongoing_trip VARCHAR(4), 
pickup_location VARCHAR(20), 
destination VARCHAR(20), 
passenger_count INTEGER
);
  1. Enter a query like

    COPY cab_rides FROM '/tmp/data_files/cab_rides.csv' DELIMITER ',';
    # Or if the first raw has a header
    COPY cab_rides FROM '/' DELIMITER ',' CSV HEADER;
    
  2. Can also use the backup and restore functions on the database to load data and schema: use the upload file button on top left.

Some psql commands

# connect to the container
docker exec -ti pgdb bash
psql -U pguser -d dbname
# switch to another DB
\c dbname
# list existing DBs
\l
# list available tables
\dt 
# describe table
\d table_name
# list schemas
\dn
# list available funtions
\df
# views
\dv
# list users
\du
# command history
\s
# execute previoud commmand
\g
# Execute a SQL query
select * from public.tablename;

Connect to remote Postgres on IBM Cloud

./startPsql.sh IBMCLOUD

psql postgres://$POSTGRES_USER:$POSTGRES_PWD@$POSTGRES_HOST/$POSTGRES_DB

Deploy postgresql on OpenShift

Using deployment

See the script deployPostgresOnOpenShift.sh.

Using Operator

In the 'developer perspective` of OpenShift console, use the database and a postgresql without persistence, or ephemeral. Set the DB name, user and password. See this OpenShift tutorial for more info.

The environment variables are defined as secrets under the postgresql namespace: oc describe secret postgresql

Remote connect to the postgresql pod: oc rsh podid

Then start psql using the following command:

PGPASSWORD=$POSTGRESQL_PASSWORD psql -h postgresql -d $POSTGRESQL_DATABASE -U $POSTGRESQL_USER
psql (10.12)
Type "help" for help.

vaccinedb=>

To access the database with pgadmin running locally, to the remote DB, we need to do port forwarding as:

oc get pods
oc port-forward postgres-5f449ccd95-tclb6 15432:5432

In pGAdmin efine the server entry with the connection to the port 15432.

Then in the Quarkus app or in env file define properties like:

export QUARKUS_DATASOURCE_USERNAME=postgres
export QUARKUS_DATASOURCE_PASSWORD=postgres1234
export POSTGRESQL_DBNAME=postgres
export QUARKUS_DATASOURCE_JDBC_URL=jdbc:postgresql://localhost:15432/postgres

Some SQL examples

See main page for SQL examples on postgresql DB like dvdrentals, or facilities rental.

Exercises from medium articles

# under /tmp/scripts
psql -U postgres -f create_db.sql
  • insert records psql -U postgres -f inser-record-1.sql
  • Write an SQL query to report all customers who never order anything. Use left join to take all the values from the left table and the common rows from the right table. The left join was performed on the Customer table because we want all the Customers with their Orders.
select * from customers left join orders on customers.id = orders.customer_id;

id | name  | id | customer_id 
----+-------+----+-------------
  3 | Sam   |  1 |           3
  1 | Joe   |  2 |           1
  2 | Henry |    |            
  4 | Max   |    |   

select name from customers left join orders on customers.id = orders.customer_id where orders.customer_id is null; 
  • Write an SQL query to report the second highest salary from the Employee table. First ordered the salary in descending order to get the highest salary in the first and take only the unique salaries by using DISTINCT argument. The OFFSET argument is used to identify the starting point to return rows from a result set. LIMIT clause restricts how many rows are returned.
select distinct salary as secondhighestsalary from employes order by salary DESC limit 1 offset 1;

Create customers

Here is the complete SQL you can run in psql

CREATE TABLE customers (customer_id varchar(8) PRIMARY KEY, lastname varchar(40) NOT NULL, firstname varchar(40) NOT NULL, zipcode varchar(5), country varchar(40), status integer);
INSERT INTO customers (customer_id,lastname,firstname,zipcode,country,status) VALUES
('C01','Builder','Bob','95050','USA',1),
('C02','Destroyer','Bill','95050','USA',1),
('C03','Climber','Jack','95052','USA',1),
('C04','Messenger','John','95052','USA',1);

or use the command

psql postgres://$POSTGRES_USER:$POSTGRES_PWD@$POSTGRES_HOST/$POSTGRES_DB -a -f /home/dll/customer.sql

Create products

Products define fresh product with controlled temperature and humidity to control for the travel.

CREATE TABLE products (
    product_id varchar(64) NOT NULL PRIMARY KEY,
    description varchar(100),
    target_temperature REAL,
    target_humidity_level REAL,
    content_type integrer
);

INSERT INTO products(product_id,description,target_temperature,target_humidity_level,content_type) VALUES
('P01','Carrots',4,0.4,1),
('P02','Banana',6,0.6,2),
('P03','Salad',4,0.4,1),
('P04','Avocado',6,0.4,1),
('P05','Tomato',4,0.4,2);
psql postgres://$POSTGRES_USER:$POSTGRES_PWD@$POSTGRES_HOST/$POSTGRES_DB -a -f /home/dll/product.sql

Create containers

CREATE TABLE containers (
    container_id varchar(64) NOT NULL PRIMARY KEY,
    model varchar(5),
    brand varchar(20),
    status varchar(20),
    last_maintenance_date TIMESTAMP,
    capacity integer,
    latitude REAL,
    longitude REAL
);
INSERT INTO containers(container_id,model,brand,status,capacity) VALUES
('C01','20RF','Brand01',1,20),
('C02','45RW','Brand01',1,45),
('C03','20RF','Brand01',1,20),
('C04','40RH','Brand02',1,40),
('C05','40RH','Brand01',1,40);

Create orders

CREATE TABLE orders (
    order_id varchar(26) NOT NULL PRIMARY KEY,
    quantity integer, 
    creation_date TIMESTAMP,
    status integer,
    pickup_zipcode varchar(6),
    destination_zipcode varchar(6),
    container_id varchar(64) NULL,
    product_id varchar(64),
    customer_id varchar(8),
    foreign key (container_id) references containers(container_id),
    foreign key (product_id) references products(product_id),
    foreign key (customer_id) references customers(customer_id));

INSERT INTO ORDERS (order_id,quantity,status,creation_date,pickup_zipcode,destination_zipcode,container_id,product_id,customer_id) VALUES 
('O01',20,1,'2019-04-09 09:25:38.336','95050','34000','C01','P01','C01'),
('O02',40,1,'2019-04-09 09:25:38.336','95050','34000',NULL,'P03','C02'),
('O03',40,1,'2019-04-09 09:25:38.336','95050','34000','C02','P04','C03'),
('O04',40,1,'2019-04-09 09:25:38.336','95050','34000',NULL,'P04','C04')
;

Hibernate ORM

See code for order management

High Availability

For HA the principle is to try to get a second server to take over quickly in case of primary server failure, or to have several nodes to serve the same data. Multiple solutions exist to support HA, the synchronous one consider that a data-modifying transaction is not considered committed until all servers have committed the transaction.

  • shared disk (hardware) failover: avoids synchronization overhead by having only one copy of the database. standby server is able to mount and start the database as though it were recovering from a database crash
  • File system - block storage writes to the standby must be done in the same order as those on the primary.
  • Write-Ahead Log Shipping
  • Logical replication PostgreSQL logical replication constructs a stream of logical data modifications from the Write-Ahead Log. Logical replication allows replication of data changes on a per-table basis
  • Trigger-Based Primary-Standby Replication asynch replication from primary to standby servers
  • SQL-Based Replication Middleware: a program intercepts every SQL query and sends it to one or all servers
  • Synchronous Multimaster Replication each server accept write requests, modified data is transmitted from the original server to every other server before each transaction commits.
  • Asynchronous Multimaster Replication each server works independently, and periodically communicates with the other servers to identify conflicting transactions which can be resolved by DB admin.

Use python to interact with the DB

  • install psycopg2
pip install psycopg2

More readings

AWS Aurora

IBM Cloud

Create an instance of Postgresql on IBM Cloud

No need to reinvent the product documentation to create one instance. A summary of what needs to be done

  • Set user admin password:
ibmcloud cdb user-password <instance name> admin alongpassw0rd
  • Get URL to connect to the service: URL, Port number

  • Get SSL certificate for client using ibm cloud CLI

ibmcloud cdb deployment-cacert Green-DB-PostgreSQL > postgres.crt
  • Use pgAdmin4 to administer postgres: add a server and complete the ULR, port, admin as user and the password set in previous step.
docker run -p 5050:80 -d -e PGADMIN_DEFAULT_EMAIL=admin -e PGADMIN_DEFAULT_PASSWORD=alongpassw0rd dpage/pgadmin4
# http://localhost:5050/browser/