Data Load to Gold layer (PostgreSQL to PostgreSQL)
The data load from the Silver layer to the Gold layer is managed through parameterized ETL jobs. These jobs are designed to follow a controlled and auditable execution process.
To facilitate this, we maintain three core tables:
etl_parameter_lookup - parameter table, for sequential job execution
master_etl_audit_tbl – As audit master
main_etl_audit_tbl- As the audit table tracks the load of each table
etl_parameter_lookup
This parameter table maintains a list of Gold Layer tables scheduled for loading. The ETL job reads from this table and executes the loads sequentially, following the specified order. Separate jobs are configured for dimension tables, fact tables, and aggregate tables, ensuring structured and organized data processing.
Column Name
Data Type
Nullable
id
serial4
NO
gold_layer_table_name
varchar(100)
NO
gold_layer_table_schema
varchar(50)
NO
gold_layer_table_type
varchar(50)
NO
entity_description
varchar(255)
YES
load_sequence
int4
NO
load_type
varchar(50)
NO
extract_key
varchar(225)
YES
load_key
varchar(225)
YES
dependency_tables
text
YES
full_load_query
text
NO
incremental_load_query
text
YES
parameter_format
varchar(50)
YES
db_connection_type
varchar(100)
YES
active
bool
YES
created_at
timestamp
YES
last_updated
timestamp
YES

main_etl_audit_tbl
This table captures detailed audit information for each table load during a given ETL run. It logs metadata such as the start time, end time, row counts, and load status (success or failure).The load mechanism is incremental and fault-tolerant:
Each table’s next load is determined based on the etl_end_time of the most recent successful run.
If a table load fails in a previous cycle, the subsequent job will automatically resume or retry from the last valid etl_end_time.
This ensures no data loss and enables efficient error recovery in a multi-table pipeline.
Column Name
Data Type
Nullable
id
serial4
NO
process_id
int4
YES
data_source
varchar(50)
YES
source_table_name
_text
YES
destination_table_name
varchar(100)
YES
operation
varchar(50)
YES
time_from
timestamp
YES
time_to
timestamp
YES
etl_start_time
timestamp
YES
etl_end_time
timestamp
YES
input_row
int4
YES
output_row
int4
YES
operation_result
varchar(50)
YES
errorcode
varchar(50)
YES
result
text
YES
master_etl_audit_tbl
This table serves as the master audit log for each ETL job execution cycle. For example, if the ETL process runs hourly, a single entry is recorded in this table for each hourly load, summarizing the overall status of that run.
Column Name
Data Type
Nullable
process_id
serial4
NO
schedule_name
varchar(100)
NO
schedule_timestamp
timestamp
NO
status
varchar(20)
NO
start_time
timestamp
NO
end_time
timestamp
YES
tables_processed
int4
NO
tables_failed
int4
NO
tables_successful
int4
NO
error_message
text
YES

The gold layer was designed with a focus on optimizing reporting performance. Entities in the gold layer were organized into dimension tables, fact tables, and aggregate tables to support efficient analytical queries.
Date and DateTime Dimension Tables:
In addition to the entities sourced from the customer system, a date dimension table and a datetime dimension table were introduced.
These tables are designed to support custom formatting of date and time fields, enable date-time-based analysis, facilitate joining fact tables to calendar attributes, and simplify reporting. The metadata for these tables is provided below. They can be customized to store dates in any format required across the solution.
These tables help eliminate the need for complex date formatting functions in reporting queries. For example, instead of using expressions like:
which generates a formatted output like "Week 1-2022" from the date 2022-01-06
which generates a "2022-01" format for monthly aggregations, the required format can be precomputed and stored in the date dimension table.
This approach improves query readability and performance while ensuring consistency across reports.
Table: date_dim
Field Name
Data Type
Field Description
date_id
int
e.g., 20250506 (YYYYMMDD)
full_date
date
Calendar date
day_of_week
int
ISO day of week (1 = Monday, 7 = Sunday)
day_name
varchar(10)
Name of the day (e.g., "Monday")
day_of_month
int
Day of the month (1–31)
day_of_year
int
Day number in the year (1–365/366)
week_of_year
int
ISO week number (1–53)
month_number
int
Month number (1–12)
month_name
varchar(10)
Name of the month (e.g., "January")
quarter_number
int
Quarter number (1–4)
year
int
Calendar year
fiscal_period
varchar(20)
Fiscal period (e.g., "FY2025-Q2")
weekly_label
varchar(20)
Label for weekly granularity (e.g., "Week 24-2025")
monthly_label
varchar(20)
Label for monthly granularity (e.g., "June 2025")
quarterly_label
varchar(20)
Label for quarterly granularity (e.g., "Q2 2025")
record_last_updated
timestamp
Datetime when the entry was made
Table: datetime_dim
Field Name
Data Type
Field Description
datetime_id
int
e.g., 2025050613 (YYYYMMDDHH)
datetime_val
datetime
Actual timestamp at hour-level granularity
date_val
date
date portion of the datetime
hour_val
int
0 to 23
day_of_week
varchar(20)
Monday, Tuesday, etc.
day_of_week_num
int
1 (Monday) to 7 (Sunday)
is_weekend
boolean
TRUE if Saturday or Sunday
is_holiday
boolean
TRUE if it’s a holiday (if you maintain a calendar)
week_number
int
ISO week number
month
int
Month number (1–12)
month_name
varchar(20)
January, February, etc.
quarter
int
1 to 4
year
int
Year portion
weekly_label
varchar(20)
Label for weekly granularity (e.g., "Week 24-2025")
monthly_label
varchar(20)
Label for monthly granularity (e.g., "June 2025")
quarterly_label
varchar(20)
Label for quarterly granularity (e.g., "Q2 2025")
Last updated