Tech Blogger: May 2021

Home » May 2021

Monday, 31 May 2021

Snowflake NVL vs NVL2 function | SQL

DIFFERENCE BETWEEN NVL AND NVL2 FUNCTION

This post explains the difference between NVL and NVL2 functions in Snowflake.

The NVL and NVL2 functions are also compatible with other databases like Teradata, Mysql, Postgresql, Oracle, PLSQL, Amazon Redshift.

NVL FUNCTION

The NVL function will return the default value if the in- parameter is NULL.

Syntax: NVL ( <in-param>,<out-param>)

If the in-param is NULL then it will return the out-param value.
If the in-param is NOT NULL then it will return the in-param value itself.

Difference between NVL and NVL2 function

NVL2 FUNCTION

The NVL2 function will return the first out-parameter if the in-parameter is NOT NULL and return the second parameter if the in-parameter is NULL.

Syntax: NVL2( <in-param>,<out-param-1>,<out-param-2>)

If the in-param is NULL then it will return the out-param-1 value .
If the in-param is NOT NULL then it will return the in-param-2 value itself.

Difference between NVL and NVL2 function

NVL VS NVL2 FUNCTION

Features	NVL	NVL2
Syntax	NVL ( <in-param>,<out-param>)	NVL2( <in-param> ,<out-param-1> ,<out-param-2>)
Example	[1] NVL(NULL,'It is NULL') [2] NVL('99','It is NULL')	[1] NVL2('99','It is Not NULL' ,'It is NULL')
Number of IN parameter	1	1
Number of OUT parameter	1	2
If IN parameter is NULL	Returns <out-param> i.e [1] 'It is NULL'	Returns <out-param-1> i.e [1] 'It is Not NULL'
If IN parameter is NOT NULL	Returns <in-param> i.e [2] '99'	Returns <out-param-2> i.e [1] 'It is NULL'

NVL vs NVL2 function | SQL

Read More :

What does count 1 mean in SQL?

How to include a single quote in a SQL query?

Difference between WHERE and HAVING clause

Create a table from an existing table

Sunday, 30 May 2021

Difference between WHERE and HAVING clause | SQL

WHERE AND HAVING CLAUSE WITH EXAMPLE

This post explains the difference between WHERE and HAVING clauses and how to use them with example scenarios.

WHERE:

WHERE clause is used to filter only the desired records in the table.
Records that don't meet the condition are removed prior to the grouping operation.

HAVING:

HAVING clause is used as a secondary filter. If the value is calculated in your query and the value is not readily available in your table then you need use the HAVING clause.
This filter is applied after your aggregate functions are executed in order to meet the mentioned HAVING condition from your resultant.

For example, let's consider you have a data set with ticket sales details of a game by team.

Scenario 1 :

The resultant here will be each team’s sum of sales from every game that had sales greater than 50,000.

SELECT 
team, 
SUM(sales_amt ) 
FROM 
games 
WHERE 
sales_amt > 50000 
GROUP BY 
team;

Scenario 2 :

The resultant here will be only teams that have total sales of greater than 5,00,000.

SELECT 
team, 
SUM(sales_amt ) 
FROM 
games 
GROUP BY 
team
HAVING 
SUM(sales_amt ) > 500000;

Scenario 3 :

The resultant here will be only teams that have grossed more than 5,00,000 from games that did more than 50,000 in sales.

SELECT 
team, 
SUM(sales_amt ) 
FROM 
games 
WHERE 
sales_amt > 50000 
GROUP BY 
team
HAVING 
SUM(sales_amt ) > 500000;

WHERE VS HAVING:

Features	WHERE	HAVING
Syntax	WHERE condition;	GROUP BY columns HAVING condition;
Example	WHERE sales_amt > 50000;	GROUP BY team HAVING SUM(sales_amt ) > 500000;
What value can be used condition	Direct value from table	Derived aggregate value from table
When to use WHERE/HAVING clause	To filter data based on values available in the table.	To filter data based on aggregate values derived from values available in the table

What does count 1 mean in SQL?

How to include a single quote in a SQL query?

Create a table from an existing table

Snowflake NVL vs NVL2

Friday, 28 May 2021

Convert SQL query output into HTML table | Python

FORMAT SQL QUERY OUTPUT INTO AN HTML TABLE

From my own experience with python SQL connector. I think this is one of the easiest way to format the output SQL query output into HTML table code.

Steps to convert SQL query output into HTML table code:

Connect to the database using SQL connector
Create a cursor to execute your SQL query and fetch the resultant
Execute the SQL query and use the below code to convert the output into an HTML table.


Convert  SQL query output into HTML table | Python

For example, I want to convert the above SQL query output into HTML code and I am using Snowflake Connector.

snowflake = snowflake.connector.connect(
  user=SF_USER,
  password=SF_PASSWORD,
  account=SF_ACCOUNT,
  warehouse=SF_WAREHOUSE,
  database=SF_DATABASE,
  schema=SF_SCHEMA
)    
snowflake_cur = snowflake.cursor()	

 sHtml=""
    
sql = """ 
            SELECT CustomerID, CustomerName FROM Customers ORDER BY CustomerName ASC;
          """
snowflake_cur.execute(sql)

sHtml=sHtml+"""</br><table order="1"><tr><th>CustomerID</th><th>CustomerName</th></tr>"""

for row in snowflake_cur.fetchall():
    sHtml=sHtml+"<tr><td>"+str(row[0])+"</td><td style=""text-align:center"" >"+str( "{:,}".format(row[1]))+"</td></tr>"

sHtml=sHtml+"</table></br>"

Connect to Postgres database using AWS Secrets Manager and psycopg2

Wednesday, 26 May 2021

Create a table from an existing table | SQL

CREATE A NEW TABLE FROM THE EXISTING TABLE WITH DATA OR WITHOUT DATA ALONG WITH FEW COLUMNS OR ALL THE COLUMNS USING ONE TABLE OR MULTIPLE TABLES

These SQL queries will create the basic table and its fields but not indexes, constraints, or triggers as it is in the existing table.

CREATE AN EMPTY TABLE FROM AN EXISTING TABLE:

Syntax:

CREATE TABLE new_table 
  AS (SELECT * 
      FROM old_table WHERE 1=2);

Example:

CREATE TABLE suppliers 
AS (SELECT * 
 FROM companies 
 WHERE 1=2);

CREATE A TABLE FROM AN EXISTING TABLE WITH DATA:

Syntax:

CREATE TABLE new_table 
  AS (SELECT * 
      FROM old_table);

Example:

CREATE TABLE suppliers 
AS (SELECT * 
 FROM companies 
 WHERE id > 1000);

CREATE A TABLE WITH ONLY A FEW COLUMNS FROM AN EXISTING TABLE WITH DATA:

Syntax:

CREATE TABLE new_table 
AS (SELECT column_1, column2, ... column_n 
FROM old_table);

Example:

CREATE TABLE suppliers 
AS (SELECT id, address, city, state, zip 
FROM companies 
WHERE id > 1000);

CREATE A TABLE WITH ONLY A FEW COLUMNS FROM AN EXISTING MULTIPLE TABLES WITH DATA:

Syntax:

CREATE TABLE new_table 
  AS (SELECT column_1, column2, ... column_n 
      FROM old_table_1, old_table_2, ... old_table_n);

Example:

CREATE TABLE suppliers 
AS (SELECT companies.id, companies.address, categories.cat_type 
FROM companies, categories

WHERE companies.id = categories.id 
AND companies.id > 1000);

What does count 1 mean in SQL?

How to include a single quote in a SQL query?

Types Of Dimension Tables with examples | Datawarehousing

DIMENSION:

The dimension table contains dimensions of fact.

It contains keys and descriptive attributes for measurements in the fact table.

TYPES OF DIMENSION:

Conformed dimension:

Conformed dimensions mean the exact same thing with every possible fact table to which they are joined.

Example:

The date dimension table connected to the sales facts is identical to the date dimension connected to the inventory facts.

Junk dimension:

A junk dimension is a collection of random transactional codes flags and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes.

Example:

Assume that we have a gender dimension and marital status dimension. In the fact table, we need to maintain two keys referring to these dimensions. Instead of that create a junk dimension that has all the combinations of gender and marital status (cross join gender and marital status table and create a junk table). Now we can maintain only one key in the fact table.

Degenerate dimension:

A degenerate dimension is a dimension that is derived from the fact table and doesn’t have its own dimension table.

Example:

A transactional code in a fact table.

Role-playing dimension:

Dimensions that are often used for multiple purposes within the same database are called role-playing dimensions.

Example:

A date dimension can be used for “date of sale”, as well as “date of delivery”, or “date of hire”.

SCD – Slowly Changing Dimension

TYPE – 1: Updates changes without preserving history.

TYPE – 2: Maintains history by storing the value as a new record.

Effective date
Flagging
Versioning

TYPE – 3: Maintains history as the current and previous

version column-wise.

Tuesday, 25 May 2021

Optimization in Informatica | Interview Question

OPTIMIZING TARGET:

Defining constraints and indexes will slow down the loading of data. So drop indexes and key constraints.
Use bulk loading. It will load the data without writing database logs which makes it more efficient.

OPTIMIZING SOURCE:

Do not use ORDER BY or GROUP BY since it creates index on the source.
Use fast export reader instead of relation reader for source.

OPTIMIZING MAPPING:

Add as much as filter whenever it is required.

OPTIMIZING TRANSFORMATION:

AGGREGATOR:

Use sorted input option
Use simple values for GROUP BY columns
Filter data before using the aggregator
Use incremental aggregation option

JOINER:

Design the master source with fewer duplicate key value
Perform join in the database whenever possible
Use sorted data

LOOKUP:

If the lookup table database is the same as the source, then perform join in the database itself
Filter the filter rows
Optimize the lookup condition

NORMALIZER:

Place the normalizer close to the target.

SEQUENCE GENERATOR:

Make the transformation reusable

SORTER:

Allocate enough memory to sort the data

SOURCE QUALIFIER:

Add filter as much as possible
Use distinct option

Performance Tuning of Informatica mapping | Interview Question

One of the most frequently asked Informatica interview question is about performance tuning.

How do you optimize mapping?
How do you identify which part of mapping is taking more time to run?
How do you optimize a transformation in your mapping?
How do you improve the performance of a mapping?

First of all, what is performance tuning?

To identify and resolve bottlenecks
Performance bottleneck can occur at target, source, mapping, and session.

In this post, we will see where the bottleneck is caused, how to identify and eliminate them.

1) TARGET BOTTLENECK

Caused while Integration Service writes to a target database.

How to identify Target Bottleneck?

Configure to write a flat-file. If the performance increases then you have a target bottleneck.

This means more time is spent on writer thread than transformation or reader thread.

How to eliminate Target Bottleneck?

Configure index and key constraints on the target.

2) SOURCE BOTTLENECK

Caused while reading from the source database.

How to identify Source Bottleneck?

If the session reads from a flat file, then probably no source bottleneck.

If the source connection is relation then add the filter to the source, perform a read test.

This means more time is spent on the reader thread than transformation or writer thread.

How to eliminate Source Bottleneck?

Configure index and key constraints on the source.

3) MAPPING BOTTLENECK

Caused while transforming data.

How to identify Mapping Bottleneck?

Check the session log to find how time is spent on the transformation thread.

This means more time is spent on the transformation thread than the reader or writer thread.

How to eliminate Mapping Bottleneck?

By optimizing the transformation properties.

4) SESSION BOTTLENECK

If the above bottlenecks are eliminated then you are probably facing the session bottleneck.

How to identify Session Bottleneck?

Check the information about each transformation input, output, and error row details.

How to eliminate Mapping Bottleneck?

By optimizing the session properties.

How to include a single quote in a SQL query

INSERT SINGLE QUOTE IN SQL

This post explains how to insert a single quote in data/text/string.

It can be inserted by placing the single quotes in between a pair of single quotes.

This is compatible with all the databases like Snowflake, Teradata, Mysql, Postgresql, Oracle, PLSQL, Amazon Redshift.

Below is an example of how to include/ insert a single quote in SQL.

Insert single quote in a SQL data or text or string

Insert a single quote in a SQL

Read More :

What does count 1 mean in SQL?

Difference between WHERE and HAVING clause

Create a table from an existing table

Snowflake NVL vs NVL2

What does count 1 mean in SQL?

There is a common misunderstanding that it interprets the ordinal of the column.

COUNT(*) takes all columns to count rows which is equivalent to select * from tablename.

COUNT(1) counts using the first column which is the Primary Key which is equivalent to select 1 from tablename, which returns all the records with the value “1” instead of fetching the actual value.

So count(1) little efficient than count(*).

Difference between Public & Private Subnet

PUBLIC & PRIVATE SUBNET IN AMAZON VIRTUAL PRIVATE CLOUD

This post explains the difference between public and private subnet in Amazon (AWS) Virtual Private Cloud with a simple yet detailed explanation.

The main difference between public and private subnet is how they can access the internet and who can access the instance in that subnet.

We can access the internet only if we have a public IP address.

Public subnet will have the public IP address which will route to an internet gateway.

Private subnet will not have any public IP address, so this kind of traffic will be directed to NAT ( Network Address Translator) which provides the IP address to access the internet via an internet network gateway.
And public subnet has both in and out internet access while from private subnet only has out internet access i.e. no one can access the instance in the private subnet from the internet.

Monday, 24 May 2021

Downloadable Python Paramiko Lambda Layer

Download pre-build Python Lambda Layer deployment package of Paramiko compatible for 2.7 and later runtime.