Monday, 31 May 2021

Snowflake NVL vs NVL2 function | SQL

DIFFERENCE BETWEEN  NVL AND NVL2 FUNCTION


 
This post explains the difference between NVL and NVL2 functions in Snowflake.

The NVL and NVL2 functions are also compatible with other databases like Teradata, Mysql, Postgresql, Oracle, PLSQL, Amazon Redshift.



NVL FUNCTION


The NVL function will return the default value if the in- parameter is NULL.

Syntax: NVL ( <in-param>,<out-param>)
  • If the in-param is NULL then it will return the out-param value.
  • If the in-param is NOT NULL then it will return the in-param value itself.

Difference between NVL and NVL2 function | Snowflake
Difference between NVL and NVL2 function


NVL2 FUNCTION


The NVL2 function will return the first out-parameter if the in-parameter is NOT NULL and return the second parameter if the in-parameter is NULL.


Syntax: NVL2( <in-param>,<out-param-1>,<out-param-2>)

  • If the in-param is NULL then it will return the out-param-1 value .
  • If the in-param is NOT NULL then it will return the in-param-2 value itself.

Difference between NVL and NVL2 function | Snowflake
Difference between NVL and NVL2 function

NVL VS NVL2 FUNCTION


Features

NVL

NVL2

Syntax

NVL ( <in-param>,<out-param>)

NVL2( <in-param>
,<out-param-1>
,<out-param-2>)

Example

[1] NVL(NULL,'It is NULL')
[2] NVL('99','It is NULL')

[1] NVL2('99','It is Not NULL'
,'It is NULL')

Number of
IN parameter

1

1

Number of
OUT parameter

1

2

If IN parameter is NULL

Returns <out-param>
i.e
[1] 'It is NULL'

Returns <out-param-1>
 i.e
[1] 'It is Not NULL'

If IN parameter is NOT NULL

Returns  <in-param>
 i.e
[2] '99'

Returns <out-param-2>
 i.e
[1] 'It is NULL'

 

NVL vs NVL2 function | SQL
NVL vs NVL2 function | SQL


Read More :

Sunday, 30 May 2021

Difference between WHERE and HAVING clause | SQL

WHERE AND HAVING CLAUSE WITH EXAMPLE


This post explains the difference between WHERE and HAVING clauses and how to use them with example scenarios.


WHERE:

  • WHERE clause is used to filter only the desired records in the table.
  • Records that don't meet the condition are removed prior to the grouping operation.

HAVING:

  • HAVING clause is used as a secondary filter. If the value is calculated in your query and the value is not readily available in your table then you need use the HAVING clause.
  • This filter is applied after your aggregate functions are executed in order to meet the mentioned HAVING condition from your resultant.
For example, let's consider you have a data set with ticket sales details of a game by team.

Scenario 1 :
The resultant here will be each team’s sum of sales from every game that had sales greater than 50,000.

SELECT 
team, 
SUM(sales_amt ) 
FROM 
games 
WHERE 
sales_amt > 50000 
GROUP BY 
team;

Scenario 2 :
The resultant here will be only teams that have total sales of greater than 5,00,000.

SELECT 
team, 
SUM(sales_amt ) 
FROM 
games 
GROUP BY 
team
HAVING 
SUM(sales_amt ) > 500000;

Scenario 3 :
The resultant here will be only teams that have grossed more than 5,00,000 from games that did more than 50,000 in sales.

SELECT 
team, 
SUM(sales_amt ) 
FROM 
games 
WHERE 
sales_amt > 50000 
GROUP BY 
team
HAVING 
SUM(sales_amt ) > 500000;

WHERE VS HAVING:

Features

WHERE

HAVING

Syntax

WHERE
condition;

GROUP BY
columns
HAVING
condition;

Example

WHERE
sales_amt > 50000;

GROUP BY
team
HAVING
SUM(sales_amt ) > 500000;

What value can be used condition

Direct value from table

Derived  aggregate value from table

When to use
WHERE/HAVING clause

To filter data based on values available in the table.

To filter data based on aggregate values derived from values available in the table



Read More:


Friday, 28 May 2021

Convert SQL query output into HTML table | Python

FORMAT SQL QUERY OUTPUT INTO AN HTML TABLE 

From my own experience with python SQL connector. I think this is one of the easiest way to format the output SQL query output into HTML  table code.

Steps to convert SQL query output  into HTML table code:

  • Connect to the database using SQL connector
  • Create a cursor to execute your SQL query and fetch the resultant 
  • Execute the SQL query and use the below code to convert the output into an HTML table.
Convert  SQL query output into HTML table | Python
Convert  SQL query output into HTML table | Python

For example, I want to convert the above SQL query output into HTML code and I am using Snowflake Connector.

snowflake = snowflake.connector.connect(
  user=SF_USER,
  password=SF_PASSWORD,
  account=SF_ACCOUNT,
  warehouse=SF_WAREHOUSE,
  database=SF_DATABASE,
  schema=SF_SCHEMA
)    
snowflake_cur = snowflake.cursor()	

 sHtml=""
    
sql = """ 
            SELECT CustomerID, CustomerName FROM Customers ORDER BY CustomerName ASC;
          """
snowflake_cur.execute(sql)

sHtml=sHtml+"""</br><table order="1"><tr><th>CustomerID</th><th>CustomerName</th></tr>"""

for row in snowflake_cur.fetchall():
    sHtml=sHtml+"<tr><td>"+str(row[0])+"</td><td style=""text-align:center"" >"+str( "{:,}".format(row[1]))+"</td></tr>"

sHtml=sHtml+"</table></br>"


Read More :


Wednesday, 26 May 2021

Create a table from an existing table | SQL

CREATE A NEW TABLE FROM THE EXISTING TABLE  WITH DATA OR WITHOUT DATA ALONG WITH FEW COLUMNS OR ALL THE COLUMNS USING ONE TABLE OR MULTIPLE TABLES
These SQL queries will create the basic table and its fields but not indexes, constraints, or triggers as it is in the existing table.

CREATE AN EMPTY TABLE FROM AN EXISTING TABLE:

Syntax:
CREATE TABLE new_table 
  AS (SELECT * 
      FROM old_table WHERE 1=2); 

Example:
CREATE TABLE suppliers 
AS (SELECT * 
 FROM companies 
 WHERE 1=2);


CREATE A TABLE FROM AN EXISTING TABLE WITH DATA:

Syntax:
CREATE TABLE new_table 
  AS (SELECT * 
      FROM old_table); 

Example:
CREATE TABLE suppliers 
AS (SELECT * 
 FROM companies 
 WHERE id > 1000);


CREATE A TABLE WITH ONLY A FEW COLUMNS FROM AN EXISTING TABLE WITH DATA:

Syntax:
CREATE TABLE new_table 
AS (SELECT column_1, column2, ... column_n 
FROM old_table); 

Example:
CREATE TABLE suppliers 
AS (SELECT id, address, city, state, zip 
FROM companies 
WHERE id > 1000); 


CREATE A TABLE WITH ONLY A FEW COLUMNS FROM AN EXISTING MULTIPLE TABLES WITH DATA:

Syntax:
CREATE TABLE new_table 
  AS (SELECT column_1, column2, ... column_n 
      FROM old_table_1, old_table_2, ... old_table_n); 

Example:
CREATE TABLE suppliers 
AS (SELECT companies.id, companies.address, categories.cat_type 
FROM companies, categories 
WHERE companies.id = categories.id 
AND companies.id > 1000); 


Read More:

  • What does count 1 mean in SQL?
  • How to include a single quote in a SQL query?

  • Types Of Dimension Tables with examples | Datawarehousing


    DIMENSION:

    The dimension table contains dimensions of fact.
    It contains keys and descriptive attributes for measurements in the fact table.


    TYPES OF DIMENSION:

    Conformed dimension:

    Conformed dimensions mean the exact same thing with every possible fact table to which they are joined.

    Example: 

    The date dimension table connected to the sales facts is identical to the date dimension connected to the inventory facts.

    Junk dimension:

    A junk dimension is a collection of random transactional codes flags and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes.

    Example: 

    Assume that we have a gender dimension and marital status dimension. In the fact table, we need to maintain two keys referring to these dimensions. Instead of that create a junk dimension that has all the combinations of gender and marital status (cross join gender and marital status table and create a junk table). Now we can maintain only one key in the fact table.

    Degenerate dimension:

    A degenerate dimension is a dimension that is derived from the fact table and doesn’t have its own dimension table.

    Example: 

    A transactional code in a fact table.


    Role-playing dimension:

    Dimensions that are often used for multiple purposes within the same database are called role-playing dimensions. 

    Example: 

    A date dimension can be used for “date of sale”, as well as “date of delivery”, or “date of hire”.

    SCD – Slowly Changing Dimension

    TYPE – 1:  Updates changes without preserving history.
    TYPE – 2:  Maintains history by storing the value as a new record.
    • Effective date
    • Flagging
    • Versioning

    TYPE – 3:  Maintains history as the current and previous 
    version column-wise.


    Read More :


    Tuesday, 25 May 2021

    Optimization in Informatica | Interview Question

    OPTIMIZING TARGET:

    • Defining constraints and indexes will slow down the loading of data. So drop indexes and key constraints.
    • Use bulk loading. It will load the data without writing database logs which makes it more efficient. 

    OPTIMIZING SOURCE:

    • Do not use ORDER BY or GROUP BY since it creates index on the source.
    • Use fast export reader instead of relation reader for source.

    OPTIMIZING MAPPING:

    • Add as much as filter whenever it is required.

    OPTIMIZING TRANSFORMATION:


    AGGREGATOR:

    1. Use sorted input option
    2. Use simple values for GROUP BY columns
    3. Filter data before using the aggregator
    4. Use incremental aggregation option

    JOINER:

    1. Design the master source with fewer duplicate key value
    2. Perform join in the database whenever possible
    3. Use sorted data 

    LOOKUP:

    1. If the lookup table database is the same as the source, then perform join in the database itself
    2. Filter the filter rows
    3. Optimize the lookup condition

    NORMALIZER:

    1. Place the normalizer close to the target.

    SEQUENCE GENERATOR:

    1. Make the transformation reusable

    SORTER:

    1. Allocate enough memory to sort the data

    SOURCE QUALIFIER:

    1. Add filter as much as possible
    2. Use distinct option



    Performance Tuning of Informatica mapping | Interview Question

     One of the most frequently asked Informatica interview question is about performance tuning. 

    • How do you optimize mapping?
    • How do you identify which part of mapping is taking more time to run?
    • How do you optimize a transformation in your mapping?
    • How do you improve the performance of a mapping?



    First of all, what is performance tuning?

    • To identify and resolve bottlenecks 
    • Performance bottleneck can occur at target, source, mapping, and session.
    In this post, we will see where the bottleneck is caused, how to identify and eliminate them.

    1) TARGET BOTTLENECK

    • Caused while Integration Service writes to a target database.

    How to identify Target Bottleneck?

    • Configure to write a flat-file. If the performance increases then you have a target bottleneck.
    • This means more time is spent on writer thread than transformation or reader thread.

    How to eliminate Target Bottleneck?

    • Configure index and key constraints on the target.

    2) SOURCE BOTTLENECK

    • Caused while reading from the source database.

    How to identify Source Bottleneck?

    • If the session reads from a flat file, then probably no source bottleneck.
    • If the source connection is relation then add the filter to the source, perform a read test.
    • This means more time is spent on the reader thread than transformation or writer thread.

    How to eliminate Source Bottleneck?

    • Configure index and key constraints on the source.

    3) MAPPING BOTTLENECK

    • Caused while transforming data.

    How to identify Mapping Bottleneck?

    • Check the session log to find how time is spent on the transformation thread.
    • This means more time is spent on the transformation thread than the reader or writer thread.

    How to eliminate Mapping Bottleneck?

    • By optimizing the transformation properties.

    4) SESSION BOTTLENECK

    • If the above bottlenecks are eliminated then you are probably facing the session bottleneck.

    How to identify Session Bottleneck?

    • Check the information about each transformation input, output, and error row details.

    How to eliminate Mapping Bottleneck?

    • By optimizing the session properties.

    How to include a single quote in a SQL query

    INSERT SINGLE QUOTE IN SQL


    This post explains how to insert a single quote in data/text/string.

    It can be inserted by placing the single quotes in between a pair of single quotes.

    This is compatible with all the databases like SnowflakeTeradata, Mysql, Postgresql, Oracle, PLSQL, Amazon Redshift.

    Below is an example of how to include/ insert a single quote in SQL.

    Insert single quote in a SQL data or text or string
    Insert a single quote in a SQL


    Read More :

    What does count 1 mean in SQL?

     There is a common misunderstanding that it interprets the ordinal of the column.

    COUNT(*) takes all columns to count rows which is equivalent to select * from tablename.

    COUNT(1) counts using the first column which is the Primary Key which is equivalent to select 1 from tablename, which returns all the records with the value “1” instead of fetching the actual value.

    So count(1) little efficient than count(*).

    Difference between Public & Private Subnet

    PUBLIC & PRIVATE SUBNET IN AMAZON  VIRTUAL PRIVATE CLOUD

    This post explains the difference between public and private subnet in Amazon (AWS) Virtual Private Cloud with a simple yet detailed explanation.

    The main difference between public and private subnet is how they can access the internet and who can access the instance in that subnet.


    We can access the internet only if we have a public IP address.

    Public subnet will have the public IP address which will route to an internet gateway.

    Private subnet will not have any public IP address, so this kind of traffic will be directed to NAT ( Network Address Translator) which provides the IP address to access the internet via an internet network gateway.
    And public subnet has both in and out internet access while from private subnet only has out internet access i.e. no one can access the instance in the private subnet from the internet.

    Monday, 24 May 2021

    Downloadable Python Paramiko Lambda Layer


    Download pre-build Python Lambda Layer deployment package of Paramiko compatible for 2.7 and later runtime. 


    You can also create your own deployment package with the help of the below articles.



    Here are some other posts that might be helpful :


    Downloadable Python XMLSchema Lambda Layer


    Download pre-build Python Lambda Layer deployment package of XMLSchema compatible for 3.6 and later runtime. 


    You can also create your own deployment package with the help of the below articles.



    Here are some other posts that might be helpful :


    Downloadable Python Gnupg Lambda Layer


    Download pre-build Python Lambda Layer deployment package of Gnupg compatible for 2.7 and later runtime. 


    You can also create your own deployment package with the help of the below articles.



    Here are some other posts that might be helpful :


    Downloadable Python Snowflake Lambda Layer


    Download pre-build Python Lambda Layer deployment package of Snowflake compatible for 3.6 and later runtime. 


    You can also create your own deployment package with the help of the below articles.



    Here are some other posts that might be helpful :


    Downloadable Python Fuzzywuzzy Lambda Layer


    Download pre-build Python Lambda Layer deployment package of Fuzzywuzzy compatible for 3.6 and later runtime. 


    You can also create your own deployment package with the help of the below articles.



    Here are some other posts that might be helpful :