Friday, 28 May 2021

Convert SQL query output into HTML table | Python

FORMAT SQL QUERY OUTPUT INTO AN HTML TABLE 

From my own experience with python SQL connector. I think this is one of the easiest way to format the output SQL query output into HTML  table code.

Steps to convert SQL query output  into HTML table code:

  • Connect to the database using SQL connector
  • Create a cursor to execute your SQL query and fetch the resultant 
  • Execute the SQL query and use the below code to convert the output into an HTML table.
Convert  SQL query output into HTML table | Python
Convert  SQL query output into HTML table | Python

For example, I want to convert the above SQL query output into HTML code and I am using Snowflake Connector.

snowflake = snowflake.connector.connect(
  user=SF_USER,
  password=SF_PASSWORD,
  account=SF_ACCOUNT,
  warehouse=SF_WAREHOUSE,
  database=SF_DATABASE,
  schema=SF_SCHEMA
)    
snowflake_cur = snowflake.cursor()	

 sHtml=""
    
sql = """ 
            SELECT CustomerID, CustomerName FROM Customers ORDER BY CustomerName ASC;
          """
snowflake_cur.execute(sql)

sHtml=sHtml+"""</br><table order="1"><tr><th>CustomerID</th><th>CustomerName</th></tr>"""

for row in snowflake_cur.fetchall():
    sHtml=sHtml+"<tr><td>"+str(row[0])+"</td><td style=""text-align:center"" >"+str( "{:,}".format(row[1]))+"</td></tr>"

sHtml=sHtml+"</table></br>"


Read More :


Wednesday, 26 May 2021

Create a table from an existing table | SQL

CREATE A NEW TABLE FROM THE EXISTING TABLE  WITH DATA OR WITHOUT DATA ALONG WITH FEW COLUMNS OR ALL THE COLUMNS USING ONE TABLE OR MULTIPLE TABLES
These SQL queries will create the basic table and its fields but not indexes, constraints, or triggers as it is in the existing table.

CREATE AN EMPTY TABLE FROM AN EXISTING TABLE:

Syntax:
CREATE TABLE new_table 
  AS (SELECT * 
      FROM old_table WHERE 1=2); 

Example:
CREATE TABLE suppliers 
AS (SELECT * 
 FROM companies 
 WHERE 1=2);


CREATE A TABLE FROM AN EXISTING TABLE WITH DATA:

Syntax:
CREATE TABLE new_table 
  AS (SELECT * 
      FROM old_table); 

Example:
CREATE TABLE suppliers 
AS (SELECT * 
 FROM companies 
 WHERE id > 1000);


CREATE A TABLE WITH ONLY A FEW COLUMNS FROM AN EXISTING TABLE WITH DATA:

Syntax:
CREATE TABLE new_table 
AS (SELECT column_1, column2, ... column_n 
FROM old_table); 

Example:
CREATE TABLE suppliers 
AS (SELECT id, address, city, state, zip 
FROM companies 
WHERE id > 1000); 


CREATE A TABLE WITH ONLY A FEW COLUMNS FROM AN EXISTING MULTIPLE TABLES WITH DATA:

Syntax:
CREATE TABLE new_table 
  AS (SELECT column_1, column2, ... column_n 
      FROM old_table_1, old_table_2, ... old_table_n); 

Example:
CREATE TABLE suppliers 
AS (SELECT companies.id, companies.address, categories.cat_type 
FROM companies, categories 
WHERE companies.id = categories.id 
AND companies.id > 1000); 


Read More:

  • What does count 1 mean in SQL?
  • How to include a single quote in a SQL query?

  • Types Of Dimension Tables with examples | Datawarehousing


    DIMENSION:

    The dimension table contains dimensions of fact.
    It contains keys and descriptive attributes for measurements in the fact table.


    TYPES OF DIMENSION:

    Conformed dimension:

    Conformed dimensions mean the exact same thing with every possible fact table to which they are joined.

    Example: 

    The date dimension table connected to the sales facts is identical to the date dimension connected to the inventory facts.

    Junk dimension:

    A junk dimension is a collection of random transactional codes flags and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes.

    Example: 

    Assume that we have a gender dimension and marital status dimension. In the fact table, we need to maintain two keys referring to these dimensions. Instead of that create a junk dimension that has all the combinations of gender and marital status (cross join gender and marital status table and create a junk table). Now we can maintain only one key in the fact table.

    Degenerate dimension:

    A degenerate dimension is a dimension that is derived from the fact table and doesn’t have its own dimension table.

    Example: 

    A transactional code in a fact table.


    Role-playing dimension:

    Dimensions that are often used for multiple purposes within the same database are called role-playing dimensions. 

    Example: 

    A date dimension can be used for “date of sale”, as well as “date of delivery”, or “date of hire”.

    SCD – Slowly Changing Dimension

    TYPE – 1:  Updates changes without preserving history.
    TYPE – 2:  Maintains history by storing the value as a new record.
    • Effective date
    • Flagging
    • Versioning

    TYPE – 3:  Maintains history as the current and previous 
    version column-wise.


    Read More :


    Tuesday, 25 May 2021

    Optimization in Informatica | Interview Question

    OPTIMIZING TARGET:

    • Defining constraints and indexes will slow down the loading of data. So drop indexes and key constraints.
    • Use bulk loading. It will load the data without writing database logs which makes it more efficient. 

    OPTIMIZING SOURCE:

    • Do not use ORDER BY or GROUP BY since it creates index on the source.
    • Use fast export reader instead of relation reader for source.

    OPTIMIZING MAPPING:

    • Add as much as filter whenever it is required.

    OPTIMIZING TRANSFORMATION:


    AGGREGATOR:

    1. Use sorted input option
    2. Use simple values for GROUP BY columns
    3. Filter data before using the aggregator
    4. Use incremental aggregation option

    JOINER:

    1. Design the master source with fewer duplicate key value
    2. Perform join in the database whenever possible
    3. Use sorted data 

    LOOKUP:

    1. If the lookup table database is the same as the source, then perform join in the database itself
    2. Filter the filter rows
    3. Optimize the lookup condition

    NORMALIZER:

    1. Place the normalizer close to the target.

    SEQUENCE GENERATOR:

    1. Make the transformation reusable

    SORTER:

    1. Allocate enough memory to sort the data

    SOURCE QUALIFIER:

    1. Add filter as much as possible
    2. Use distinct option



    Performance Tuning of Informatica mapping | Interview Question

     One of the most frequently asked Informatica interview question is about performance tuning. 

    • How do you optimize mapping?
    • How do you identify which part of mapping is taking more time to run?
    • How do you optimize a transformation in your mapping?
    • How do you improve the performance of a mapping?



    First of all, what is performance tuning?

    • To identify and resolve bottlenecks 
    • Performance bottleneck can occur at target, source, mapping, and session.
    In this post, we will see where the bottleneck is caused, how to identify and eliminate them.

    1) TARGET BOTTLENECK

    • Caused while Integration Service writes to a target database.

    How to identify Target Bottleneck?

    • Configure to write a flat-file. If the performance increases then you have a target bottleneck.
    • This means more time is spent on writer thread than transformation or reader thread.

    How to eliminate Target Bottleneck?

    • Configure index and key constraints on the target.

    2) SOURCE BOTTLENECK

    • Caused while reading from the source database.

    How to identify Source Bottleneck?

    • If the session reads from a flat file, then probably no source bottleneck.
    • If the source connection is relation then add the filter to the source, perform a read test.
    • This means more time is spent on the reader thread than transformation or writer thread.

    How to eliminate Source Bottleneck?

    • Configure index and key constraints on the source.

    3) MAPPING BOTTLENECK

    • Caused while transforming data.

    How to identify Mapping Bottleneck?

    • Check the session log to find how time is spent on the transformation thread.
    • This means more time is spent on the transformation thread than the reader or writer thread.

    How to eliminate Mapping Bottleneck?

    • By optimizing the transformation properties.

    4) SESSION BOTTLENECK

    • If the above bottlenecks are eliminated then you are probably facing the session bottleneck.

    How to identify Session Bottleneck?

    • Check the information about each transformation input, output, and error row details.

    How to eliminate Mapping Bottleneck?

    • By optimizing the session properties.

    How to include a single quote in a SQL query

    INSERT SINGLE QUOTE IN SQL


    This post explains how to insert a single quote in data/text/string.

    It can be inserted by placing the single quotes in between a pair of single quotes.

    This is compatible with all the databases like SnowflakeTeradata, Mysql, Postgresql, Oracle, PLSQL, Amazon Redshift.

    Below is an example of how to include/ insert a single quote in SQL.

    Insert single quote in a SQL data or text or string
    Insert a single quote in a SQL


    Read More :

    What does count 1 mean in SQL?

     There is a common misunderstanding that it interprets the ordinal of the column.

    COUNT(*) takes all columns to count rows which is equivalent to select * from tablename.

    COUNT(1) counts using the first column which is the Primary Key which is equivalent to select 1 from tablename, which returns all the records with the value “1” instead of fetching the actual value.

    So count(1) little efficient than count(*).

    Difference between Public & Private Subnet

    PUBLIC & PRIVATE SUBNET IN AMAZON  VIRTUAL PRIVATE CLOUD

    This post explains the difference between public and private subnet in Amazon (AWS) Virtual Private Cloud with a simple yet detailed explanation.

    The main difference between public and private subnet is how they can access the internet and who can access the instance in that subnet.


    We can access the internet only if we have a public IP address.

    Public subnet will have the public IP address which will route to an internet gateway.

    Private subnet will not have any public IP address, so this kind of traffic will be directed to NAT ( Network Address Translator) which provides the IP address to access the internet via an internet network gateway.
    And public subnet has both in and out internet access while from private subnet only has out internet access i.e. no one can access the instance in the private subnet from the internet.

    Monday, 24 May 2021

    Downloadable Python Paramiko Lambda Layer


    Download pre-build Python Lambda Layer deployment package of Paramiko compatible for 2.7 and later runtime. 


    You can also create your own deployment package with the help of the below articles.



    Here are some other posts that might be helpful :


    Downloadable Python XMLSchema Lambda Layer


    Download pre-build Python Lambda Layer deployment package of XMLSchema compatible for 3.6 and later runtime. 


    You can also create your own deployment package with the help of the below articles.



    Here are some other posts that might be helpful :


    Downloadable Python Gnupg Lambda Layer


    Download pre-build Python Lambda Layer deployment package of Gnupg compatible for 2.7 and later runtime. 


    You can also create your own deployment package with the help of the below articles.



    Here are some other posts that might be helpful :


    Downloadable Python Snowflake Lambda Layer


    Download pre-build Python Lambda Layer deployment package of Snowflake compatible for 3.6 and later runtime. 


    You can also create your own deployment package with the help of the below articles.



    Here are some other posts that might be helpful :


    Downloadable Python Fuzzywuzzy Lambda Layer


    Download pre-build Python Lambda Layer deployment package of Fuzzywuzzy compatible for 3.6 and later runtime. 


    You can also create your own deployment package with the help of the below articles.



    Here are some other posts that might be helpful :


    Downloadable Python Levenstein Lambda Layer


    Download pre-build Python Lambda Layer deployment package of Levenstein compatible for 3.6 and later runtime. 


    You can also create your own deployment package with the help of the below articles.



    Here are some other posts that might be helpful :


    Connect to Postgres database using AWS Secrets Manager and psycopg2


     
    This post explains how to retrieve values from AWS secret manager and use them to connect to the Postgres database.


    Requirements:

    Library:

    • Boto3
    • psycopg2
    Commands to install the required modules:

    pip install psycopg2
    pip install Boto3


    Required Connector Parameters:

    • Host
    • Port
    • Username
    • Password
    • Database

    Example Secret Manager: 


    Knowing the attribute of the secret manager value helps you to understand the code better.

    AW Secret Manager

    Code to connect to Postgres database using AWS Secrets Manager and psycopg2 :


    import boto3
    import json
    
    
    def get_rds_keys():
        try:
             region_name = 'us-west-2'
            session = boto3.session.Session()
    #Replace the string secret_name_rds with your secret manager name
            client = session.client(service_name='secretsmanager',region_name=region_name)
            get_secret_value_response = client.get_secret_value(SecretId=secret_name_rds)  
    
            if 'SecretString' in get_secret_value_response:
                secret = get_secret_value_response['SecretString']
                secret_val = secret
            else:
                binary_secret_data = get_secret_value_response['SecretBinary']
                secret_val=binary_secret_data
    
    # Every attribute name inside the secret_dict[''] should exact as it in the Secret Manager Key.
            secret_dict = eval(secret_val)
            RDS_USER=secret_dict['username']
            RDS_PASSWORD=secret_dict['password']
            RDS_HOST = secret_dict['host']
            RDS_PORT = secret_dict['port']
            RDS_DATABASE = secret_dict['database']
    
        except Exception as e:
            print('=== 1.1 ========= RDS Key Retrival - Error Occured : ', str(e))
        finally:
            return RDS_USER, RDS_PASSWORD, RDS_HOST, RDS_PORT, RDS_DATABASE
    
    #Retrives the values of the attributes from the above function.
    RDS_USER, RDS_PASSWORD, RDS_HOST, RDS_PORT, RDS_DATABASE  = get_rds_keys()
    
    # Connects to the database using the retrieved attribute values
    rdsData = psycopg2.connect(
        database=RDS_DATABASE,
        user=RDS_USER,
        password=RDS_PASSWORD,
        host=RDS_HOST,
        port=RDS_PORT
    )
    
    rds_cur = rdsData.cursor()
    
    rds_cur.execute(select * from table;)
    
    result = rds_cur.fetchall()
    
    print(result)