This article was published as a part of the Data Science Blogathon.
SQL, Structured Query Language, iational databases. SQL gets used by data scientists, analysts, web developers, and other tech geeks for data exploration and manipulation.
This query language provides various methods to
cover SQL-Join methods and techniques to develop and use procedures and functions.
So, let’s start the learning in the following order:
- Inner Join
- Left Join
- Right Join
- Full Join
- Cross Join
- Self Join
To undncepts, we will use the following customers and transactions tables from the Sales database:
Join in SQL gets used for combining rows from two or more tables given a related or common column. Joins are a great way to handle tables with one-to-many and many-to-many relationships. The join method takes the table’s order into account. The table that occurs before ‘join’ in the query is the left table, and the other is the right table.
A default Join method performs with either the ‘Join’ or ‘Inner Join’ clause. This method combined only those records from both the tables which satisfy the condition. The below Venn diagram also illustrates the same to clear the picture of the inner join.
Now, performing the inner join on the given tables:
<SELECT customers.customer_code, customers.custmer_name, transactions.product_code, transactions.sales_amount from customers JOIN transactions on customers.customer_code=transactions.customer_code;>
Now let’s break down the above query:
The select clause will retrieve the rows from the columns, customer code, customer name from the ‘customers’ table, and product code with sales amount from the ‘transactions’ table. Both the tables have one related column: customer_code, which is to join these tables.
The output of the above query is as shown below:
This clause returns all the rows from the left table and the records satisfying the condition from the right table. If any row from the right table does not have any data, it returns NULL for that particular condition.
Let’s implement the left join using the above two tables:
<SELECT customers.customer_code, customers.custmer_name, transactions.product_code, transactions.sales_amount from customers Left JOIN transactions on customers.customer_code=transactions.customer_code;>
The output for the above query is:
Right Outer Join (or Right Join) works exactly opposite the left join. It returns all rows from the right table and returns null in the rows of the left table where the condition is not getting fulfilled. The Right join is to join more than two tables since it can avoid restructuring the query to join one table.
Let’s implement the right join in the workbench:
<SELECT customers.customer_code, customers.custmer_name, transactions.product_code, transactions.sales_amount from customers Right JOIN transactions on customers.customer_code=transactions.customer_code;>
The output of the above query is as follows:
A Full Outer Join will combine all rows or records from both sides of the join. The below Venn diagram illustrates the full join output.
SQLite does not support Full outer and Right join. We rarely use Right join and Full join in daily practice.
SELECT customers.customer_code, customers.custmer_name, transactions.product_code, transactions.sales_amount from customers FULL JOIN transactions on customers.customer_code=transactions.customer_code;
Cross Join returns the cartesian products of the tables. It multiplies the number of rows in customers by the number of rows in transactions.
The query for cross join is as follows:
<SELECT customers.customer_code, customers.custmer_name, transactions.product_code, transactions.sales_amount from customers Cross JOIN transactions on customers.customer_code=transactions.customer_code;>
The above Cross join query returns the table with the multiplication of rows from both the tables, as shown below:
Self Join combines the table with itself. We can use self-join to create new columns using the existing column in the same table.
The query for the Self Join is:
select trans.customer_code, trans.product_code, trans.sales_amount, new_trans.sales_amount+1000 as new_amount from transactions as trans, transactions as new_trans where trans.customer_code = new_trans.customer_code;
Procedures and Function
Procedure and function are the set of statements that execute a particular task. The significant difference between both is: that a function must return a value always, while a procedure may not or may yield.
To avoid repetitiveness of the same query and redundancy, we use procedures.
A procedure refers to a block of SQL statements to perform some specific tasks. It contains a header for the procedure’s name with parameters and a body for the execution query section.
For creating a procedure, go to ‘stored procedure’ and click on ‘create a stored procedure.’ Then the workbench will automatically generate a procedure syntax. We need to name the procedure, pass the parameters, and write the statement.
Then after applying, we need to call the procedure using the ‘call ().’
Here we create a simple procedure to print the customer’s name from the ‘customers’ table.
Function in SQL must return a value whenever it is getting called. So we use it to store the statement for returning a value that is needed multiple times during our work. It saves time by using parameters and queries at once while defining the function.
To create a function, go to ‘functions’ then ‘create function.’
It will create a function syntax for us, and then we need to pass the parameters and conditions. Then call the function to check the products from the transactions table to buy shown in the figure below.
In this article, we have discussed various SQL-Join methods such as Inner join, left join, right join, full join, cross join, and self join. Also, we have discussed procedures and functions in SQL along with their major difference and implementation. Now, let’s break down some of the key takeaways of the article:
- SQL Joins are used for combining records from two or more tables.
- Inner Join returns the rows which are common in both tables.
- Left Join returns all rows from the left table and only common rows from the right table.
- Right Join returns all rows from the right table and only common rows from the left table.
- Full Outer Join returns all the rows from both tables which satisfies the condition.
- Full Join and Right Join are generally used for more than two tables.
- Cross Join returns the cartesian product of both tables.
- Self Join combines the rows of the same table with itself.
- SQL-Procedures are used for saving time and removing the redundancy of writing the same query again and again.
- SQL-Procedures may or may not return a value while a function in SQL must return a value.
- SQL-Functions are used where we need some value multiple times hence instead of writing the same query for different columns multiple times it is better to define a function and call it to retrieve that value.
Thank you for reading this article on SQL-Joins. And if you like it and find it informative. Then share it with your other SQL geeks. More learnings are on the way, so stay tuned and keep learning!!
Bries or feedback, feel free to reach me on Linkedin.
- SQL-JOINS Image: https://stackoverflow.com/questions/13997365/sql-joins-as-venn-diagram
- Venn-diagrams: https://www.dataquest.io/blog/sql-joins-tutorial/
- SQL-JOINS: https://www.w3schools.com/sql/sql_join.asp
- Cross and Self Join: https://csveda.com/natural-join-cross-join-and-self-join-in-sql-with-examples/
- Dataset link: https://github.com/KavishGoyal/Blogathon
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.