Sunday, May 29, 2011

Self Join and when do you use it

Self Join and when do you use it

Use the script below to create Employee Table and populate it with some sample data. We will be using Employee Table to understand Self Join.

CREATE TABLE EMPLOYEE (
[EMP_ID] INT PRIMARY KEY,
[NAME] NVARCHAR(50),
[RO_ID] INT
)
GO
INSERT INTO EMPLOYEE VALUES(101,'Amol',102)
INSERT INTO EMPLOYEE VALUES(102,'Amit',104)
INSERT INTO EMPLOYEE VALUES(103,'Swati',102)
INSERT INTO EMPLOYEE VALUES(104,'Chetan',103)
INSERT INTO EMPLOYEE VALUES(105,'Rakesh',NULL)
INSERT INTO EMPLOYEE VALUES(106,'Sree',103)
GO

We use Self Join, when we have a table that references itself. For example, In the Employee Table under RO_ID column points to EMP_ID column. This is the right scenario where we can use Self Join. Now suppose you want to write a query that will give you the list of all Employee Names and their respective Reporting Officer’s Names. In order to achieve this you can use Self Join.

Employee Table




The query below is an example of Self Join. Both E1 and E2 refer to the same Employee Table. In this query we are joining the Employee Table with itself.

SELECT E1.[NAME],E2.[NAME] AS [RO NAME]
FROM EMPLOYEE E1
INNER JOIN EMPLOYEE E2
ON E2.EMP_ID =E1.RO_ID

If we run the above query we only get 5 rows out of the 6 rows as shown in Results1 below.



Output 1

This is because Rakesh does not have a Manager. RO_ID column for Rakesh is NULL. If we want to get all the rows then we can use LEFT OUTER JOIN as shown below. For a list of all different types of JOINS in SQL Server, please


SELECT E1.[NAME],E2.[NAME] AS [RO NAME]
FROM EMPLOYEE E1
LEFT OUTER JOIN EMPLOYEE E2
ON E2.EMP_ID =E1.RO_ID
If we execute the above query we get all the rows, including the row that has a null value in the RO_ID column. The results are shown below. The RO NAME for 2nd record is NULL as Rakesh does not have a Manager.