FirebirdSQL logo
Explicit-condition joins

Most qualified joins have an ON clause, with an explicit condition that can be any valid Boolean expression, but usually involves a comparison between the two sources involved.

Often, the condition is an equality test (or a number of ANDed equality tests) using the “=” operator.Joins like these are called equi-joins.(The examples in the section on inner and outer joins were all equi-joins.)

Examples of joins with an explicit condition:

/* Select all Detroit customers who made a purchase
   in 2013, along with the purchase details: */
select * from customers c
  join sales s on s.cust_id = c.id
  where c.city = 'Detroit' and s.year = 2013;
/* Same as above, but include non-buying customers: */
select * from customers c
  left join sales s on s.cust_id = c.id
  where c.city = 'Detroit' and s.year = 2013;
/* For each man, select the women who are taller than he.
   Men for whom no such woman exists are not included. */
select m.fullname as man, f.fullname as woman
  from males m
  join females f on f.height > m.height;
/* Select all pupils with their class and mentor.
   Pupils without a mentor are also included.
   Pupils without a class are not included. */
select p.firstname, p.middlename, p.lastname,
       c.name, m.name
  from pupils p
  join classes c on c.id = p.class
  left join mentors m on m.id = p.mentor;
Named columns joins

Equi-joins often compare columns that have the same name in both tables.If this is the case, we can also use the second type of qualified join: the named columns join.

Note

Named columns joins are not supported in Dialect 1 databases.

Named columns joins have a USING clause which states only the column names.So instead of this:

select * from flotsam f
  join jetsam j
  on f.sea = j.sea
  and f.ship = j.ship;

we can also write:

select * from flotsam
  join jetsam using (sea, ship)

which is considerably shorter.The result set is a little different though — at least when using “SELECT *”:

  • The explicit-condition join — with the ON clause — will contain each of the columns SEA and SHIP twice: once from table FLOTSAM, and once from table JETSAM.Obviously, they will have the same values.

  • The named columns join — with the USING clause — will contain these columns only once.

If you want all the columns in the result set of the named columns join, set up your query like this:

select f.*, j.*
  from flotsam f
  join jetsam j using (sea, ship);

This will give you the same result set as the explicit-condition join.

For an OUTER named columns join, there’s an additional twist when using “SELECT *” or an unqualified column name from the USING list:

If a row from one source set doesn’t have a match in the other but must still be included because of the LEFT, RIGHT or FULL directive, the merged column in the joined set gets the non-NULL value.That is fair enough, but now you can’t tell whether this value came from the left set, the right set, or both.This can be especially deceiving when the value came from the right hand set, because “*” always shows combined columns in the left hand part — even in the case of a RIGHT join.

Whether this is a problem or not depends on the situation.If it is, use the “a.*, b.*” approach shown above, with a and b the names or aliases of the two sources.Or better yet, avoid “*” altogether in your serious queries and qualify all column names in joined sets.This has the additional benefit that it forces you to think about which data you want to retrieve and where from.

It is your responsibility to make sure the column names in the USING list are of compatible types between the two sources.If the types are compatible but not equal, the engine converts them to the type with the broadest range of values before comparing the values.This will also be the data type of the merged column that shows up in the result set if “SELECT *” or the unqualified column name is used.Qualified columns on the other hand will always retain their original data type.

Tip

If, when joining by named columns, you are using a join column in the WHERE clause, always use the qualified column name, otherwise an index on this column will not be used.

SELECT 1 FROM t1 a JOIN t2 b USING (x) WHERE x = 0;

-- PLAN JOIN (A NATURAL , B INDEX (RDB$2))

However:

SELECT 1 FROM t1 a JOIN t2 b USING (x) WHERE a.x = 0;
-- PLAN JOIN (A INDEX (RDB$1), B INDEX (RDB$2))

SELECT 1 FROM t1 a JOIN t2 b USING (x) WHERE b.x = 0;
-- PLAN JOIN (A INDEX (RDB$1), B INDEX (RDB$2))

The fact is, the unspecified column in this case is implicitly replaced by `COALESCE(a.x, b.x).This trick is used to disambiguate column names, but it also interferes with the use of the index.