Connection (programming term)

honggarae 17/10/2021 1246

Inner join

An inner join is one of the most common join operations used in applications and is generally regarded as the default join type. It combines the columns of two tables (say A and B) according to a join predicate to produce a new result table. The query compares each row of A with each row of B and finds the pairs that satisfy the join predicate. When the predicate is satisfied, the matching rows of A and B are combined column-wise (side by side) into one row of the result set. Conceptually, the result can be defined as first taking the Cartesian product (cross join) of the two tables, pairing every row of A with every row of B, and then returning only the rows that satisfy the join predicate. In practice, SQL implementations avoid this wherever possible, because materializing the Cartesian product is very inefficient.

SQL defines two syntactic forms for expressing a join. The first is the explicit join notation, which uses the JOIN keyword; the second is the implicit join notation, which lists the tables to be joined in the FROM clause of the SELECT statement, separated by commas. The comma-separated list forms a cross join, and the WHERE clause may then apply filter predicates; those predicates are functionally equivalent to the join condition of the explicit notation. The SQL-89 standard supported only inner joins and cross joins, so the implicit notation was the only form available; the SQL-92 standard added outer joins and, with them, the JOIN expression.

Inner joins can be further divided into equi-joins, natural joins, and cross joins (see below).

Programs should pay particular attention to join columns that may contain NULL values: NULL does not match any value, not even another NULL, unless predicates such as IS NULL or IS NOT NULL are used explicitly in the join condition.
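The NULL behavior can be checked with a small script. This is a minimal sketch using Python's built-in sqlite3 module and an in-memory database; the tables a and b and the column k are hypothetical names chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (k INTEGER);
    CREATE TABLE b (k INTEGER);
    INSERT INTO a VALUES (1), (NULL);
    INSERT INTO b VALUES (1), (NULL);
""")

# Plain equality: NULL = NULL is not true, so the NULL rows never pair up.
plain = conn.execute(
    "SELECT a.k, b.k FROM a INNER JOIN b ON a.k = b.k"
).fetchall()

# Two NULLs are treated as a match only when IS NULL is written explicitly.
with_nulls = conn.execute(
    "SELECT a.k, b.k FROM a INNER JOIN b "
    "ON a.k = b.k OR (a.k IS NULL AND b.k IS NULL)"
).fetchall()

print(plain)
print(with_nulls)
```

The first query returns only the pair of rows whose key is 1; the second also returns the pair of NULL rows.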

For example, the following query joins the Employee and Department tables on their shared attribute DepartmentID. Where the DepartmentID values of the two tables match (that is, the join predicate is satisfied), the query combines the LastName, DepartmentID, DepartmentName, and other columns of the two tables into one row (one record) of the result table. Where the DepartmentID values do not match, no row is produced.

Explicit join example:

SELECT * FROM employee INNER JOIN department ON employee.DepartmentID = department.DepartmentID

Equivalent to:

SELECT * FROM employee, department WHERE employee.DepartmentID = department.DepartmentID

The output of the explicit inner join:

Employee.LastName	Employee.DepartmentID	Department.DepartmentName	Department.DepartmentID
Robinson	34	Secretary	34
Jones	33	Engineering Department	33
Smith	34	Secretary	34
Steinberg	33	Engineering Department	33
Rafferty	31	Sales Department	31

Note: neither the employee "Jasper" nor the department "Marketing" appears in the result. Neither has a matching record in the other table: "Jasper" has no associated department, and department 35 has no employees. As a result, the joined table contains no information about Jasper or the Marketing department. If this is not the intended result, the behavior can be a subtle bug; an outer join can be used to avoid it.
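The explicit and implicit forms can be run side by side to confirm that they return the same rows. The following is a minimal sketch using Python's built-in sqlite3 module; the table contents follow the article's sample data, and the short department names (for example "Sales" rather than "Sales Department") are an assumption made for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (DepartmentID INTEGER, DepartmentName TEXT);
    CREATE TABLE employee (LastName TEXT, DepartmentID INTEGER);
    INSERT INTO department VALUES
        (31, 'Sales'), (33, 'Engineering'), (34, 'Clerical'), (35, 'Marketing');
    INSERT INTO employee VALUES
        ('Rafferty', 31), ('Jones', 33), ('Steinberg', 33),
        ('Robinson', 34), ('Smith', 34), ('Jasper', NULL);
""")

# Explicit notation: JOIN ... ON.
explicit = conn.execute(
    "SELECT * FROM employee INNER JOIN department "
    "ON employee.DepartmentID = department.DepartmentID"
).fetchall()

# Implicit notation: comma-separated tables plus a WHERE predicate.
implicit = conn.execute(
    "SELECT * FROM employee, department "
    "WHERE employee.DepartmentID = department.DepartmentID"
).fetchall()

# Each row is (LastName, employee.DepartmentID,
#              department.DepartmentID, DepartmentName).
names = {row[0] for row in explicit}
departments = {row[3] for row in explicit}
print(sorted(names), sorted(departments))
```

Neither 'Jasper' nor 'Marketing' appears in the printed sets, which is the behavior the note describes.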

Equi-join

An equi-join (or equijoin) is a special case of the comparison-based join (θ-join) in which the join predicate uses only equality comparisons. A join that uses other comparison operators (such as <) is not an equi-join. The query shown earlier is an example of an equi-join:

SELECT * FROM employee INNER JOIN department ON employee.DepartmentID = department.DepartmentID

SQL provides an optional shorthand for equi-joins that uses the USING keyword (Feature ID F402):

SELECT * FROM employee JOIN department USING (DepartmentID)

The USING construct is more than syntactic sugar: the result of the query above differs from the result of the version with an explicit predicate. Specifically, each column listed in the USING clause appears only once in the joined table, and its name carries no table qualifier. In the example above, the joined table has a single column named DepartmentID rather than both employee.DepartmentID and department.DepartmentID.

The USING clause is supported by MySQL, Oracle, PostgreSQL, SQLite, DB2/400, and other products.
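The difference in result columns between ON and USING can be observed directly. This is a minimal sketch using Python's built-in sqlite3 module; column_names is a hypothetical helper written for this illustration, and the behavior shown is that of SQLite, one of the products listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (DepartmentID INTEGER, DepartmentName TEXT);
    CREATE TABLE employee (LastName TEXT, DepartmentID INTEGER);
    INSERT INTO department VALUES (31, 'Sales');
    INSERT INTO employee VALUES ('Rafferty', 31);
""")

def column_names(sql):
    """Return the result column names reported for a query."""
    return [col[0] for col in conn.execute(sql).description]

# Explicit predicate: both DepartmentID columns are kept.
on_columns = column_names(
    "SELECT * FROM employee INNER JOIN department "
    "ON employee.DepartmentID = department.DepartmentID"
)

# USING: the shared column is reported only once.
using_columns = column_names(
    "SELECT * FROM employee JOIN department USING (DepartmentID)"
)

print(on_columns)
print(using_columns)
```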

Natural join

A natural join is a further specialization of the equi-join. When two tables are joined with a natural join, all columns that have the same name in both tables are compared for equality; this comparison is implicit. In the result table, each pair of same-named columns appears only once.

The inner join query used above can be expressed as a natural join as follows:

SELECT * FROM employee NATURAL JOIN department

As with the USING clause, the DepartmentID column appears only once in the joined table and has no table name as a prefix:

DepartmentID	Employee.LastName	Department.DepartmentName
34	Smith	Secretary
33	Jones	Engineering Department
34	Robinson	Secretary
33	Steinberg	Engineering Department
31	Rafferty	Sales Department

In Oracle, when JOIN USING or NATURAL JOIN is used, qualifying a column shared by the two tables with a table name causes a compilation error: "ORA-25154: column part of USING clause cannot have qualifier" or "ORA-25155: column used in NATURAL join cannot have qualifier".

Cross join

A cross join, also known as a Cartesian join or cross product, is the basis of all types of inner join. If each table is regarded as a set of rows, the cross join returns the Cartesian product of the two sets. It is equivalent to an inner join whose join condition is always true, or one with no join condition at all.

If A and B are two sets, their cross join is written A × B.

The SQL for a cross join lists the table names in the FROM clause and contains no join predicate to filter the result.

Explicit cross join example:

SELECT * FROM employee CROSS JOIN department

Implicit cross join example:

SELECT * FROM employee, department;

Employee.LastName	Employee.DepartmentID	Department.DepartmentName	Department.DepartmentID
Rafferty	31	Sales Department	31
Jones	33	Sales Department	31
Steinberg	33	Sales Department	31
Smith	34	Sales Department	31
Robinson	34	Sales Department	31
Jasper	NULL	Sales Department	31
Rafferty	31	Engineering Department	33
Jones	33	Engineering Department	33
Steinberg	33	Engineering Department	33
Smith	34	Engineering Department	33
Robinson	34	Engineering Department	33
Jasper	NULL	Engineering Department	33
Rafferty	31	Secretary	34
Jones	33	Secretary	34
Steinberg	33	Secretary	34
Smith	34	Secretary	34
Robinson	34	Secretary	34
Jasper	NULL	Secretary	34
Rafferty	31	Marketing Department	35
Jones	33	Marketing Department	35
Steinberg	33	Marketing Department	35
Smith	34	Marketing Department	35
Robinson	34	Marketing Department	35
Jasper	NULL	Marketing Department	35

A cross join applies no predicate to filter the rows of the result table. Programmers can use a WHERE clause to filter the result set further.
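The relationship between a cross join and an inner join can be illustrated outside SQL as well. The following is a minimal sketch in plain Python, assuming rows are modeled as tuples and NULL as None; it uses itertools.product for the Cartesian product and is not how a database engine evaluates joins.

```python
from itertools import product

employee = [("Rafferty", 31), ("Jones", 33), ("Steinberg", 33),
            ("Robinson", 34), ("Smith", 34), ("Jasper", None)]
department = [(31, "Sales"), (33, "Engineering"),
              (34, "Clerical"), (35, "Marketing")]

# Cross join: every employee row paired with every department row.
cross = [e + d for e, d in product(employee, department)]

# Filtering the product with an equality predicate yields the inner join.
# None stands in for NULL and must never match, so it is excluded explicitly.
inner = [row for row in cross if row[1] is not None and row[1] == row[2]]

print(len(cross), len(inner))
```

With 6 employee rows and 4 department rows the product has 24 rows, matching the table above, and the filter keeps the 5 rows of the inner join.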

Outer join

An outer join does not require every record in the two joined tables to have a matching record in the other table. The table whose records are all kept (even those without a match) is called the preserved table. Depending on whether the rows of the left table, the right table, or both tables are preserved, outer joins are divided into left outer joins, right outer joins, and full outer joins.

(Here, "left" and "right" refer to the two sides of the JOIN keyword.)

Standard SQL has no implicit join notation for outer joins.

When an outer join has both an ON clause and a WHERE clause, only the join condition between the tables should be written in the ON clause; conditions that filter the data must be written in the WHERE clause. For inner joins, a condition can be placed in either clause with the same effect. The difference arises because, in an outer join, rows of the preserved table that are filtered out by the ON clause are added back to the result; only after that step is the WHERE clause applied to filter the rows of the joined result.
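The effect of placing a filter in ON versus WHERE can be seen by running the same condition both ways. This is a minimal sketch using Python's built-in sqlite3 module and the article's sample data; the filter DepartmentName = 'Sales' and the aliases e and d are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (DepartmentID INTEGER, DepartmentName TEXT);
    CREATE TABLE employee (LastName TEXT, DepartmentID INTEGER);
    INSERT INTO department VALUES
        (31, 'Sales'), (33, 'Engineering'), (34, 'Clerical'), (35, 'Marketing');
    INSERT INTO employee VALUES
        ('Rafferty', 31), ('Jones', 33), ('Steinberg', 33),
        ('Robinson', 34), ('Smith', 34), ('Jasper', NULL);
""")

# Filter inside ON: employees that fail it are added back with NULLs.
filter_in_on = conn.execute(
    "SELECT e.LastName, d.DepartmentName "
    "FROM employee e LEFT OUTER JOIN department d "
    "ON e.DepartmentID = d.DepartmentID AND d.DepartmentName = 'Sales'"
).fetchall()

# Filter inside WHERE: applied after the join, so those rows are removed.
filter_in_where = conn.execute(
    "SELECT e.LastName, d.DepartmentName "
    "FROM employee e LEFT OUTER JOIN department d "
    "ON e.DepartmentID = d.DepartmentID "
    "WHERE d.DepartmentName = 'Sales'"
).fetchall()

print(len(filter_in_on), len(filter_in_where))
```

The ON version still returns one row per employee, most with a NULL department name, while the WHERE version returns only the row for Rafferty.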

Left outer join

A left outer join is also called a left join. If tables A and B are joined with a left outer join, the result contains all records of the "left" table (A), even those that have no match in the "right" table (B) under the join condition. That is, even when the ON clause finds zero matching rows in B, the join still returns a row for the record in A, with NULL in every column that comes from B. A left outer join therefore returns all records of the left table combined with the matching records of the right table. If a row of the left table has several matching rows in the right table, that row is repeated once for each match in the join result.

For example, this lets us look up each employee's department while still listing every employee, including employees with no associated department. (The inner join section above shows the opposite case, where an employee without a department is missing from the result.)

Left outer join example (relative to the inner join, the added row is the one for Jasper):

SELECT * FROM employee LEFT OUTER JOIN department ON employee.DepartmentID = department.DepartmentID

Employee.LastName	Employee.DepartmentID	Department.DepartmentName	Department.DepartmentID
Jones	33	Engineering	33
Rafferty	31	Sales	31
Robinson	34	Clerical	34
Smith	34	Clerical	34
Jasper	NULL	NULL	NULL
Steinberg	33	Engineering	33

Right outer join

A right outer join, also called a right join, is exactly analogous to the left outer join, except that the roles of the two tables are reversed. If table A is right-joined to table B, every row of the "right" table B appears at least once in the joined table. If a record of B has no matching row in the "left" table A, the columns that come from A are set to NULL in the joined table.

The right join returns all rows of the right table together with the matching rows of the left table (where there is no match, the columns from the left table are set to NULL).

For example, this lets us look up each employee and their department while still listing departments that have no employees.

Right outer join example (relative to the inner join, the added row is the one for Marketing):

SELECT * FROM employee RIGHT OUTER JOIN department ON employee.DepartmentID = department.DepartmentID

Employee.LastName	Employee.DepartmentID	Department.DepartmentName	Department.DepartmentID
Smith	34	Clerical	34
Jones	33	Engineering	33
Robinson	34	Clerical	34
Steinberg	33	Engineering	33
Rafferty	31	Sales	31
NULL	NULL	Marketing	35

In practice, explicit right outer joins are rarely used, because they can always be replaced by a left outer join with the tables swapped, and a right join offers nothing that a left join does not. The rows of the table above can also be obtained with a left join:

SELECT * FROM department LEFT OUTER JOIN employee ON employee.DepartmentID = department.DepartmentID

Full outer join

A full outer join is the union of the left and right outer joins. The joined table contains all records of both tables; where a record has no match, the missing columns are filled with NULL.

For example, this lets us see every employee who is in a department and every department that has employees, and also the employees who are in no department and the departments that have no employees.

Full outer join example:

SELECT * FROM employee FULL OUTER JOIN department ON employee.DepartmentID = department.DepartmentID

Employee.LastName	Employee.DepartmentID	Department.DepartmentName	Department.DepartmentID
Smith	34	Clerical	34
Jones	33	Engineering	33
Robinson	34	Clerical	34
Jasper	NULL	NULL	NULL
Steinberg	33	Engineering	33
Rafferty	31	Sales	31
NULL	NULL	Marketing	35

Some database systems (such as MySQL) do not support full outer joins directly, but one can be emulated as the union of a left and a right outer join (see UNION). The equivalent of the example above is:

SELECT * FROM employee LEFT JOIN department ON employee.DepartmentID = department.DepartmentID
UNION
SELECT * FROM employee RIGHT JOIN department ON employee.DepartmentID = department.DepartmentID WHERE employee.DepartmentID IS NULL

SQLite (before version 3.39.0) does not support right outer joins; there, a full outer join can be emulated as follows:

SELECT employee.*, department.* FROM employee LEFT JOIN department ON employee.DepartmentID = department.DepartmentID
UNION ALL
SELECT employee.*, department.* FROM department LEFT JOIN employee ON employee.DepartmentID = department.DepartmentID WHERE employee.DepartmentID IS NULL

Self-join

A self-join joins a table to itself.

Example

Suppose we want to build a query that finds records like this: each record contains two employees who are from the same country. If there were two separate employee tables (Employee), an ordinary equi-join would produce the result by matching employees in the first table with employees in the second who are in the same country. Here, however, all the employee information is in a single large table.

Consider the following modified Employee table:

Employee table (Employee)
EmployeeID	LastName	Country	DepartmentID
123	Rafferty	Australia	31
124	Jones	Australia	33
145	Steinberg	Australia	33
201	Robinson	United States	34
305	Smith	United Kingdom	34
306	Jasper	United Kingdom	NULL

The example query can be written as follows:

SELECT F.EmployeeID, F.LastName, S.EmployeeID, S.LastName, F.Country FROM Employee F, Employee S WHERE F.Country = S.Country AND F.EmployeeID < S.EmployeeID

When executed, it produces the following table:

Employee table (Employee) self-joined on Country
EmployeeID	LastName	EmployeeID	LastName	Country
123	Rafferty	124	Jones	Australia
123	Rafferty	145	Steinberg	Australia
124	Jones	145	Steinberg	Australia
305	Smith	306	Jasper	United Kingdom

About this example, please note:

F and S are aliases for the first and second copies of the Employee table.
The condition F.Country = S.Country excludes pairs of employees from different countries; the example asks only for pairs of employees from the same country.
The condition F.EmployeeID < S.EmployeeID excludes pairs in which both EmployeeID values are the same.
F.EmployeeID < S.EmployeeID also excludes duplicate pairs. Without this condition, the query would produce useless rows like those in the following table (only the United Kingdom rows are shown):

EmployeeID	LastName	EmployeeID	LastName	Country
305	Smith	305	Smith	United Kingdom
305	Smith	306	Jasper	United Kingdom
306	Jasper	305	Smith	United Kingdom
306	Jasper	306	Jasper	United Kingdom

Only one of the two middle rows is needed to answer the original question, since they describe the same pair in opposite order. The first and last rows pair an employee with themselves and are useless for this example.
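The self-join can be run against the sample Employee table to see the effect of the F.EmployeeID < S.EmployeeID condition. This is a minimal sketch using Python's built-in sqlite3 module; the ORDER BY clause is an addition made here so that the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (
        EmployeeID INTEGER, LastName TEXT, Country TEXT, DepartmentID INTEGER
    );
    INSERT INTO Employee VALUES
        (123, 'Rafferty', 'Australia', 31),
        (124, 'Jones', 'Australia', 33),
        (145, 'Steinberg', 'Australia', 33),
        (201, 'Robinson', 'United States', 34),
        (305, 'Smith', 'United Kingdom', 34),
        (306, 'Jasper', 'United Kingdom', NULL);
""")

# Same-country pairs, each pair listed once.
pairs = conn.execute(
    "SELECT F.EmployeeID, F.LastName, S.EmployeeID, S.LastName, F.Country "
    "FROM Employee F, Employee S "
    "WHERE F.Country = S.Country AND F.EmployeeID < S.EmployeeID "
    "ORDER BY F.EmployeeID, S.EmployeeID"
).fetchall()

# Without the ID condition, self-pairs and mirrored pairs are included too.
unfiltered = conn.execute(
    "SELECT F.EmployeeID, S.EmployeeID "
    "FROM Employee F, Employee S WHERE F.Country = S.Country"
).fetchall()

for row in pairs:
    print(row)
print(len(unfiltered))
```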

Alternatives

The result of an outer join query can also be obtained with a correlated subquery. For example,

SELECT employee.LastName, employee.DepartmentID, department.DepartmentName FROM employee LEFT OUTER JOIN department ON employee.DepartmentID=department.DepartmentID

It can also be written as follows:

SELECT employee.LastName, employee.DepartmentID, (SELECT department.DepartmentName FROM department WHERE employee.DepartmentID = department.DepartmentID) FROM employee
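The two formulations can be compared by running both on the sample data. This is a minimal sketch using Python's built-in sqlite3 module; it assumes each DepartmentID occurs at most once in department, since otherwise the scalar subquery would not correspond to the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (DepartmentID INTEGER, DepartmentName TEXT);
    CREATE TABLE employee (LastName TEXT, DepartmentID INTEGER);
    INSERT INTO department VALUES
        (31, 'Sales'), (33, 'Engineering'), (34, 'Clerical'), (35, 'Marketing');
    INSERT INTO employee VALUES
        ('Rafferty', 31), ('Jones', 33), ('Steinberg', 33),
        ('Robinson', 34), ('Smith', 34), ('Jasper', NULL);
""")

# Left outer join formulation.
join_rows = conn.execute(
    "SELECT employee.LastName, employee.DepartmentID, department.DepartmentName "
    "FROM employee LEFT OUTER JOIN department "
    "ON employee.DepartmentID = department.DepartmentID"
).fetchall()

# Correlated subquery formulation: the subquery runs once per employee row
# and yields NULL when no department matches.
subquery_rows = conn.execute(
    "SELECT employee.LastName, employee.DepartmentID, "
    "(SELECT department.DepartmentName FROM department "
    " WHERE employee.DepartmentID = department.DepartmentID) "
    "FROM employee"
).fetchall()

print(set(join_rows) == set(subquery_rows))
```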

Join algorithm

To perform a join operation, there are three basic algorithms.

Nested loop join (LOOP JOIN)

This works like a doubly nested loop in C. The table scanned row by row in the outer loop is called the outer input; for each of its rows, the other table, called the inner input, is scanned for matches (the inner loop). It is suitable when the outer input has few rows and the inner input has an index on the join column.
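The double loop can be sketched in a few lines. This is a simplified illustration in Python, assuming rows are tuples and NULL is modeled as None; nested_loop_join is a hypothetical helper, and the inner input is scanned in full here rather than looked up through an index as a database engine would.

```python
def nested_loop_join(outer, inner, predicate):
    """Join two row lists by testing every (outer, inner) pair."""
    result = []
    for outer_row in outer:          # outer loop: the outer input
        for inner_row in inner:      # inner loop: the inner input
            if predicate(outer_row, inner_row):
                result.append(outer_row + inner_row)
    return result

employee = [("Rafferty", 31), ("Jones", 33), ("Steinberg", 33),
            ("Robinson", 34), ("Smith", 34), ("Jasper", None)]
department = [(31, "Sales"), (33, "Engineering"),
              (34, "Clerical"), (35, "Marketing")]

# Equi-join predicate on DepartmentID; None (NULL) never matches.
joined = nested_loop_join(
    employee, department,
    lambda e, d: e[1] is not None and e[1] == d[0],
)
print(joined)
```

Because the predicate is an arbitrary function, the same loop also handles non-equality conditions, at the cost of examining every pair of rows.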

Merge join (MERGE JOIN)

This works like merging two sorted arrays. Both inputs must be sorted on the join column; the two tables are then read in order, and rows are either joined or discarded as the scan advances. If indexes that provide the sort order already exist, the cost of a merge join is linear in the number of rows.
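The merge step can be sketched as follows. This is a simplified illustration in Python, assuming both inputs are already sorted on the join key and contain no NULL keys; merge_join is a hypothetical helper, not a database API.

```python
def merge_join(left, right, left_key, right_key):
    """Equi-join two row lists that are already sorted on their join keys."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left_key(left[i]), right_key(right[j])
        if lk < rk:
            i += 1                   # left row has no partner: discard it
        elif lk > rk:
            j += 1                   # right row has no partner: discard it
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right_key(right[j_end]) == lk:
                j_end += 1
            # Pair every left row with this key with the whole right run.
            while i < len(left) and left_key(left[i]) == lk:
                for right_row in right[j:j_end]:
                    result.append(left[i] + right_row)
                i += 1
            j = j_end
    return result

employee = [("Rafferty", 31), ("Jones", 33), ("Steinberg", 33),
            ("Robinson", 34), ("Smith", 34), ("Jasper", None)]
department = [(31, "Sales"), (33, "Engineering"),
              (34, "Clerical"), (35, "Marketing")]

# Rows with a NULL key can never match, so they are dropped before sorting.
left_sorted = sorted((e for e in employee if e[1] is not None),
                     key=lambda e: e[1])
right_sorted = sorted(department, key=lambda d: d[0])

merged = merge_join(left_sorted, right_sorted,
                    lambda e: e[1], lambda d: d[0])
print(merged)
```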

Hash join (HASH JOIN)

This is suitable for joining intermediate query results, which are usually temporary tables without indexes, particularly when they contain many rows. A hash join chooses the input with fewer rows as the build input, applies a hash function to the values of its join column, and places its rows (or their storage locations) into hash buckets. The other input, the probe input, is then scanned, and each of its rows is hashed on the join column to find matching rows in the corresponding bucket.
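The build and probe phases can be sketched as follows. This is a simplified in-memory illustration in Python, assuming rows are tuples and NULL is modeled as None; hash_join is a hypothetical helper, and a Python dictionary plays the role of the hash buckets.

```python
from collections import defaultdict

def hash_join(build, probe, build_key, probe_key):
    """Equi-join two row lists using a hash table on the build input."""
    # Build phase: bucket the rows of the (smaller) build input by join key.
    buckets = defaultdict(list)
    for build_row in build:
        key = build_key(build_row)
        if key is not None:          # NULL keys can never match
            buckets[key].append(build_row)

    # Probe phase: look up each probe row's key in the buckets.
    result = []
    for probe_row in probe:
        key = probe_key(probe_row)
        if key is None:
            continue
        for build_row in buckets.get(key, []):
            result.append(probe_row + build_row)
    return result

employee = [("Rafferty", 31), ("Jones", 33), ("Steinberg", 33),
            ("Robinson", 34), ("Smith", 34), ("Jasper", None)]
department = [(31, "Sales"), (33, "Engineering"),
              (34, "Clerical"), (35, "Marketing")]

# department has fewer rows, so it serves as the build input.
hashed = hash_join(department, employee,
                   build_key=lambda d: d[0], probe_key=lambda e: e[1])
print(hashed)
```

Each input is scanned once, which is why no index or prior sort order is needed.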
