Most efficient way to SELECT rows WHERE the ID EXISTS IN a second table

Summary:

I ran each query 10 times against the following test data sets.

  1. A very large subquery result set (100000 rows)
  2. Duplicate rows
  3. Null rows

For all of the scenarios above, IN and EXISTS performed identically.
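
As a small illustration of why duplicate and NULL rows in the subquery do not change the result of either form, here is a minimal sketch. The table variables and sample values are made up for this example and are not the test data used above.

```sql
-- Hypothetical sample data, not the Performance V3 tables.
DECLARE @Customers TABLE (custid INT NOT NULL);
DECLARE @Orders    TABLE (custid INT NULL);

INSERT INTO @Customers (custid) VALUES (1), (2), (3);
-- Customer 1 appears twice (duplicate rows) and one order has a NULL custid.
INSERT INTO @Orders (custid) VALUES (1), (1), (NULL);

-- EXISTS: stops caring about further matches once one is found.
SELECT c.custid
FROM @Customers c
WHERE EXISTS (SELECT 1 FROM @Orders o WHERE o.custid = c.custid);

-- IN: duplicates in the list are irrelevant, and a NULL never compares equal.
SELECT c.custid
FROM @Customers c
WHERE c.custid IN (SELECT o.custid FROM @Orders o);

-- Both queries return a single row: custid = 1.
```

Both forms express a semi join, which is why the optimizer can treat them the same way.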

Some information about the Performance V3 database used for testing: 20000 customers with 1000000 orders, so each customer is duplicated a random number of times (between 10 and 100) in the orders table.

Execution cost, time:
Below is a screenshot of both queries running. Note the relative cost of each query.

Memory cost:
The memory grant for the two queries is also the same. I forced MAXDOP 1 so that they would not spill to TEMPDB.
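
For reference, a minimal sketch of how a serial plan can be forced for a single statement with a query hint. The query is the EXISTS form that appears later in this answer; whether the hint was applied exactly this way in the original test is an assumption.

```sql
-- OPTION (MAXDOP 1) limits this one statement to a single scheduler (no parallelism).
SELECT *
FROM Customers c
WHERE EXISTS (SELECT 1 FROM Orders o WHERE o.custid = c.custid)
OPTION (MAXDOP 1);
```

The hint only affects the statement it is attached to, so the server-wide setting stays untouched.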

CPU time, reads:

For EXISTS:

Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Customers'. Scan count 1, logical reads 109, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Orders'. Scan count 1, logical reads 3855, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 469 ms,  elapsed time = 595 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

For IN:

(20000 row(s) affected)
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Customers'. Scan count 1, logical reads 109, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Orders'. Scan count 1, logical reads 3855, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 547 ms,  elapsed time = 669 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.
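
Output of this kind comes from the session options SET STATISTICS IO and SET STATISTICS TIME. A minimal sketch of how to reproduce the comparison, using the EXISTS and IN queries as they appear later in this answer:

```sql
-- Report logical/physical reads per table and CPU/elapsed time per statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT *
FROM Customers c
WHERE EXISTS (SELECT 1 FROM Orders o WHERE o.custid = c.custid);

SELECT *
FROM Customers c
WHERE c.custid IN (SELECT o.custid FROM Orders o);

-- Switch the options off again for the rest of the session.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

In SSMS the figures show up on the Messages tab. Running each query several times helps separate cold-cache physical reads from the steady-state numbers.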

In each case, the optimizer is smart enough to rewrite the queries.

I tend to use EXISTS only, though (my opinion). One use case for EXISTS is when you do not want to return a result set from the second table.

Update as per queries from Martin Smith:

I ran the queries below to find the most efficient way to get rows from the first table for which a reference exists in the second table.

SELECT DISTINCT c.*
FROM Customers c
JOIN Orders o ON o.custid = c.custid   

SELECT c.*
FROM Customers c
INNER JOIN (SELECT DISTINCT custid FROM Orders) AS o ON o.custid = c.custid

SELECT *
FROM Customers C
WHERE EXISTS(SELECT 1 FROM Orders o WHERE o.custid = c.custid)

SELECT *
FROM Customers c
WHERE custid IN (SELECT custid FROM Orders)

All of the queries above share the same cost, with the exception of the second one (the INNER JOIN against the DISTINCT subquery); the plan is the same for the rest.

Memory grant:
This query

SELECT DISTINCT c.*
FROM Customers c
JOIN Orders o ON o.custid = c.custid 

required a memory grant of

This query

SELECT c.*
FROM Customers c
INNER JOIN (SELECT DISTINCT custid FROM Orders) AS o ON o.custid = c.custid 

required a memory grant of ..
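
Besides reading the grant off the actual execution plan, one way to look at memory grants is the sys.dm_exec_query_memory_grants DMV. This is only a sketch and not necessarily how the figures in this answer were obtained. It has to be run from a second session while the query under test is still executing, and it needs VIEW SERVER STATE permission.

```sql
-- Lists queries that currently hold, or are waiting for, a memory grant.
SELECT mg.session_id,
       mg.requested_memory_kb,
       mg.granted_memory_kb,
       mg.max_used_memory_kb,
       t.text AS query_text
FROM sys.dm_exec_query_memory_grants AS mg
CROSS APPLY sys.dm_exec_sql_text(mg.sql_handle) AS t;
```

Queries that finish in well under a second, like the ones here, can be hard to catch this way, so the execution plan is usually the easier place to look.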

CPU time, reads:
For the query:

SELECT DISTINCT c.*
FROM Customers c
JOIN Orders o ON o.custid = c.custid   

(20000 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 48, logical reads 1344, physical reads 96, read-ahead reads 1248, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Orders'. Scan count 5, logical reads 3929, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Customers'. Scan count 5, logical reads 322, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 1453 ms,  elapsed time = 781 ms.

For the query:

SELECT c.*
FROM Customers c
INNER JOIN (SELECT DISTINCT custid FROM Orders) AS o ON o.custid = c.custid

(20000 row(s) affected)
Table 'Customers'. Scan count 5, logical reads 322, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Orders'. Scan count 5, logical reads 3929, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 1499 ms,  elapsed time = 403 ms.