How to tune SQL Queries for better performance?
Performance tuning of SQL
Queries:
Note: One of our visitors, Amit
asked this question by posting a comment. Thanks Amit for your contribution.
Keep visiting/posting!
Performance of the SQL queries of
an application often play a big role in the overall performance of the
underlying application. The response time may at times be really irritating for
the end users if the application doesn't have fine-tuned SQL queries. There
are several ways of tuning SQL statements, few of which are:-
Understanding of the Data,
Business, and Application - it's almost impossible to fine-tune the SQL
statements without having a proper understanding of the data managed by the
application and the business handled by the application. The understanding of
the application is of course of utmost importance. By knowing these things
better, we may identify several instances where the data retrieval/modification
by many SQL queries can simply be avoided as the same data might be available
somewhere else, may be in the session of some other integrating application, and
we can simply use that data in such cases. The better understanding will help
you identify the queries which could be written better either by changing the
tables involved or by establishing relationships among available
tables.
Using realistic test data -
if the application is not being tested in the development/testing environments
with the volume and type of data, which the application will eventually face in
the production environment, then we can't be very sure about how the SQL queries
of the application will really perform in actual business scenarios. Therefore,
it's important to have the realistic data for development/testing purposes as
well.
Using Bind Variables, Stored
Procs, and Packages - Using identical SQL statements (of course wherever
applicable) will greatly improve the performance as the parsing step will get
eliminated in such cases. So, we should use bind variables, stored procedures,
and packages wherever possible to re-use the same parsed SQL
statements.
Using the indexes carefully
- Having indexes on columns is the most common method of enhancing performance,
but having too many of them may degrade the performance as well. So, it's very
critical to decide wisely about which all columns of a table we should create
indexes on. Few common guidelines are:- creating indexes on the columns which
are frequently used either in WHERE clause or to join tables, avoid creating
indexes on columns which are used only by functions or operators, avoid creating
indexes on the columns which are required to changed quite frequently,
etc.
Making available the access
path - the optimizer will not use an access path that uses an index only
because we have created that index. We need to explicitly make that access path
available to the optimizer. We may use SQL hints to do that.
Using EXPLAIN PLAN and
TKPROF - these tools can be used to fine tune SQL queries to a great extent.
EXPLAIN PLAN explains the complete access path which will be used by the
particular SQL statement during execution and the second tool TKPROF displays
the actual performance statistics. Both these tools in combination can be really
useful to see, change, and in turn fine-tune the SQL statements.
Optimizing the WHERE clause
- there are many cases where index access path of a column of the WHERE clause
is not used even if the index on that column has already been created. Avoid
such cases to make best use of the indexes, which will ultimately improve the
performance. Some of these cases are: COLUMN_NAME IS NOT NULL (ROWID for a null
is not stored by an index), COLUMN_NAME NOT IN (value1, value2, value3, ...),
COLUMN_NAME != expression, COLUMN_NAME LIKE'%pattern' (whereas COLUMN_NAME LIKE
'pattern%' uses the index access path), etc. Usage of expressions or functions
on indexed columns will prevent the index access path to be used. So, use them
wisely!
Using WHERE instead of
HAVING - usage of WHERE clause may take advantage of the index defined on
the column(s) used in the WHERE clause.
Using the leading index columns
in WHERE clause - the WHERE clause may use the complex index access path in
case we specify the leading index column(s) of a complex index otherwise the
WHERE clause won't use the indexed access path.
Indexed Scan vs Full Table
Scan - Indexed scan is faster only if we are selcting only a few rows of a
table otherwise full table scan should be preferred. It's estimated that an
indexed scan is slower than a full table scan if the SQL statement is selecting
more than 15% of the rows of the table. So, in all such cases use the SQL hints
to force full table scan and suppress the use of pre-defined indexes. Okay...
any guesses why full table scan is faster when a large percentage of rows are
accessed? Because an indexed scan causes multiple reads per row accessed whereas
a full table scan can read all rows contained in a block in a single logical
read operation.
Using ORDER BY for an indexed
scan - the optimizer uses the indexed scan if the column specified in the
ORDER BY clause has an index defined on it. It'll use indexed scan even if the
WHERE doesn't contain that column (or even if the WHERE clause itself is
missing). So, analyze if you really want an indexed scan or a full table scan
and if the latter is preferred in a particular scenario then use 'FULL' SQL hint
to force the full table scan.
Minimizing table passes -
it normally results in a better performance for obvious reasons.
Joining tables in the proper
order - the order in which tables are joined normally affects the number of
rows processed by that JOIN operation and hence proper ordering of tables in a
JOIN operation may result in the processing of fewer rows, which will in turn
improve the performance. The key to decide the proper order is to have the most
restrictive filtering condition in the early phases of a multiple table JOIN.
For example, in case we are using a master table and a details table then it's
better to connect to the master table first to connecting to the details table
first may result in more number of rows getting joined.
Simple is usually faster -
yeah... instead of writing a very complex SQL statement, if we break it into
multiple simple SQL statements then the chances are quite high that the
performance will improve. Make use of the EXPLAIN PLAN and TKPROF tools to
analyze both the conditions and stick to the complex SQL only if you're very
sure about its performance.
Using ROWID and ROWNUM wherever
possible - these special columns can be used to improve the performance of
many SQL queries. The ROWID search is the fastest for Oracle database and this
luxury must be enjoyed wherever possible. ROWNUM comes really handy in the cases
where we want to limit the number of rows returned.
Usage of explicit cursors is
better - explicit cursors perform better as the implicit cursors result in
an extra fetch operation. Implicit cursosrs are opened the Oracle Server for
INSERT, UPDATE, DELETE, and SELECT statements whereas the explicit cursors are
opened by the writers of the query by explicitly using DECLARE, OPEN, FETCH, and
CLOSE statements.
Reducing network traffic -
Arrays and PL/SQL blocks can be used effectively to reduce the network traffic
especially in the scenarios where a huge amount of data requires processing. For
example, a single INSERT statement can insert thousands of rows if arrays are
used. This will obviously result into fewer DB passes and it'll in turn improve
performance by reducing the network traffic. Similarly, if we can club multiple
SQL statements in a single PL/SQL block then the entire block can be sent to
Oracle Server involving a single network communication only, which will
eventually improve performance by reducing the network traffic.
Using Oracle parallel query
option - Since Oracle 8, even the queries based on indexed range scans can
use this parallel query option if the index is partitioned. This feature can
result in an improved performance in certain scenarios.
No comments:
Post a Comment