SQL Interview Questions
These questions deal with SQL Server and T-SQL operations, specifically.
- What is SQL?
SQL is an acronym for Structured Query Language.
- What is Normalization, and explain the different levels.
Normalization is a method for reducing redundancy in a database. The idea is that data is grouped together in such a manner to enhance a given performance goal.
1NF- Removes any duplicated attributes.
2NF - Should be 1NF and each non-key should be dependent on the primary key.
3NF - Should be 2NF and all non-key attributes that are not dependent on the primary key should be removed.
- What is Denormalization and when should it be applied?
Denormalization is the reverse of normalization. It increases query performance by reducing the number of joins by consolidating data into fewer tables.
- What is the difference between an Heap table and a Clustered Table?
HEAP table: A table in which the data is not stored in any particular order. There's no clustered index.
CLUSTERED Table: A table with a predefined clustered index on at least one column defining the storing order of the rows within the data pages, based on a clustered index key.
- How many clustered indexes can be created on a table?
One only. This because a table can be sorted in the table itself only once.
- What is the difference between a clustered and an non-clustered index?
A table can have multiple non-clustered indexes, but only one clustered index.
- What are the database relationship types?
One-To-One - For each instance in the first entity there is one and only one matching instance in the second entity, and vice versa.|
One-To-Many - For each instance in the first entity, there can be more than one matching instance in the second entity. The second entity will have one and only one matching instance in in the first entity.
Many-To_Many - For each instance in the first entity, there can be one or more matching instances in the second entity, and vice versa.
- What is the difference between a Primary and a Unique Key?
A Primary Key denotes a default clustered index, and does not allow nulls.
A Unique Key denotes a non-clustered index, and allows one null.
- What is a Query Plan?
A query plan is a physical breakdown of the code passed to the SQL Server optimizer.
- What are Candidate, Alternate and Composite Keys?
Candidate Key - a key that can uniquely identify a row in a table
Alternate Key - If the table has more than one candidate keys and one becomes a primary key, the rest become alternate keys.
Composite Key - More than one key to uniquely define a table row.
- What is a Transaction?
A Transaction is a work wrapper that creates a reversible work instance as long as the transaction has not been committed. Transactions conform to the ACID principle.
- What is ACID?
ACID is the acronym for Atomicy, Consistency, Isolation and Durability:
Atomicy: each transaction is to be 'all or nothing'. If one part of the transaction fails, the entire transaction fails.
Consistency: Any transaction will bring the database from one valid state to another. Any data written to the database must be valid according to the defined rules and structures of that database.
Isolation: Ensure that the concurrent execution of transactions would be the same as if they were entered serially. Transactions are independent of each other.
Durability: Once a transaction has been committed, it will remain committed, even after a system failure, environmental / hardware crashes, or errors.
- What are the primary steps in data modeling?
Logical - Planning, Analysis and Design
Physical - Design,Implementation and Maintenance
- What is a cursor?
A cursor is a mechanism that allows the developer to go throw a data result set row by row.
- What is a SPID?
A SPID is a Service Process ID. It is simply a SQL Server 'session' created anytime an application connects to SQL Server.
- What are the different types of cursors?
Static cursors: populates the result set at the time of cursor creation and the query result is cached. Can move forward and backward (scrollable). Most memory intensive of the cursor types.
No UPDATE, INSERT, or DELETE operations are reflected in a static cursor (unless the cursor is closed and reopened).
SQL Server static cursors are always read-only.
Dynamic Cursors: Allows the developer to see updates, deletes and inserts while the cursor is open. So the cursor will be sensitive to data changes.
Forward-only Cursors:The fastest of all the cursors. It does not allow for backward scrolling like static and dynamic cursors will.
- What is a SQL Server Deadlock?
A deadlock is when 2 SPIDs have data locked, and neither process can release their locks until one of the SPIDs releases.
- What does the COALESCE() function do?
The COALESCE function returns the first non-null value from a set of values provided to it. It acts in the same capacity as a CASE statement.
* Fun note - in order to function properly, the values submitted to the function must be of the same type (much like in a CASE function.)
- Why should you avoid using cursors in general?
A Cursor impacts the performance of the SQL Server since it uses the SQL Server instances' memory, reduce concurrency, decrease network bandwidth and lock resources.
- What is an execution plan, and how do you use it?
An execution plan is the graphical road map of the SQL Server process. It gives the developer an opportunity to analyze the performance characteristics of the query at hand.
- What is NOLOCK?
NOLOCK allows SQL to read from data tables that would normally be blocked by another query or process. It can improve performance, but can result in dirty reads.
This is only applicable with SELECT statements, as CRUD processes will result in an error.
- What is a 'Dirty Read'?
A dirty read is a query that has run against a table where an active transaction is set. This means that the data returned in the query could be 'out of date' from the process holding the transaction open.
- What is a Transaction Isolation Level?
This is how SQL Server controls the locking and row versioning of Transactions.
READ COMMITTED: Stipulates that statements cannot read data that has been modified but not committed by other transactions.
READ UNCOMMITTED: Stipulates that statements can read rows that have been modified, even if the row hasn't been committed by other transactions.
REPEATABLE READ: Specifies that statements cannot read data that has been modified but not yet committed by other transactions, and no other transactions can modify data that has been read by the current transaction until the current transaction completes.
SNAPSHOT: Specifies that data read by any statement in a transaction will be the transactionally consistent version of the data that existed at the start of the transaction.
SERIALIZABLE: Statements cannot read uncommitted data, no other transactions can modify the data until the current transaction completes, and other transactions cannot insert new rows with key values within the range of keys managed by the current transaction.
- What are the different types of locks available in SQL Server?
Shared: used for operations that don't change data (like SELECT statements)
Update: used when updating resources
Exclusive: Used for CRUD processes
Schema: Used when an operation dependent on the schema is a table is executing.
Bulk Update: Used when bulk copying data into a table and the TABLOCK hint is specified.
Intent (shared, update, exclusive): Used to establish a lock hierarchy.
Key Range (shared, insert, exclusive): Protects a defined range of rows read by a query when using the SERIALIZABLE transaction isolation level.
- What is Lock Isolation?
It's the process of of converting may fine-grained locks into fewer, coarse-grained locks.
- Define Index Depth, Density and Sensitivity.
Index Depth: the number of levels from the index root node to the leaf nodes. If the depth is too deep, there will be performance degradation.
Index Density: the measure if the lack of uniqueness of the data in a table. A dense column is one that has a high number of duplicates.
Index selectivity: A measure of how many rows scanned compared to the total number of rows. High selectivity means there is a small number of rows scanned when related to total number of rows.
- What is the difference between an OLAP and a OLTP database?
OLAP- Online Analytic Processing: Databases are used for transactional systems. Most of the submitted queries are for data modification.
OLTP- Online Transaction Processing: Databases used for data warehousing systems. Most queries are data retrievals that filter, group, aggregate and join larger datasets very quickly.
- Name some tools / methods for monitoring SQL performance.
Third Party Applications
- How do you collect IO and time statistics for queries?
SET STATISTICS IO on: provides a textual count of table reads.
SET STATISTICS TIME on: provides compile and execution times.