SQL (Structural / Advanced) MCQs — GATE level

Each question: 4 options (A–D). Correct answer shown with a short, clear solution.

1

Table Orders(Oid, CustId, Price, Qty); find total revenue per customer, exclude customers whose total revenue < 1000:
SELECT CustId FROM Orders GROUP BY CustId HAVING SUM(Price*Qty) >= 1000;
Which is true?
A. SQL correct; returns customers with revenue ≥ 1000
B. Need WHERE instead of HAVING
C. Need SUM(Price)+SUM(Qty)
D. Use COUNT instead of SUM
✅ Answer: A
Solution: HAVING filters groups after aggregation; SUM(Price*Qty) computes revenue per row aggregated correctly.

2

Employees(Eid, Salary, Dept); give median salary per Dept using standard SQL (no MEDIAN). Which approach is correct?
A. Use PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Salary) OVER (PARTITION BY Dept)
B. AVG(Salary) with GROUP BY Dept
C. MAX(Salary) with GROUP BY Dept
D. SUM(Salary)/COUNT(*) with GROUP BY Dept
✅ Answer: A
Solution: PERCENTILE_CONT in ANSI/DB-specific window functions computes median per partition.

3

R(A,B) contains duplicates. Which expression removes duplicates and returns A values present in every distinct B in table S(B)?
A. SELECT DISTINCT A FROM R WHERE B IN (SELECT B FROM S)
B. Use R ÷ S logic: SELECT A FROM R GROUP BY A HAVING COUNT(DISTINCT B) = (SELECT COUNT(DISTINCT B) FROM S)
C. SELECT A FROM R INTERSECT S
D. SELECT DISTINCT A FROM R
✅ Answer: B
Solution: Division-like query uses count distinct equals count of S values.

4

You want top-3 salaries per Dept in Oracle SQL. Which is correct?
A. SELECT * FROM (SELECT Eid, Salary, Dept, ROW_NUMBER() OVER (PARTITION BY Dept ORDER BY Salary DESC) rn FROM Emp) WHERE rn <= 3
B. SELECT TOP 3 * FROM Emp ORDER BY Salary DESC
C. GROUP BY Dept HAVING MAX(Salary) <= 3
D. LIMIT 3 PARTITION BY Dept
✅ Answer: A
Solution: Use ROW_NUMBER partitioned by Dept and filter rn ≤ 3.

5

Products(Pid, Name, Tags JSON) Postgres: find products with tag 'eco' in JSON array Tags. Which WHERE is correct?
A. WHERE Tags::jsonb ? 'eco'
B. WHERE Tags LIKE '%eco%'
C. WHERE Tags = 'eco'
D. WHERE JSON_EXTRACT(Tags, '$') = 'eco'
✅ Answer: A
Solution: In Postgres jsonb, ? tests top-level key/element presence; for JSON array of strings stored as jsonb, ? 'eco' is correct.

6

Customers(Cid, Name, City) and Orders(Oid, Cid, OrderDate). Return customers who made no orders in 2024. Efficient approach?
A. SELECT * FROM Customers c WHERE NOT EXISTS (SELECT 1 FROM Orders o WHERE o.Cid=c.Cid AND o.OrderDate BETWEEN '2024-01-01' AND '2024-12-31')
B. LEFT JOIN and filter WHERE o.Cid IS NULL after joining with 2024 predicate on Orders in ON clause
C. Use WHERE c.Cid NOT IN (SELECT Cid FROM Orders WHERE YEAR(OrderDate)=2024) — careful with NULLs
D. All of the above, but C needs attention for NULLs
✅ Answer: D
Solution: A is safe; B also valid if the ON clause contains date filter; C might fail if subquery returns NULL — need WHERE Cid IS NOT NULL or use NOT EXISTS.

7

A(a,b) with millions of rows, index on b. Query: SELECT * FROM A WHERE b IN (10,20,30) — which is likely fastest?
A. Use index on b; single query as shown will use index and do three seeks.
B. Break into three separate queries unioned — always faster.
C. Full table scan always faster.
D. Use correlated subquery for each value.
✅ Answer: A
Solution: IN with small constant list uses index seeks; unioning adds overhead.

8

Which statement about MERGE is true?
A. MERGE can do INSERT, UPDATE, DELETE in single statement based on match conditions.
B. MERGE only inserts.
C. MERGE is not ANSI SQL.
D. MERGE requires triggers.
✅ Answer: A
Solution: ANSI MERGE allows WHEN MATCHED/NOT MATCHED clauses for update/insert/delete.

9

Orders(Oid, Cust, Amount) want running total ordered by Oid per Cust. Choose SQL using window functions:
A. SUM(Amount) OVER (PARTITION BY Cust ORDER BY Oid ROWS UNBOUNDED PRECEDING)
B. SUM(Amount) aggregate without GROUP BY
C. Correlated subquery SELECT (SELECT SUM(Amount) FROM Orders o2 WHERE o2.Cust=o1.Cust AND o2.Oid<=o1.Oid)
D. Both A and C valid; A is more efficient typically.
✅ Answer: D
Solution: Both produce running total; window function is usually more efficient.

10

Which isolation level can cause phantom reads but not dirty reads?
A. Read Committed
B. Read Uncommitted
C. Serializable
D. Repeatable Read
✅ Answer: A
Solution: Read Committed prevents dirty reads but allows non-repeatable reads and phantoms.

11

Given Employees(Eid, ManagerId), compute the transitive closure (all managers up chain) — which pure SQL facility is required?
A. Recursive CTE (WITH RECURSIVE)
B. Grouping sets
C. Window functions
D. PIVOT
✅ Answer: A
Solution: Recursive CTE implements hierarchical traversal.

12

Products(pid, price); update price by 10% for all products cheaper than average price. Which statement is correct?
A. UPDATE Products SET price = price * 1.10 WHERE price < (SELECT AVG(price) FROM Products)
B. UPDATE Products SET price = price * 1.10 WHERE price IN (SELECT price FROM Products WHERE price < AVG(price))
C. Need cursor to update per-row
D. Use MERGE only
✅ Answer: A
Solution: Subquery computes AVG once; predicate compares per row.

13

SELECT DISTINCT a FROM T with no index on a. Which approach can be faster?
A. Use GROUP BY a possibly with hashed aggregation.
B. Use ORDER BY a LIMIT 1
C. Use nested loops with subqueries
D. SELECT a FROM T WHERE ROWNUM=1
✅ Answer: A
Solution: GROUP BY allows hashing to deduplicate instead of sorting, depending on optimizer.

14

Emp(Eid, Salary) find second highest salary (distinct) in standard SQL:
A. SELECT MAX(Salary) FROM Emp WHERE Salary < (SELECT MAX(Salary) FROM Emp)
B. SELECT Salary FROM Emp ORDER BY Salary DESC LIMIT 1 OFFSET 1
C. SELECT DISTINCT Salary FROM Emp ORDER BY Salary DESC LIMIT 1 OFFSET 1
D. All above depending on SQL dialect; C is portable if LIMIT/OFFSET supported with DISTINCT.
✅ Answer: D
Solution: Approach depends on dialect; subquery version (A) is ANSI compatible; LIMIT/OFFSET is dialect-specific; DISTINCT ensures unique.

15

Table T(a INT PRIMARY KEY, b INT UNIQUE) attempt INSERT INTO T(a,b) VALUES(1,NULL) — which is allowed?
A. Allowed; UNIQUE allows NULLs (depends on DB)
B. Not allowed; UNIQUE forbids NULLs always
C. Allowed if multiple nulls allowed by DB (e.g., PostgreSQL)
D. Only if b has default
✅ Answer: C
Solution: SQL standard allows multiple NULLs for UNIQUE in many DBs (Postgres allows); some DBs (e.g., MS SQL Server) treat NULLs as not equal allowing one NULL, but generally multiple NULLs allowed.

16

You need to remove duplicate rows keeping the smallest created_at. Table T(id, val, created_at), duplicates defined by val. Which delete works (ANSI-ish)?
A. DELETE FROM T WHERE created_at > (SELECT MIN(created_at) FROM T t2 WHERE t2.val = T.val)
B. DELETE FROM T WHERE id NOT IN (SELECT MIN(id) FROM T GROUP BY val)
C. WITH cte AS (SELECT id, ROW_NUMBER() OVER (PARTITION BY val ORDER BY created_at) rn FROM T) DELETE FROM T WHERE id IN (SELECT id FROM cte WHERE rn>1) (dialect-dependent)
D. All of the above (A may not work with duplicates in subquery correlations in some DBs), C is safest in modern DBs.
✅ Answer: D
Solution: C using ROW_NUMBER is standard in many dialects; others may work but depend on engine.

17

SELECT SUM(col) FROM t; returns NULL if table empty?
A. Yes, aggregates return NULL for empty input except COUNT.
B. No, returns 0.
C. Depends on DB
D. Only SUM(DISTINCT) returns NULL
✅ Answer: A
Solution: SUM over empty set returns NULL; COUNT returns 0.

18

Employees(Eid, Salary, Bonus) calculate tax = 30% of Salary+COALESCE(Bonus,0). Which expression valid?
A. SELECT Eid, 0.3*(Salary + COALESCE(Bonus,0)) AS Tax FROM Employees
B. SELECT Eid, 0.3*(Salary + Bonus) FROM Employees
C. SELECT Eid, ROUND(0.3*(Salary+Bonus)) FROM Employees
D. SELECT Eid, 30%*(Salary+Bonus) FROM Employees
✅ Answer: A
Solution: COALESCE handles NULL bonus; percentage must be numeric expression.

19

Orders(Oid, Qty, UnitPrice) want total revenue as integer rounded to nearest rupee. Which SQL is best?
A. SELECT ROUND(SUM(Qty*UnitPrice)) FROM Orders
B. SELECT CAST(SUM(Qty*UnitPrice) AS INT) FROM Orders
C. SELECT SUM(CAST(Qty*UnitPrice AS INT)) FROM Orders
D. SELECT FLOOR(SUM(Qty*UnitPrice)) FROM Orders
✅ Answer: A
Solution: ROUND applied after whole-sum yields nearest integer; CAST truncates.

20

CREATE INDEX ix ON Orders(CustomerId, OrderDate) — which query benefits most?
A. WHERE CustomerId = ? AND OrderDate BETWEEN ? AND ?
B. WHERE OrderDate = ? only
C. ORDER BY CustomerId only
D. JOIN without using CustomerId
✅ Answer: A
Solution: Composite index on (CustomerId, OrderDate) is best for queries filtering on leading columns; range on second column works.

21

Which is true about EXPLAIN ANALYZE?
A. It displays actual runtime statistics and executes the query (in Postgres).
B. It only shows the planned cost but not actuals.
C. It rewrites query semantics.
D. It locks tables for execution.
✅ Answer: A
Solution: EXPLAIN ANALYZE runs the query and shows real times (Postgres), while EXPLAIN only shows plan.

22

SELECT * FROM A JOIN B ON A.x = B.x then WHERE A.y > 10 — can we push WHERE into JOIN?
A. Yes, if condition only references A it can be applied before join (selection pushdown).
B. No, it must be after join.
C. Only if B has no rows.
D. Only in outer joins.
✅ Answer: A
Solution: Selection referencing only A can be applied before join to reduce rows.

23

JSON column data contains {"score": 85}. Which selects this int in Postgres?
A. SELECT (data->>'score')::int FROM t;
B. SELECT data->'score' FROM t;
C. SELECT JSON_EXTRACT(data, '$.score')
D. SELECT data['score']
✅ Answer: A
Solution: ->> returns text; cast to int: ::int.

24

Which is correct way to add a NOT NULL column with default in big table without long exclusive lock (Postgres 11+)?
A. ALTER TABLE ... ADD COLUMN newcol integer DEFAULT 0; (causes table rewrite pre-11)
B. ALTER TABLE ... ADD COLUMN newcol integer; UPDATE SET newcol=0; ALTER TABLE ... ALTER COLUMN newcol SET NOT NULL; (two-step avoids rewrite)
C. Add with default and NOT NULL in one step always safe
D. Create new table and copy — only option
✅ Answer: B
Solution: Two-step avoids rewrite for large tables; Postgres optimizations later added however.

25

Which is true about functional indexes?
A. They index expression results (e.g., lower(name)).
B. They index functions only for aggregate functions.
C. They cannot be used in WHERE clauses.
D. They auto-update only on full table rebuild.
✅ Answer: A
Solution: Functional/index-on-expression is supported in many DBs and used in WHERE.

26

SELECT COALESCE(MIN(x), 999) FROM T; returns 999 if T empty. True?
A. True
B. False
C. Returns NULL
D. Returns error
✅ Answer: A
Solution: MIN(empty) = NULL; COALESCE returns default 999.

27

Which SQL will remove rows in Log older than one month efficiently when table is huge?
A. DELETE FROM Log WHERE created_at < now() - interval '30 days' (with index on created_at and batched deletes)
B. TRUNCATE if logs are partitioned then DROP partition
C. DELETE single huge transaction — best always
D. Both A (batched) and B (partition-drop) are efficient depending on schema; B best if partitioned.
✅ Answer: D
Solution: Partitioning + dropping partitions is cheapest; batched deletes with index acceptable otherwise.

28

Which statement about NULL comparisons is true?
A. NULL = NULL is true
B. NULL IS NULL is true; NULL = NULL is unknown/false
C. COALESCE(NULL,NULL) = NULL
D. NULL sorts as smallest in all DBs
✅ Answer: B
Solution: SQL uses three-valued logic; equality of NULLs is unknown; IS NULL checks nullity.

29

Employees(Eid, Salary) want percentage of employees earning above average. Which is correct?
A. SELECT 100.0 * SUM(CASE WHEN Salary > avg_salary THEN 1 ELSE 0 END)/COUNT(*) FROM (SELECT Salary, AVG(Salary) OVER () avg_salary FROM Employees) t
B. SELECT 100 * (SELECT COUNT(*) FROM Employees WHERE Salary > (SELECT AVG(Salary) FROM Employees))/ (SELECT COUNT(*) FROM Employees)
C. Both A and B valid; B uses subqueries, A uses window functions.
D. None
✅ Answer: C
Solution: Both queries yield same percentage; A uses single scan.

30

CREATE VIEW v AS SELECT ... which is true about updatable views?
A. View updatable when mapping to single base table and no aggregates/ GROUP BY/ DISTINCT/ joins etc.
B. Any view is updatable.
C. Views with aggregates are updatable.
D. Indexes on view matter for updatability.
✅ Answer: A
Solution: Updatable view restrictions require simple one-table select mapping.

31

Which SQL computes pairwise intersection of two sets of integers S1 and S2 stored in tables?
A. SELECT val FROM S1 INTERSECT SELECT val FROM S2
B. SELECT val FROM S1 WHERE val IN (SELECT val FROM S2)
C. SELECT DISTINCT S1.val FROM S1 JOIN S2 ON S1.val=S2.val
D. All above equivalent semantically (depending on duplicates and null handling).
✅ Answer: D
Solution: All return intersection; INTERSECT removes duplicates; joins and IN may produce duplicates unless DISTINCT used.

32

Employees(Eid,PJoinDate) you want every employee’s tenure in months rounded down. Which expression is portable?
A. EXTRACT(YEAR FROM age(current_date, PJoinDate))*12 + EXTRACT(MONTH FROM age(current_date, PJoinDate)) (Postgres)
B. DATEDIFF(month, PJoinDate, GETDATE()) (SQL Server)
C. Both A and B are dialect-specific; no single ANSI.
D. MONTHS_BETWEEN(SYSDATE, PJoinDate) (Oracle)
✅ Answer: C
Solution: Different DBs use different date functions; no single portable expression.

33

Which is correct for enforcing a business rule “no overlapping appointments per room”? Table Appt(RoomId, Start, End). Choose constraint/solution:
A. Use exclusion constraint (Postgres): EXCLUDE USING GIST (RoomId WITH =, tsrange(Start,End) WITH &&)
B. Use trigger to check complicated overlap logic on insert/update
C. Unique constraint on (RoomId, Start, End) — not sufficient
D. A and B feasible; C insufficient
✅ Answer: D
Solution: Exclusion constraints or triggers enforce no overlap; unique constraint won’t.

34

Which is true about VACUUM in Postgres?
A. Reclaims dead tuples and updates visibility map; VACUUM FULL rewrites table.
B. VACUUM removes all rows.
C. VACUUM is same as ANALYZE.
D. VACUUM is used in MySQL only.
✅ Answer: A
Solution: VACUUM reclaims storage; VACUUM FULL rewrites.

35

Which query gets last order per customer (most recent OrderDate) in ANSI SQL?
A. SELECT o1.* FROM Orders o1 WHERE o1.OrderDate = (SELECT MAX(o2.OrderDate) FROM Orders o2 WHERE o2.CustId=o1.CustId)
B. ROW_NUMBER() OVER (PARTITION BY CustId ORDER BY OrderDate DESC) then pick rn=1
C. Both A and B valid; B typically better for ties control.
D. Use GROUP BY Customer and MAX(OrderDate) only to get date, not full row.
✅ Answer: C
Solution: Both options valid; window function gives full row easily.

36

SELECT COUNT(DISTINCT x) FROM t; — which is true for large cardinalities?
A. Exact on SQL engines that support it, but heavy; approximate algorithms exist (HyperLogLog) in some DBs.
B. Always approximate.
C. Returns NULL for empty table.
D. Requires GROUP BY.
✅ Answer: A
Solution: COUNT(DISTINCT) is exact but costly; some DBs provide approximate count functions.

37

UNION vs UNION ALL — which is true?
A. UNION removes duplicates, UNION ALL keeps them and is faster.
B. UNION ALL removes duplicates.
C. Both same cost.
D. UNION preserves order.
✅ Answer: A
Solution: UNION implies dedup sort/hash; UNION ALL appends.

38

Which index helps WHERE LOWER(name) = 'alice' in Postgres?
A. Functional index: CREATE INDEX idx_lower_name ON t(LOWER(name));
B. Normal B-tree on name
C. Partial index only
D. No index can help
✅ Answer: A
Solution: Index on expression matches predicate for case-insensitive search.

39

You need to atomically increment and return a counter in high concurrency. Best approach?
A. Use DB sequence (nextval() in Postgres/Oracle).
B. SELECT max(id) + 1 FROM t then insert — risky.
C. Use SELECT FOR UPDATE lock row and update.
D. Both A (preferred) and C workable; A scales best.
✅ Answer: D
Solution: DB sequences are lockless and scalable; row locking is valid but less scalable.

40

WITH RECURSIVE path(node, depth) AS (...) returns deep recursion; which guard prevents infinite loops?
A. Use DISTINCT or track visited nodes; use WHERE depth < N limit.
B. Use LIMIT without order.
C. Use GROUP BY only.
D. No need, recursive CTE always terminates.
✅ Answer: A
Solution: Prevent cycles by tracking visited or limiting depth.

41

Which is true about INDEX covering a query?
A. When index contains all columns required by query, DB can use index-only scan.
B. Covering index only used for selects without WHERE.
C. Covering indexes increase I/O always.
D. Covering index impossible with composite keys.
✅ Answer: A
Solution: Index-only scans avoid table access if visibility map or MVCC info available.

42

GROUPING SETS ((a),(b)) — what does it compute?
A. Aggregates for grouping by a and separately by b.
B. Cross product.
C. Rollup for both a and b.
D. None
✅ Answer: A
Solution: GROUPING SETS evaluate specified groupings.

43

Which isolation level prevents non-repeatable reads?
A. Repeatable Read
B. Read Committed
C. Read Uncommitted
D. Serializable
✅ Answer: A (and Serializable also prevents)
Solution: Repeatable Read ensures same read in a transaction returns same rows (depending on DB specifics).

44

MERGE conflicts on source duplicates — what to watch for?
A. SQL requires source to have unique matching key per target row; duplicates may cause multiple WHEN MATCHED updates error.
B. Duplicates silently ignored.
C. MERGE always dedups source.
D. MERGE fails only on NULLs.
✅ Answer: A
Solution: Source duplicate matches to same target can cause multiple updates — need grouping or distinct source.

45

You need to pivot daily sales into columns for each product (fixed small set). Which SQL feature helps?
A. CASE with aggregation: SUM(CASE WHEN Product='P1' THEN Qty ELSE 0 END) AS P1
B. PIVOT (T-SQL/Oracle)
C. Both A and B depending on engine
D. Use UNION of repeated selects only
✅ Answer: C
Solution: CASE+AGG works anywhere; PIVOT is syntactic sugar.

46

SELECT a, COUNT(*) OVER (PARTITION BY a) cnt FROM T — what is cnt?
A. Row count per a repeated on each row.
B. Total table rows.
C. Distinct count per a.
D. NULL
✅ Answer: A
Solution: Window COUNT returns count per partition on each row.

47

Which is true about constraints and performance?
A. Foreign key constraints can slow inserts due to verification but provide integrity; consider deferrable constraints for bulk loads.
B. Constraints never affect performance.
C. Constraints always faster.
D. Removing constraints never advisable.
✅ Answer: A
Solution: FK checks incur overhead; deferring can help bulk operations.

48

ALTER TABLE ADD COLUMN x computed as (y+z) — what is virtual/computed column?
A. Stored vs virtual depends on DB — virtual computed not stored, computed on read; stored persists precomputed.
B. Always stored.
C. Never updatable.
D. Cannot be indexed.
✅ Answer: A
Solution: Some DBs allow indexing stored computed columns; virtual computed columns are calculated.

49

You want to add hundreds of millions of rows quickly; which is best?
A. Use bulk COPY/LOAD (COPY in Postgres, LOAD DATA INFILE in MySQL) with minimally required indexes and constraints disabled then rebuild.
B. Insert one by one in client.
C. Use many single-row INSERTs in separate transactions.
D. Use SELECT FROM DUAL repeated.
✅ Answer: A
Solution: Bulk load with indexes off and single transaction or moderate batching is fastest.

50

Which operator yields all combinations of rows between two relations?
A. Cartesian product (CROSS JOIN)
B. INNER JOIN
C. NATURAL JOIN
D. INTERSECT
✅ Answer: A
Solution: CROSS JOIN returns all pairings.

51

EXPLAIN shows Seq Scan vs Index Scan; which is cheaper depends on selectivity and cost parameters. If table small, optimizer chooses Seq Scan — true?
A. True
B. False
C. Only if index exists
D. Only if table is partitioned
✅ Answer: A
Solution: Sequential scan cheaper for small tables or when many rows required.

52

SELECT * FROM t WHERE id BETWEEN 10 AND 20 uses index on id; which scan?
A. Range scan (index seek then scan)
B. Full table scan always
C. Bitmap index only
D. Index can’t be used for BETWEEN
✅ Answer: A
Solution: Index range scans are efficient for BETWEEN.

53

Which statement about NULL ordering in ORDER BY?
A. SQL standard leaves NULL ordering unspecified; some DBs allow NULLS FIRST/LAST.
B. NULLs are always first.
C. NULLs always last.
D. ORDER BY errors on NULL.
✅ Answer: A
Solution: Behavior is DB-specific; use NULLS FIRST/LAST.

54

SELECT * FROM A LEFT JOIN B ON ... WHERE B.key IS NULL returns what?
A. Rows in A with no match in B (anti-join using left join)
B. Rows in B with no match in A
C. Inner join semantics
D. All rows from both
✅ Answer: A
Solution: LEFT JOIN then filter B NULL yields A rows without matching B.

55

CREATE TABLE t AS SELECT ... — which is true?
A. Creates table populated by SELECT; may not carry indexes/constraints from source.
B. Copies constraints automatically.
C. Creates empty table always.
D. Fails if table exists.
✅ Answer: A
Solution: CTAS copies data and columns but not indexes/PKs (depends on engine).

56

To avoid deadlocks when updating multiple rows in consistent order, which is recommended?
A. Acquire locks in same order across transactions, e.g., ORDER BY on key before update.
B. Randomize access order.
C. Use autocommit only.
D. Avoid transactions.
✅ Answer: A
Solution: Consistent locking order reduces deadlocks.

57

SELECT jsonb_array_elements(data->'items') FROM orders produces rows for each array element. Which DB supports this?
A. PostgreSQL
B. MySQL 5.6
C. Oracle 10g
D. SQL Server 2000
✅ Answer: A
Solution: Postgres has jsonb functions to unnest arrays.

58

INSERT ... ON CONFLICT DO UPDATE is a UPSERT in Postgres. What must be present?
A. Unique constraint / index on target columns.
B. Primary key only.
C. Foreign key only.
D. Trigger.
✅ Answer: A
Solution: Conflict target must be constrained by unique/primary key.

59

Which of the following makes aggregation deterministic across partitions?
A. Use GROUPING SETS or GROUP BY and avoid non-deterministic functions.
B. Use ORDER BY inside aggregate without window.
C. Use random seed.
D. Use ROW_NUMBER.
✅ Answer: A
Solution: Aggregation deterministic if grouping deterministic and functions deterministic.

60

Which query counts unique users per day given Log(userId, event_ts)?
A. SELECT date_trunc('day', event_ts) d, COUNT(DISTINCT userId) FROM Log GROUP BY d
B. SELECT event_ts::date, COUNT(userId) FROM Log GROUP BY event_ts::date
C. SELECT DISTINCT userId, event_ts FROM Log
D. SELECT date(event_ts), COUNT(userId) FROM Log GROUP BY date(event_ts)
✅ Answer: A (and D equivalent in some DBs)
Solution: COUNT(DISTINCT userId) grouped by day yields unique user count per day.

61

Which is correct to avoid SQL injection for user-supplied email?
A. Use parameterized/prepared statements binding email as parameter.
B. Escape single quotes manually.
C. Concatenate string after sanitizing.
D. Use dynamic SQL with EXECUTE.
✅ Answer: A
Solution: Parameterized queries prevent injection reliably.

62

Which is true about FOREIGN KEY ... ON DELETE CASCADE?
A. Deletes child rows automatically when parent deleted.
B. Prevents parent deletion always.
C. Moves child rows to default parent.
D. Copies child rows.
✅ Answer: A
Solution: ON DELETE CASCADE removes dependent rows.

63

Which SQL returns rows present in A but not in B efficiently if both large and indexed?
A. Anti-join via LEFT JOIN and IS NULL on B.
B. A EXCEPT B (some DBs implement better).
C. NOT EXISTS correlated subquery possibly optimized to semi-join.
D. All above may be used; B or C are often optimized best.
✅ Answer: D
Solution: Choice depends on optimizer; EXCEPT, NOT EXISTS, or anti-join are common.

64

Which method produces consistent snapshot reads in Postgres?
A. MVCC with REPEATABLE READ/SERIALIZABLE snapshots.
B. Table locking only.
C. Row-level locks always required.
D. No snapshot isolation in Postgres.
✅ Answer: A
Solution: Postgres uses MVCC snapshots.

65

You need to implement rate-limiting per user using DB. Which approach is best?
A. Use a table with userId, window_start, counter and atomic UPDATE ... WHERE ... RETURNING or INSERT ... ON CONFLICT to increment and check.
B. Rely on SELECT then INSERT with sleep.
C. Use temporary files on disk.
D. Use triggers.
✅ Answer: A
Solution: Atomic upsert or increment avoids race conditions.

66

EXPLAIN shows Hash Join chosen — what likely happened?
A. Optimizer determined hashing smaller relation to probe larger one is cheapest given memory.
B. Hash join always chosen over nested loops.
C. Hash join never benefits large tables.
D. Hash join requires indexes.
✅ Answer: A
Solution: Hash join used when equality join and optimizer estimates it cheapest.

67

Which is correct to add a computed persisted column in SQL Server?
A. ALTER TABLE t ADD col AS (a+b) PERSISTED;
B. ADD VIRTUAL computed
C. CREATE INDEX only
D. ALTER cannot do this
✅ Answer: A
Solution: SQL Server supports persisted computed columns.

68

Which of the following increases concurrency for reads during long updates?
A. Use row-level locking and smaller transactions.
B. Use table-level locks.
C. Use autocommit for each row update.
D. Disable indexes while updating.
✅ Answer: A
Solution: Smaller txns and row locks reduce contention.

69

Which SQL constructs allow window frame specification?
A. ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
B. GROUP BY
C. HAVING
D. DISTINCT
✅ Answer: A
Solution: Window functions support frame specs.

70

Which is best practice for indexing foreign keys?
A. Index child table foreign key for fast joins and deletes.
B. Never index foreign keys.
C. Only index parent keys.
D. Index every column regardless.
✅ Answer: A
Solution: Index on child foreign key speeds joins and cascading operations.

71

Which returns true if two JSON objects have same keys and values ignoring order in Postgres?
A. data::jsonb = other::jsonb
B. data = other (text)
C. JSON_EQUALS(data, other)
D. data @> other
✅ Answer: A
Solution: jsonb equality compares structural equality regardless of key order.

72

SELECT * FROM A FULL OUTER JOIN B ON ... returns:
A. All rows from both tables with matching where available and NULL for missing sides.
B. Only matching rows.
C. Cartesian product.
D. Error on unmatched columns.
✅ Answer: A
Solution: FULL OUTER JOIN returns union of left and right rows.

73

Which is true about CHECK constraint referencing other tables?
A. Standard SQL forbids cross-table checks in CHECK constraints. Use triggers.
B. CHECK can do cross-table queries.
C. CHECK can call stored procedures.
D. CHECK is mandatory for referential integrity.
✅ Answer: A
Solution: CHECK must be row-local; cross-table logic requires triggers.

74

You need to ensure uniqueness case-insensitive on name column. Which is good?
A. Create unique functional index on LOWER(name).
B. Use unique constraint on name directly.
C. Use triggers to check on insert.
D. Store names uppercase only.
✅ Answer: A (or D)
Solution: Functional unique index on normalized value enforces case-insensitive uniqueness.

75

Which SQL returns customers with total orders equal to maximum among all customers?
A. SELECT CustId FROM Orders GROUP BY CustId HAVING SUM(Amount) = (SELECT MAX(total) FROM (SELECT SUM(Amount) total FROM Orders GROUP BY CustId) t)
B. SELECT CustId FROM Orders GROUP BY CustId ORDER BY SUM(Amount) DESC LIMIT 1 (may return one)
C. Both A and B (B returns one arbitrary if tie unless use WITH TIES)
D. None
✅ Answer: C
Solution: A exact match returns all ties; B returns top row(s) depends on dialect.

76

Which indexes help ORDER BY queries?
A. Index on columns in ORDER BY with same direction allows index-scan sorted output.
B. No index helps ORDER BY.
C. HASH indexes help ORDER BY.
D. Function indexes always help.
✅ Answer: A
Solution: B-tree indexes in matching order assist ORDER BY avoiding sort.

77

Which is correct about CHECKSUM/HASH for change detection?
A. Hash can detect changed rows quickly but collision risk exists; for guaranteed equality compare full rows.
B. Hash guarantees uniqueness.
C. Hash unnecessary for large tables.
D. Hash slows queries always.
✅ Answer: A
Solution: Hash is probabilistic for change detection.

78

Which is efficient for multi-column search WHERE a=.. AND b=..?
A. Composite index on (a,b).
B. Single index on a and separate on b always equivalent.
C. Bitmap index used in SQL Server newer versions only.
D. No index helps.
✅ Answer: A
Solution: Composite index matches both predicates efficiently.

79

Which statement about ROLLBACK TO SAVEPOINT sp?
A. Rolls back to savepoint inside transaction without aborting transaction.
B. ROLLBACK always aborts whole transaction.
C. Savepoints persist after commit.
D. Savepoints are created automatically.
✅ Answer: A
Solution: Savepoints allow partial rollback.

80

SELECT ... FOR UPDATE SKIP LOCKED helps in what scenario?
A. Implementing job queue where workers skip locked rows to avoid blocking.
B. Always locks all rows.
C. Skips rows randomly.
D. Only for reading, not updating.
✅ Answer: A
Solution: SKIP LOCKED allows concurrent consumers to avoid waiting.

81

Which is true about TEMPORARY TABLE?
A. Visible only in current session and dropped at end of session.
B. Visible to all sessions.
C. Persist after server restart.
D. Always faster than permanent tables.
✅ Answer: A
Solution: Temporary tables scoped to session/transaction depending on DB.

82

Which is preferred to limit returned rows in pagination?
A. Use keyset pagination (WHERE (key > last_key) ORDER BY key LIMIT n) for stable performance.
B. OFFSET/LIMIT always best.
C. Use nested loops.
D. Use COUNT(*) in each page.
✅ Answer: A
Solution: Keyset avoids offset slowness for deep pages.

83

Which is true about VACUUM vs ANALYZE?
A. VACUUM reclaims space; ANALYZE updates planner statistics.
B. They are identical.
C. ANALYZE cleans up dead tuples.
D. Both are DDL.
✅ Answer: A
Solution: Separate maintenance tasks.

84

Which pattern avoids lost updates?
A. UPDATE ... WHERE version = :old_version and check rows affected equals 1 (optimistic concurrency)
B. Always use SELECT then UPDATE without locks (risky)
C. Never update in transactions
D. Use triggers only
✅ Answer: A
Solution: Optimistic locking by version column prevents lost updates.

85

Which approach is best to maintain materialized view freshness with high update rate?
A. Incremental refresh (if supported) updates delta only.
B. Recreate view entirely every minute.
C. Never refresh.
D. Use triggers to copy all rows.
✅ Answer: A
Solution: Incremental refresh uses deltas for efficiency.

86

Which SQL ensures referential integrity for historical data even if parent rows deleted?
A. Use ON DELETE NO ACTION and never delete parents; archive instead; or use ON DELETE SET NULL to retain child with history.
B. Use ON DELETE CASCADE to delete children.
C. Use ON DELETE RESTRICT only.
D. Use triggers to block deletion.
✅ Answer: A (and D depending on need)
Solution: To keep history do not cascade deletes or copy parent information into child.

87

Which built-in SQL function rounds toward zero?
A. TRUNC() in many DBs truncates toward zero for positive/negative numbers.
B. ROUND() always truncates.
C. FLOOR() rounds toward zero.
D. CEIL() rounds toward zero.
✅ Answer: A
Solution: TRUNC removes fractional part (toward zero).

88

Which SQL returns nth highest value without window functions?
A. SELECT DISTINCT val FROM t t1 WHERE (SELECT COUNT(DISTINCT val) FROM t t2 WHERE t2.val > t1.val) = n-1
B. ORDER BY val DESC LIMIT 1 OFFSET n-1 (if supported)
C. Both A and B depending on dialect
D. None
✅ Answer: C
Solution: Subquery approach generic; LIMIT/OFFSET simpler where supported.

89

Which is true about CHECKSUM of a row for replication?
A. Useful to quickly detect differences but collisions possible.
B. Perfect guarantee of equality.
C. Not usable on binary columns.
D. Changes only on schema changes.
✅ Answer: A
Solution: Hash-based detection is probabilistic.

90

Which approach to split large table for parallel ingestion?
A. Partitioning by range or hash to parallelize bulk loads.
B. Single table always better.
C. Use many temp files without partitioning.
D. Partitioning prevents parallel ingestion.
✅ Answer: A
Solution: Partitioning enables multiple writers and parallel loads.

91

Which is correct to prevent long running queries hogging resources?
A. Set statement_timeout or resource governor in DB to limit time.
B. Let queries run always.
C. Kill server process manually.
D. Use triggers to kill long queries.
✅ Answer: A
Solution: DB configs allow statement timeouts.

92

Which SQL returns differences between successive rows ordered by ts per id?
A. Use LAG(value) OVER (PARTITION BY id ORDER BY ts) then subtract.
B. Use correlated subquery SELECT MAX(ts) < current
C. Use GROUP BY only
D. Using DISTINCT
✅ Answer: A
Solution: Window LAG provides prior row for diff.

93

Which is true about ROW_NUMBER(), RANK(), DENSE_RANK()?
A. ROW_NUMBER() unique sequence; RANK() leaves gaps on ties; DENSE_RANK() no gaps.
B. All identical.
C. Only ROW_NUMBER available.
D. None of above.
✅ Answer: A
Solution: Distinct tie behavior defines differences.

94

Which approach is best to maintain search on free text in Postgres?
A. Use GIN index on to_tsvector(col) and query with to_tsquery.
B. LIKE ‘%term%’ is fast.
C. Use heavy regex always.
D. No index helps.
✅ Answer: A
Solution: Full-text search with tsvector + GIN index is efficient.

95

Which is true about EXISTS vs IN?
A. EXISTS checks row existence and short-circuits; correlated EXISTS can be more efficient. IN with subquery may be optimized into semi-join.
B. IN faster always.
C. EXISTS requires distinct.
D. IN cannot use indexes.
✅ Answer: A
Solution: Choice depends on query and DB optimizer.

96

Which SQL returns moving average of last 7 days per product?
A. AVG(sales) OVER (PARTITION BY product ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
B. AVG(sales) with GROUP BY date, product only
C. Correlated subquery summing last 7 days per row
D. Both A and C; A is preferable using window frames.
✅ Answer: D
Solution: Window frame is concise and efficient.

97

Which statement about PRIMARY KEY is true?
A. Primary key enforces uniqueness and NOT NULL; DB creates unique index.
B. Primary key allows duplicates.
C. Primary key is optional.
D. Primary key cannot be composite.
✅ Answer: A
Solution: PK enforces uniqueness and non-nullness; composite allowed.

98

Which is correct about CHECK constraint involving string length?
A. CHECK (char_length(name) <= 50) enforces length at DB level.
B. Use VARCHAR(50) only; CHECK redundant.
C. CHECK not applied to strings.
D. Use triggers only.
✅ Answer: A
Solution: CHECK enforces condition regardless of column type; useful to ensure constraints beyond type.

99

Which is correct about DISTINCT ON (expr) in Postgres?
A. Returns first row per expr based on ORDER BY; nonstandard extension.
B. Same as DISTINCT.
C. Available in all DBs.
D. Forces grouping.
✅ Answer: A
Solution: Postgres-specific to pick first row per distinct key.

100

Which approach helps scale analytical queries on large tables?
A. Create summary/aggregate tables (materialized views) refreshed periodically.
B. Run analytics directly on OLTP tables always.
C. Use triggers for every transaction to update analytics instantly (may be slow).
D. Use nested loops for all joins.
✅ Answer: A
Solution: Materialized summaries reduce scan and compute cost.

Q1.

Table Sales(sid, cust_id, amount, sdate) has rows for 2024. Which query returns customers whose total sales in March 2024 exceeded ₹12,345?
A. SELECT cust_id FROM Sales WHERE EXTRACT(MONTH FROM sdate)=3 GROUP BY cust_id HAVING SUM(amount)>12345;
B. SELECT cust_id FROM Sales GROUP BY cust_id HAVING SUM(amount WHERE EXTRACT(MONTH FROM sdate)=3)>12345;
C. SELECT cust_id FROM Sales WHERE MONTH(sdate)=3 AND SUM(amount)>12345;
D. SELECT cust_id FROM Sales WHERE sdate BETWEEN '2024-03-01' AND '2024-03-31' GROUP BY cust_id HAVING SUM(amount)>12345;
✅ Answer: D
💡 Solution: Option D filters by date range first, then groups and uses HAVING. A may be valid in some DBs if EXTRACT is allowed in WHERE; it is OK — but D is most portable. B is invalid syntax (SUM with WHERE inside). C misuses aggregate in WHERE.

Q2.

emp(eid, dept, salary) — find third highest distinct salary overall (ANSI SQL). Which is portable?
A. SELECT MAX(salary) FROM emp WHERE salary < (SELECT MAX(salary) FROM emp)
B. SELECT DISTINCT salary FROM emp ORDER BY salary DESC LIMIT 1 OFFSET 2
C. SELECT MIN(salary) FROM (SELECT DISTINCT salary FROM emp ORDER BY salary DESC FETCH FIRST 3 ROWS WITH TIES) t
D. SELECT salary FROM emp GROUP BY salary HAVING COUNT(*)=3
✅ Answer: B
💡 Solution: B returns the 3rd distinct using LIMIT/OFFSET (dialect-specific). A gives 2nd highest. C is complex and incorrect for exact 3rd distinct. D is wrong. For ANSI-only, use window functions with DENSE_RANK(); but among options B is correct approach.

Q3.

Tables: Orders(ord_id, cust_id, qty, price); you want revenue per order and then only orders where revenue > 5000. Which SQL is correct?
A. SELECT ord_id, qty*price AS rev FROM Orders HAVING rev>5000;
B. SELECT ord_id, qty*price AS rev FROM Orders WHERE qty*price>5000;
C. SELECT ord_id, SUM(qty*price) AS rev FROM Orders GROUP BY ord_id HAVING SUM(qty*price)>5000;
D. SELECT ord_id, qty*price AS rev FROM Orders GROUP BY ord_id HAVING rev>5000;
✅ Answer: B (and C if each order may have multiple rows)
💡 Solution: If one row per order, B is simplest. If multiple rows per ord_id, need aggregation in C. A and D misuse HAVING without GROUP.

Q4.

products(pid, price) — which index helps WHERE price BETWEEN 100 AND 200?
A. B-tree index on price
B. Hash index on price
C. Bitmap index only
D. No index helps range queries
✅ Answer: A
💡 Solution: B-tree supports range scans; hash indexes generally don’t support range queries.

Q5.

Given T(a,b) with duplicate rows, which returns rows where a appears with every b value present in U(b)?
A. SELECT a FROM T GROUP BY a HAVING COUNT(DISTINCT b) = (SELECT COUNT(DISTINCT b) FROM U);
B. SELECT a FROM T WHERE b IN (SELECT b FROM U) GROUP BY a;
C. SELECT DISTINCT a FROM T WHERE NOT EXISTS (SELECT 1 FROM U WHERE U.b NOT IN (SELECT b FROM T WHERE T.a = a));
D. SELECT a FROM T WHERE b = ALL (SELECT b FROM U);
✅ Answer: A
💡 Solution: Division logic: count distinct b per a equals count of U’s distinct b. C is logically doable but more complex; D usually invalid since ALL expects scalar comparisons.

Q6.

employees(id, manager_id) store org chart. Which SQL lists employees who are managers (i.e., have at least one direct report)?
A. SELECT DISTINCT manager_id FROM employees WHERE manager_id IS NOT NULL;
B. SELECT e.* FROM employees e WHERE EXISTS (SELECT 1 FROM employees r WHERE r.manager_id=e.id);
C. SELECT id FROM employees WHERE id IN (SELECT manager_id FROM employees);
D. All of the above (with slight differences)
✅ Answer: D
💡 Solution: A returns manager_ids only (null filtered). B returns full employee rows. C returns manager ids. All are valid depending on desired columns.

Q7.

Which is the best way to remove duplicate rows from log(uid, event, created_at) retaining the earliest created_at per (uid,event) in Postgres?
A. DELETE FROM log WHERE ctid NOT IN (SELECT min(ctid) FROM log GROUP BY uid, event);
B. WITH t AS (SELECT ctid FROM (SELECT ctid, ROW_NUMBER() OVER (PARTITION BY uid,event ORDER BY created_at) rn FROM log) s WHERE rn>1) DELETE FROM log WHERE ctid IN (SELECT ctid FROM t);
C. DELETE FROM log WHERE created_at > (SELECT MIN(created_at) FROM log WHERE log.uid=log.uid AND log.event=log.event);
D. TRUNCATE log;
✅ Answer: B
💡 Solution: B uses ROW_NUMBER to mark duplicates preserving earliest. A with min(ctid) unpredictable relative to created_at. C is correlated incorrectly referencing same table alias; D deletes all.

Q8.

You need running total of amt ordered by ts per acct. Which SQL is succinct and efficient?
A. SELECT acct, ts, SUM(amt) OVER (PARTITION BY acct ORDER BY ts ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS run_total FROM tx;
B. Correlated subquery summing prior rows per row
C. Self-join on earlier timestamps with aggregation
D. Both A and B; A is preferred (window function)
✅ Answer: D (prefer A)
💡 Solution: Window functions give concise, efficient running totals; correlated subqueries are less efficient.

Q9.

customers(cust, region) and orders(ord, cust, amt). Which returns top 2 customers by total amt per region?
A. ROW_NUMBER() OVER (PARTITION BY region ORDER BY SUM(amt) DESC) in SELECT — invalid directly
B. SELECT cust, region, total FROM (SELECT c.cust, c.region, SUM(o.amt) total, ROW_NUMBER() OVER (PARTITION BY c.region ORDER BY SUM(o.amt) DESC) rn FROM customers c JOIN orders o ON c.cust=o.cust GROUP BY c.cust, c.region) t WHERE rn<=2;
C. GROUP BY region, cust HAVING RANK<=2 — invalid syntax
D. SELECT cust FROM orders GROUP BY cust ORDER BY SUM(amt) DESC LIMIT 2
✅ Answer: B
💡 Solution: Use aggregation plus window ROW_NUMBER() over aggregation result. Option B correctly groups then assigns row numbers per region.

Q10.

Which isolation level can cause lost updates?
A. Read Uncommitted
B. Read Committed
C. Repeatable Read
D. Serializable
✅ Answer: A (and sometimes B depending on update patterns)
💡 Solution: Read Uncommitted can see uncommitted changes and lead to lost updates; Read Committed may allow lost updates without appropriate locking/optimistic check.

Q11.

items(id, tags) where tags is a comma-separated string like 'eco,summer,42'. Find rows that include 'summer' as a whole tag (not substring) portably. Which WHERE is safest?
A. WHERE tags LIKE '%summer%'
B. WHERE ',' || tags || ',' LIKE '%,summer,%'
C. WHERE tags ~ '(^|,)summer(,|$)' (regex)
D. B or C depending on DB; B works in standard SQL with string concat functions.
✅ Answer: D
💡 Solution: Simple LIKE can match substrings; surrounding commas or regex ensures whole token match. Use normalized schema (separate table) ideally.

Q12.

Which SQL removes all rows older than 90 days in events(ts) efficiently on huge table?
A. DELETE FROM events WHERE ts < now() - interval '90 days'; in small batched transactions
B. Drop and recreate table daily
C. Partition older data by range and ALTER TABLE DROP PARTITION — fastest
D. Both A (batched) and C (best if partitioned)
✅ Answer: D (C preferred)
💡 Solution: For very large tables partitioning and dropping old partitions is fastest and minimal logging; without partitioning, batched deletes with index on ts recommended.

Q13.

users(id, email) with unique case-insensitive emails required. Best DB design choice?
A. Create unique index on LOWER(email) (functional unique index).
B. Use unique on email column without normalization.
C. Use trigger to check duplicates case-insensitively.
D. Store uppercase emails only.
✅ Answer: A (or D)
💡 Solution: Functional unique index on normalized email enforces case-insensitive uniqueness; storing normalized values also works.

Q14.

Given sales(id, price, qty), compute revenue and return only orders whose revenue equals maximum revenue. Choose correct SQL:
A. WITH r AS (SELECT id, price*qty rev FROM sales) SELECT id FROM r WHERE rev = (SELECT MAX(rev) FROM r);
B. SELECT id FROM sales WHERE price*qty = (SELECT MAX(price*qty) FROM sales);
C. Both A and B valid; B executes subexpression per row unless optimized; A materializes.
D. Neither works.
✅ Answer: C
💡 Solution: Both produce correct result; C explains performance nuance—CTE may materialize in some DBs.

Q15.

employees(eid, dept, sal) — which query returns departments where all employees earn > ₹40,000?
A. SELECT dept FROM employees GROUP BY dept HAVING MIN(sal) > 40000;
B. SELECT dept FROM employees WHERE sal > 40000;
C. SELECT DISTINCT dept FROM employees WHERE NOT EXISTS (SELECT 1 FROM employees e2 WHERE e2.dept=employees.dept AND e2.sal<=40000);
D. A and C equivalent
✅ Answer: D
💡 Solution: Using HAVING with MIN checks all salaries; NOT EXISTS is an alternative.

Q16.

Which query produces a materialized top-N per group efficiently (e.g., top 5 orders by amt per seller) in databases lacking advanced window frames?
A. Correlated subquery with SELECT COUNT(*) FROM orders o2 WHERE o2.seller=o1.seller AND o2.amt>o1.amt and filter <5
B. Self-join with group and filter using >= comparisons — complicated
C. Compute using temporary table with sorted partitions and ROW_NUMBER() emulation
D. A or C depending on DB; A portable but may be slow; C often preferred with window functions.
✅ Answer: D
💡 Solution: Correlated subquery technique is portable; window functions are better if available.

Q17.

SELECT COALESCE(AVG(val),0) FROM t; — what does it return if t empty?
A. 0
B. NULL
C. Error
D. Empty set
✅ Answer: A
💡 Solution: AVG(empty) = NULL; COALESCE replaces NULL with 0.

Q18.

Which index is best for queries: WHERE last_name = 'Kaur' AND first_name = 'Nita' ORDER BY created_at DESC LIMIT 1?
A. Composite index (last_name, first_name, created_at DESC)
B. Single index on created_at only
C. Separate indexes on names only
D. Hash index on names
✅ Answer: A
💡 Solution: Composite index matching WHERE columns and ORDER BY provides index-only scan for last record.

Q19.

Which SQL returns customers who placed orders in every month of 2023? (orders(ord_id, cust, ord_date))
A. SELECT cust FROM orders WHERE ord_date BETWEEN '2023-01-01' AND '2023-12-31' GROUP BY cust HAVING COUNT(DISTINCT EXTRACT(MONTH FROM ord_date)) = 12;
B. SELECT cust FROM orders GROUP BY cust HAVING COUNT(DISTINCT DATE_TRUNC('month', ord_date)) = 12;
C. Both A and B valid (dialect variation)
D. Use DIVIDE operator
✅ Answer: C
💡 Solution: Use DISTINCT month count equals 12; function differs by DB.

Q20.

Which query returns rows in A that have no match in B efficiently?
A. SELECT a.* FROM A LEFT JOIN B ON A.k=B.k WHERE B.k IS NULL;
B. SELECT * FROM A WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.k=A.k);
C. SELECT * FROM A WHERE A.k NOT IN (SELECT k FROM B); (danger with NULLs)
D. A or B are preferred; C can be wrong if B.k contains NULL.
✅ Answer: D
💡 Solution: LEFT JOIN+IS NULL or NOT EXISTS are safe anti-joins; NOT IN fails with NULLs in subquery.

Q21.

What does EXPLAIN ANALYZE provide compared to plain EXPLAIN in Postgres?
A. Actual execution timing and row counts (it executes the query)
B. Only estimated costs and plan trees
C. Syntax check only
D. It modifies tables
✅ Answer: A
💡 Solution: EXPLAIN ANALYZE runs the query and returns actual statistics.

Q22.

Which is correct to atomically increment a counter and get the new value in Postgres?
A. UPDATE counters SET val = val + 1 WHERE id=1 RETURNING val;
B. SELECT val FROM counters; UPDATE counters SET val = val + 1;
C. SELECT MAX(val)+1 FROM counters;
D. Use nextval() sequence function instead (preferred)
✅ Answer: D (A is valid but sequence is best for concurrency)
💡 Solution: Sequences are lockless and scalable; UPDATE … RETURNING works but can have contention.

Q23.

You have orders with status values; you want counts per status including statuses with zero orders (from table status_list). Which is correct?
A. SELECT s.status, COUNT(o.*) FROM status_list s LEFT JOIN orders o ON o.status=s.status GROUP BY s.status;
B. SELECT o.status, COUNT(*) FROM orders o RIGHT JOIN status_list s ON o.status=s.status GROUP BY o.status;
C. A and B equivalent if GROUP BY uses s.status properly — A is standard.
D. Use CROSS JOIN.
✅ Answer: C (A is best)
💡 Solution: LEFT JOIN from status_list ensures even zero-count statuses appear; use COUNT(o.*) or COUNT(o.status).

Q24.

Given t(a,b) with a unique, which DELETE removes rows where b duplicates exist and keeps only smallest a per b?
A. DELETE FROM t WHERE a NOT IN (SELECT MIN(a) FROM t GROUP BY b);
B. DELETE FROM t WHERE a IN (SELECT MIN(a) FROM t GROUP BY b);
C. DELETE FROM t WHERE EXISTS (SELECT 1 FROM t t2 WHERE t2.b=t.b AND t2.a < t.a);
D. A or C (A deletes non-min rows; C is correlated approach).
✅ Answer: D
💡 Solution: A retains min a per b; C deletes rows having smaller a present; B wrong.

Q25.

Which of these is true about NULL handling in aggregates?
A. COUNT(*) counts rows including NULLs; COUNT(column) counts non-NULL values only.
B. SUM(NULL) returns 0.
C. AVG counts NULL as zero.
D. MAX(NULL) returns zero.
✅ Answer: A
💡 Solution: Aggregates skip NULLs except COUNT(*).

Q26.

orders(ord, cust, total) — you want the median order total. Which technique is valid in Postgres?
A. SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY total) FROM orders;
B. SELECT AVG(total) FROM orders;
C. SELECT total FROM orders ORDER BY total LIMIT 1 OFFSET (SELECT COUNT(*)/2 FROM orders); (works for odd counts)
D. A or C (A handles continuous median; C is discrete approach)
✅ Answer: D
💡 Solution: percentile_cont is built-in in Postgres; OFFSET approach is manual and needs DISTINCT handling for ties.

Q27.

Which command creates an index that enforces uniqueness on lowercased email in Postgres?
A. CREATE UNIQUE INDEX idx_uq_email_lower ON users (LOWER(email));
B. CREATE INDEX idx_email ON users(email);
C. ALTER TABLE users ADD UNIQUE (email);
D. CREATE UNIQUE INDEX idx_uq_email_lower ON users (email COLLATE "C");
✅ Answer: A
💡 Solution: Functional unique index on LOWER(email) enforces case-insensitive uniqueness.

Q28.

SELECT * FROM A JOIN B USING (k) — how does USING differ from ON A.k=B.k?
A. USING will output single k column instead of two k columns.
B. USING does not allow NATURAL join.
C. USING duplicates join columns.
D. They are identical in projection.
✅ Answer: A
💡 Solution: USING coalesces join column in result.

Q29.

Which approach implements pagination efficiently for deep pages?
A. Keyset pagination (seek method) using WHERE on last seen key.
B. OFFSET … LIMIT large offset.
C. Use ORDER BY random().
D. Use nested correlated subqueries.
✅ Answer: A
💡 Solution: Keyset avoids skipping many rows and scales better; OFFSET degrades for deep pages.

Q30.

Which statement about FOREIGN KEY with ON DELETE SET NULL is correct?
A. Deleting parent sets child FK to NULL, preserving child row.
B. It deletes child rows.
C. It prevents parent deletion.
D. It duplicates child rows.
✅ Answer: A
💡 Solution: ON DELETE SET NULL sets FK to NULL on parent delete; useful when keeping history.

Q31.

Which SQL computes percentage share of each product’s sales over total sales (rounded to 2 decimals)?
A. SELECT product, ROUND(100.0*SUM(amount)/SUM(SUM(amount)) OVER (),2) pct FROM sales GROUP BY product;
B. SELECT product, SUM(amount)/SUM(amount) total FROM sales GROUP BY product;
C. SELECT product, 100*SUM(amount)/(SELECT SUM(amount) FROM sales) FROM sales GROUP BY product;
D. A and C both valid; A uses window function, C uses scalar subquery.
✅ Answer: D
💡 Solution: Both calculate percentage; A may be more efficient in single scan.

Q32.

Which is safer to avoid race conditions when inserting a row only if not exists (upsert) in Postgres?
A. INSERT ... ON CONFLICT (key) DO NOTHING
B. IF NOT EXISTS (SELECT 1 FROM t WHERE key=...) THEN INSERT in client — vulnerable to race
C. Use SELECT FOR UPDATE on key then insert
D. A or C (A preferred)
✅ Answer: D
💡 Solution: ON CONFLICT is atomic and preferred.

Q33.

Which query gives the number of distinct customers per day given orders(cust, odate)?
A. SELECT odate::date d, COUNT(DISTINCT cust) FROM orders GROUP BY d;
B. SELECT odate, COUNT(cust) FROM orders GROUP BY odate;
C. SELECT DISTINCT cust, odate FROM orders
D. SELECT COUNT(DISTINCT odate) FROM orders
✅ Answer: A
💡 Solution: Cast/truncate to date and count distinct cust per day.

Q34.

SELECT * FROM t WHERE id = ANY(ARRAY[1,2,3]) — what does = ANY mean?
A. Equivalent to IN (1,2,3)
B. Equivalent to = only first element
C. Is valid only in Oracle
D. Returns true if id equals all elements
✅ Answer: A
💡 Solution: = ANY(array) returns true if equal to any element; similar to IN.

Q35.

Which method yields a non-blocking bulk delete strategy?
A. Delete in small batches (e.g., 10k rows) committing between batches.
B. Single huge DELETE in single transaction.
C. TRUNCATE with WHERE condition.
D. Drop database.
✅ Answer: A
💡 Solution: Batch deletes reduce lock contention and log pressure.

Q36.

Which SQL returns rows where jsonb column data has key status = 'ok' (Postgres)?
A. SELECT * FROM t WHERE data->>'status' = 'ok';
B. SELECT * FROM t WHERE data @> '{"status":"ok"}';
C. Both A and B correct, B uses containment operator and may use GIN index.
D. SELECT * FROM t WHERE JSON_VALUE(data, '$.status') = 'ok' (non-Postgres)
✅ Answer: C
💡 Solution: Both match JSON value; @> supports GIN index on jsonb.

Q37.

Which is true about VARCHAR(n) length enforcement?
A. DB enforces at insert/update; exceeding values error/truncate depending on mode.
B. Always truncates silently.
C. Has no effect.
D. Only constraints do length checks.
✅ Answer: A
💡 Solution: Behavior depends on DB — many error, some truncate if configured.

Q38.

Given events(user_id, ts), find users with gaps > 7 days between consecutive events (per user): which helps?
A. Use LAG(ts) OVER (PARTITION BY user_id ORDER BY ts) and filter where ts - prev_ts > interval '7 days'.
B. Use correlated subquery to find previous ts per row — slower.
C. Window function is succinct and performant.
D. All of the above; A preferred.
✅ Answer: D (A preferred)
💡 Solution: LAG simplifies detecting gaps; correlated subquery works but slower.

Q39.

Which is true about VACUUM FULL in Postgres?
A. It rewrites the table and requires exclusive lock.
B. It is non-blocking.
C. It only updates statistics.
D. It deletes all indexes.
✅ Answer: A
💡 Solution: VACUUM FULL rewrites and locks table exclusively; VACUUM (non-FULL) is less intrusive.

Q40.

Which SQL pattern avoids duplicate insert on unique constraint violation and updates instead (Upsert) in Oracle?
A. MERGE INTO t USING (SELECT :k key, :v val FROM dual) src ON (t.key=src.key) WHEN MATCHED THEN UPDATE ... WHEN NOT MATCHED THEN INSERT ...;
B. INSERT ... ON CONFLICT (Postgres syntax) — invalid in Oracle native.
C. Both A is correct for Oracle.
D. Use stored procedure only.
✅ Answer: C (A for Oracle)
💡 Solution: MERGE is ANSI and supported in Oracle for upsert-like behavior.

Q41.

Which of the following prevents phantom reads?
A. Serializable isolation
B. Read Committed
C. Read Uncommitted
D. Snapshot isolation may or may not depending on DB semantics
✅ Answer: A
💡 Solution: Serializable prevents phantoms; snapshot/serializable distinctions vary.

Q42.

SELECT country, COUNT(*) FROM users GROUP BY country HAVING COUNT(*) > (SELECT AVG(cnt) FROM (SELECT COUNT(*) cnt FROM users GROUP BY country) t); — what does it compute?
A. Countries with above-average user counts (average across countries)
B. Invalid query
C. Countries with counts greater than global average per user
D. None
✅ Answer: A
💡 Solution: Inner subquery computes per-country counts then averages them; HAVING compares country count to that average.

Q43.

Which index is likely used for query WHERE UPPER(username) = 'ALICE'?
A. Functional index CREATE INDEX ix ON users (UPPER(username));
B. Normal index on username (may be used in some collations)
C. No index possible with UPPER unless functional index exists
D. All of above depending on DB and collation
✅ Answer: D (A is explicit)
💡 Solution: Functional index ensures use; some collations/ci indices may already provide case-insensitivity.

Q44.

Which SQL returns rows whose tags array (Postgres text[]) contains both 'x' and 'y'?
A. WHERE tags @> ARRAY['x','y']
B. WHERE 'x' = ANY(tags) AND 'y' = ANY(tags)
C. Both A and B valid; A uses containment operator and is indexable with GIN.
D. WHERE tags LIKE '%x%' AND tags LIKE '%y%'
✅ Answer: C
💡 Solution: @> is efficient with GIN index on arrays.

Q45.

Which is correct about CHECK vs triggers for integrity?
A. CHECK is declarative and faster; triggers are more flexible for cross-row/table checks.
B. Triggers always faster.
C. CHECK can validate other tables.
D. Triggers cannot prevent inserts.
✅ Answer: A
💡 Solution: Use CHECK for row-level constraints, triggers for complex validations.

Q46.

EXPLAIN shows repeated Seq Scan on a small base relation in nested loop join with large relation — why?
A. Planner chose nested loop; small inner repeated per outer row — may be costly if repeated scans instead of index.
B. Always optimal.
C. Because no join condition given.
D. Because the table is empty.
✅ Answer: A
💡 Solution: Nested loop may repeatedly scan small relation if not cached/indexed; consider hash join or index for inner scan.

Q47.

Which SQL counts rows satisfying condition across partitions without GROUP BY?
A. COUNT(*) FILTER (WHERE status='ok') (Postgres) along with COUNT(*) — aggregate with FILTER clause.
B. SUM(CASE WHEN status='ok' THEN 1 ELSE 0 END) — portable.
C. Both A and B valid; B portable.
D. Use window functions only.
✅ Answer: C
💡 Solution: FILTER is concise in some DBs; CASE works everywhere.

Q48.

You have transactions(txn_id, acct, amount) and want to detect duplicates where same acct and amount occur within 5 seconds. Best approach?
A. Use window function LAG partitioned by acct ordered by ts and check ts - prev_ts <= interval '5 seconds' AND amount=prev_amount.
B. Correlated subquery for each row — slower.
C. Self-join with time range conditions — possible but heavier.
D. A preferred for clarity and efficiency.
✅ Answer: D
💡 Solution: LAG provides prior row for each account enabling O(n) detection.

Q49.

Which is correct when creating many indexes before bulk load?
A. Defer index creation until after load for faster bulk insertion (build index once).
B. Create indexes before loading — always faster.
C. Index creation order irrelevant.
D. Indexes don’t affect load performance.
✅ Answer: A
💡 Solution: Building indexes after large insert is usually faster than maintaining them per insert.

Q50.

Which SQL returns rows with pivot (products as columns) for fixed small set ('P1','P2','P3')?
A. SELECT cust, SUM(CASE WHEN product='P1' THEN qty ELSE 0 END) AS P1, SUM(CASE WHEN product='P2' THEN qty ELSE 0 END) AS P2, SUM(CASE WHEN product='P3' THEN qty ELSE 0 END) AS P3 FROM sales GROUP BY cust;
B. USE PIVOT syntax only available in some DBs — not portable.
C. A is portable and correct.
D. None of the above
✅ Answer: C
💡 Solution: CASE + SUM is portable pivot technique for fixed set.

Q51.

Table Payments(pid, user_id, amt, paid_on) — find users whose average payment strictly exceeds ₹2,500 and have made at least 4 payments. Which SQL is correct?
A. SELECT user_id FROM Payments GROUP BY user_id HAVING AVG(amt)>2500 AND COUNT(*)>=4;
B. SELECT user_id FROM Payments WHERE AVG(amt)>2500 AND COUNT(*)>=4 GROUP BY user_id;
C. SELECT DISTINCT user_id FROM Payments WHERE AVG(amt)>2500;
D. SELECT user_id FROM Payments GROUP BY user_id HAVING SUM(amt)/COUNT(*)>2500 AND COUNT(*)>=4;
✅ Answer: A
💡 Solution: HAVING filters grouped results; D is equivalent to A; B/C misuse aggregate in WHERE or missing GROUP BY.

Q52.

You want to delete rows from Session(id, user_id, started_at) keeping only the most recent session per user. Which DELETE works in Postgres?
A. DELETE FROM Session s WHERE s.id NOT IN (SELECT id FROM (SELECT id, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY started_at DESC) rn FROM Session) t WHERE rn=1);
B. DELETE FROM Session WHERE id NOT IN (SELECT MAX(id) FROM Session GROUP BY user_id);
C. TRUNCATE Session;
D. Both A and B workable if id ordering aligns with started_at; A safest (explicit).
✅ Answer: D (A safest)
💡 Solution: A uses ROW_NUMBER to keep recent per user; B assumes MAX(id) equals most recent — may hold if id monotonic.

Q53.

Which expression gives the difference between two tables A and B (rows in A not in B) in SQL standard?
A. SELECT * FROM A EXCEPT SELECT * FROM B
B. SELECT * FROM A WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.key=A.key)
C. SELECT * FROM A LEFT JOIN B ON ... WHERE B.key IS NULL
D. All of the above (semantically equivalent modulo duplicates/NULL behavior)
✅ Answer: D
💡 Solution: EXCEPT removes duplicates; NOT EXISTS and LEFT JOIN/IS NULL implement anti-join; semantics differ slightly with NULLs and duplicates.

Q54.

Given inventory(product_id, qty) and sales(product_id, sold_qty) (multiple sales rows), compute ending stock per product: inventory.qty - SUM(sales.sold_qty). Which SQL is correct to include products with zero sales?
A. SELECT i.product_id, i.qty - COALESCE(SUM(s.sold_qty),0) AS remaining FROM inventory i LEFT JOIN sales s ON i.product_id=s.product_id GROUP BY i.product_id, i.qty;
B. SELECT i.product_id, i.qty - SUM(s.sold_qty) FROM inventory i JOIN sales s ON ... GROUP BY i.product_id;
C. Use RIGHT JOIN from sales to inventory
D. A or B equivalent
✅ Answer: A
💡 Solution: LEFT JOIN keeps inventory-only products; COALESCE handles NULL sums.

Q55.

Which optimizer hint or technique is useful when planner underestimates row counts causing bad join order?
A. Increase statistics (ANALYZE), adjust planner statistics target for skewed columns.
B. Use hints to force join order if available.
C. Create better indexes or histograms.
D. All of the above.
✅ Answer: D
💡 Solution: Updating stats and using hints or indexes help planner choose better plans.

Q56.

Which SQL finds customers who ordered both product 101 and 102? (orders(cust, pid))
A. SELECT cust FROM orders WHERE pid IN (101,102) GROUP BY cust HAVING COUNT(DISTINCT pid)=2;
B. SELECT cust FROM orders WHERE pid=101 INTERSECT SELECT cust FROM orders WHERE pid=102;
C. SELECT o1.cust FROM orders o1 JOIN orders o2 ON o1.cust=o2.cust WHERE o1.pid=101 AND o2.pid=102;
D. All of the above (equivalent)
✅ Answer: D
💡 Solution: All forms return customers who ordered both; A and C common; INTERSECT also valid.

Q57.

Which SQL returns the number of orders placed each weekday (Mon–Sun) using order_ts?
A. SELECT EXTRACT(DOW FROM order_ts) AS dow, COUNT(*) FROM orders GROUP BY dow ORDER BY dow;
B. SELECT TO_CHAR(order_ts,'Day'), COUNT(*) FROM orders GROUP BY 1;
C. Both A and B depending on DB (extract vs to_char)
D. Use weekday() function only
✅ Answer: C
💡 Solution: Use built-in date functions; syntax varies by system.

Q58.

Which is true about creating a covering index for query SELECT name, age FROM people WHERE city='Pune' ORDER BY age?
A. Composite index (city, age, name) is covering and supports ordering by age for that filter.
B. Index (age) only suffices.
C. Index on city only is enough.
D. Covering index cannot include name.
✅ Answer: A
💡 Solution: Index containing all needed columns avoids table lookup and supports ORDER BY on included columns.

Q59.

users(id, last_login) want to find users inactive for > 180 days and lock them for update to process. Which SQL pattern is safe for concurrency?
A. SELECT id FROM users WHERE last_login < now() - interval '180 days' FOR UPDATE SKIP LOCKED; then process rows one worker at a time.
B. SELECT ... then update without locking.
C. TRUNCATE inactive users.
D. Use FOR SHARE only.
✅ Answer: A
💡 Solution: FOR UPDATE SKIP LOCKED allows multiple workers to safely claim rows without blocking.

Q60.

Which SQL returns the cumulative count of purchases per customer ordered by purchase date?
A. SELECT customer, ts, COUNT(*) OVER (PARTITION BY customer ORDER BY ts ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum_count FROM purchases;
B. SELECT customer, ts, SUM(1) OVER (PARTITION BY customer ORDER BY ts)
C. Both A and B equivalent (SUM(1) or COUNT()) D. Use correlated subquery only ✅ Answer: C 💡 Solution: COUNT() or SUM(1) with window frame produce running count.

Q61.

What does CREATE UNIQUE INDEX ix ON t (a) WHERE deleted_at IS NULL; do?
A. Enforces uniqueness only for rows not soft-deleted (deleted_at IS NULL).
B. Enforces uniqueness for all rows.
C. Partial unique index only in some DBs.
D. A and C (Postgres supports partial unique indexes).
✅ Answer: D
💡 Solution: Partial unique index enforces uniqueness only for rows that satisfy predicate.

Q62.

Which is best to detect data drift where hash column stored per row compared to recomputed hash?
A. SELECT id FROM t WHERE stored_hash <> MD5(col1 || '|' || col2);
B. Compare all columns directly with EXCEPT or row-wise equality.
C. Use FULL OUTER JOIN against snapshot table and find mismatches.
D. A is efficient but collision risk; B/C are exact.
✅ Answer: D
💡 Solution: Hash speeds detection but has collision risk; exact comparison is definitive.

Q63.

Which SQL returns the first non-null value among columns c1, c2, c3 for each row portably?
A. COALESCE(c1, c2, c3)
B. NVL(c1, NVL(c2, c3)) (Oracle)
C. CASE WHEN c1 IS NOT NULL THEN c1 WHEN c2 IS NOT NULL THEN c2 ELSE c3 END
D. A, B, or C depending on DB; COALESCE is ANSI.
✅ Answer: D
💡 Solution: COALESCE is standard; NVL Oracle-specific; CASE is portable.

Q64.

Which is true about WITH (NOLOCK) in SQL Server?
A. It allows dirty reads (reading uncommitted data) and may return inconsistent results.
B. It guarantees snapshot isolation.
C. It is ANSI standard.
D. It prevents locks being taken on tables ever.
✅ Answer: A
💡 Solution: NOLOCK hints read uncommitted rows; use snapshot isolation if consistency needed.

Q65.

Which SQL computes distinct counts across groups efficiently for large datasets?
A. Use approximate distinct (e.g., count(distinct) heavy; hyperloglog/approx_count_distinct for speed)
B. Always use COUNT(DISTINCT) exact — no approximations allowed
C. Use GROUP BY with materialized intermediate tables only
D. Use joins only
✅ Answer: A
💡 Solution: Approximate algorithms (HLL) scale well; exact COUNT(DISTINCT) can be expensive.

Q66.

Which pattern prevents deadlocks when two transactions update rows in two tables?
A. Acquire locks in same order in both transactions (e.g., always lock table A then table B).
B. Randomize order.
C. Use autocommit to avoid locks.
D. Use separate databases.
✅ Answer: A
💡 Solution: Consistent locking order reduces possibility of circular waits.

Q67.

SELECT * FROM events WHERE attrs @> '{"status":"ok"}' uses what operator in Postgres?
A. @> jsonb containment operator (true if left contains right)
B. -> arrow operator
C. #> path operator
D. ->> text extraction
✅ Answer: A
💡 Solution: @> checks if jsonb contains the specified object; efficient with GIN index.

Q68.

Which SQL gives the number of active users in last 30 days rolling window by date?
A. SELECT d::date day, COUNT(DISTINCT user_id) FROM events WHERE ts >= d - interval '29 days' GROUP BY day using generate_series for days — heavy but possible
B. Use window aggregates over daily unique counts with cumulative union — complex
C. Materialize per-day distinct sets and compute unions incrementally — scalable approach
D. All are possible; C preferred for performance at scale.
✅ Answer: D
💡 Solution: Rolling distinct counts are expensive; incremental/materialized approaches or approximations are practical at scale.

Q69.

Which of these ensures a column email is case-insensitively unique in MySQL 8 using utf8mb4?
A. Use VARCHAR with COLLATE utf8mb4_general_ci and UNIQUE constraint
B. Create functional index on LOWER(email) (MySQL does not support expression index before v8.0.13?)
C. Normalize and store lowercase emails, enforce UNIQUE on column
D. A or C depending on configuration; normalization C is portable.
✅ Answer: D
💡 Solution: Collation-based uniqueness or normalized stored lowercase with unique constraint are viable.

Q70.

employees(id, dept, salary) — get dept-wise percentile (90th) of salary. Which SQL is correct in Postgres?
A. SELECT dept, percentile_cont(0.9) WITHIN GROUP (ORDER BY salary) FROM employees GROUP BY dept;
B. SELECT dept, MAX(salary) FROM employees GROUP BY dept;
C. SELECT dept, PERCENTILE_DISC(0.9) FROM employees GROUP BY dept; — alternative discrete percentile
D. A or C depending on needed behavior (continuous vs discrete).
✅ Answer: D
💡 Solution: percentile_cont interpolates; percentile_disc picks actual value; choose per requirement.

Q71.

Which SQL returns rows with duplicate email values (all duplicates) in users?
A. SELECT u.* FROM users u JOIN (SELECT email FROM users GROUP BY email HAVING COUNT(*)>1) d ON u.email=d.email;
B. SELECT * FROM users GROUP BY email HAVING COUNT(*)>1; (invalid selects)
C. SELECT DISTINCT email FROM users HAVING COUNT(*)>1;
D. Only A is correct.
✅ Answer: A
💡 Solution: Subquery finds duplicate emails; join returns all rows with those emails.

Q72.

Which SQL generates a sequence of dates between two dates inclusive in Postgres?
A. SELECT generate_series('2024-01-01'::date,'2024-01-31'::date,'1 day')::date;
B. Use recursive CTE to iterate dates — portable alternative
C. Both A (Postgres built-in) and B (portable) valid
D. No method exists
✅ Answer: C
💡 Solution: generate_series handy in Postgres; recursive CTE works across DBs.

Q73.

Which is correct to add a column with default without rewriting huge table in Postgres 12+?
A. ALTER TABLE t ADD COLUMN x int DEFAULT 0; (fast in modern versions due to metadata-only default)
B. Previously, would cause rewrite; two-step approach may be used in older versions.
C. Both A and B depending on Postgres version.
D. Only CREATE NEW TABLE approach works.
✅ Answer: C
💡 Solution: Newer Postgres optimizes metadata-only defaults; older required rewrite.

Q74.

Which SQL returns customers whose last order date equals maximum last order date across all customers?
A. WITH last AS (SELECT cust, MAX(order_date) last_dt FROM orders GROUP BY cust) SELECT cust FROM last WHERE last_dt = (SELECT MAX(last_dt) FROM last);
B. SELECT cust FROM orders GROUP BY cust HAVING MAX(order_date) = (SELECT MAX(MAX(order_date)) FROM orders GROUP BY cust);
C. Both A and B work; A clearer.
D. None
✅ Answer: C
💡 Solution: Both yield customers with most recent last order(s).

Q75.

Which method is best to handle slowly changing dimensions type 2 (SCD2) in a data warehouse?
A. Keep history rows with effective_from, effective_to and current_flag, manage updates via ETL upsert logic.
B. Overwrite previous rows only.
C. Use only current table — no history.
D. Use triggers to automatically maintain SCD2 without ETL.
✅ Answer: A
💡 Solution: SCD2 maintains history with ranges and flags; ETL typically manages updates.

Q76.

Which SQL returns for each product the next higher priced product (lead) using window function?
A. SELECT product, price, LEAD(product) OVER (ORDER BY price) next_product FROM products; — lead gives next row’s product in order
B. SELECT product, price, MIN(p2.product) FROM products p LEFT JOIN products p2 ON p2.price > p.price GROUP BY p.product; — wrong; should use MIN(p2.price) and join back
C. A is concise and correct; B needs refinement.
D. Both invalid.
✅ Answer: C (A correct)
💡 Solution: LEAD over ORDER BY price returns next row (ties handled per ordering).

Q77.

Which is correct to safely migrate large table big_table to new schema with minimal downtime?
A. Create new table big_table_new with schema, copy rows in background in batches, keep triggers on old to mirror new, then switch via rename once caught up.
B. Drop and recreate table during maintenance window only.
C. Insert rows into new table via single transaction.
D. All approaches equal.
✅ Answer: A
💡 Solution: Blue-green style copy with triggers or logical replication minimizes downtime.

Q78.

Which SQL finds first order (earliest) per customer including the order id?
A. SELECT DISTINCT ON (cust) cust, ord_id, order_date FROM orders ORDER BY cust, order_date; — Postgres-specific DISTINCT ON
B. SELECT o.* FROM orders o WHERE order_date = (SELECT MIN(order_date) FROM orders o2 WHERE o2.cust=o.cust); — portable correlated subquery
C. Use ROW_NUMBER() OVER (PARTITION BY cust ORDER BY order_date) rn and filter rn=1 — preferred ANSI approach
D. All of the above valid depending on dialect.
✅ Answer: D
💡 Solution: Use window functions or correlated subqueries; DISTINCT ON is Postgres shortcut.

Q79.

Which statement about EXPLAIN‘s estimated row counts vs actual is true?
A. Large discrepancies indicate outdated statistics or wrong assumptions causing bad plans.
B. They always match.
C. Only actual matters; estimated ignored.
D. Estimated rows equal actual rows always in modern DBs.
✅ Answer: A
💡 Solution: If estimates are off, optimizer may pick suboptimal plan; update stats or adjust histograms.

Q80.

Which is correct to detect “users who increased monthly spend by > 50% compared to previous month”?
A. Use window LAG of monthly sums per user and compare current_sum > 1.5 * prev_sum.
B. Join aggregated current month and previous month on user and compare.
C. Both A and B valid; A concise with window functions.
D. Use triggers only.
✅ Answer: C
💡 Solution: Window or self-join of monthly aggregates both accomplish comparison.

Q81.

Which SQL ensures consistent ordering with LIMIT across runs?
A. Always specify ORDER BY with deterministic columns, e.g., ORDER BY created_at DESC, id DESC.
B. LIMIT without ORDER BY is stable.
C. ORDER BY random is deterministic.
D. Use GROUP BY instead.
✅ Answer: A
💡 Solution: LIMIT without ORDER BY yields non-deterministic set order.

Q82.

Which is true about ROW_NUMBER() vs RANK() when partition has ties?
A. ROW_NUMBER() assigns unique sequential numbers, breaking ties arbitrarily or by ORDER BY tie-breakers; RANK() assigns same rank to ties and leaves gaps.
B. Both assign same numbers.
C. RANK() and DENSE_RANK() identical.
D. ROW_NUMBER() returns duplicates on ties.
✅ Answer: A
💡 Solution: Understand ranking functions for top-N per group behavior.

Q83.

Which SQL computes percentage growth month-over-month for revenue per product?
A. Use SUM(amount) OVER (PARTITION BY product ORDER BY month) AS month_sum then LAG(month_sum) and compute ratio (month_sum - prev)/prev * 100.
B. Use correlated subquery comparing current month to previous month sums.
C. Both A and B valid; A preferred.
D. Use UNION of monthly totals only.
✅ Answer: C
💡 Solution: Window functions simplify MoM growth calculations.

Q84.

Which approach helps avoid table bloat from frequent updates in Postgres?
A. Use VACUUM regularly and consider HOT updates if possible.
B. Rebuild table daily only.
C. Disable autovacuum always.
D. Use TRUNCATE periodically.
✅ Answer: A
💡 Solution: Autovacuum and planning HOT updates reduce bloat.

Q85.

Which SQL checks referential integrity across two existing tables where FK was missing?
A. SELECT child.* FROM child LEFT JOIN parent p ON child.parent_id=p.id WHERE p.id IS NULL; to find violating rows.
B. Add FOREIGN KEY constraint directly — fails if violations exist.
C. Clean violations then add FK.
D. Steps A then C then B recommended.
✅ Answer: D
💡 Solution: Identify violations, fix data, then add constraints.

Q86.

Which SQL returns top N products by total sold across dynamic date window last 7 days?
A. SELECT pid, SUM(qty) total FROM sales WHERE sale_ts >= now()-interval '7 days' GROUP BY pid ORDER BY total DESC LIMIT 10;
B. Pre-aggregate into materialized view per day and query for last 7 days — more performant at scale.
C. Both A for simplicity and B for performance are valid.
D. None
✅ Answer: C
💡 Solution: Simple query fine for small data; pre-aggregation scales better.

Q87.

What does SELECT * FROM A CROSS JOIN B produce?
A. Cartesian product (all combinations)
B. Inner join with conditions
C. Outer join semantics
D. Error if no ON clause present
✅ Answer: A
💡 Solution: CROSS JOIN pairs every row from A with every row from B.

Q88.

Which is true about INDEX on JSONB path expressions in Postgres?
A. Use GIN index on jsonb col for containment queries (e.g., @>).
B. B-tree index supports complex JSON path queries.
C. No indexes can help JSON.
D. Use trigram indexes only.
✅ Answer: A
💡 Solution: GIN with jsonb supports efficient containment and existence checks.

Q89.

You have payments(txn_id, user_id, amount) — detect duplicate payments with same amount and same user within 10 seconds. Which SQL is efficient?
A. SELECT p1.* FROM payments p1 JOIN payments p2 ON p1.user_id=p2.user_id AND p1.txn_id<>p2.txn_id AND ABS(EXTRACT(EPOCH FROM (p1.ts-p2.ts))) <= 10 AND p1.amount=p2.amount;
B. Use window function with LAG partitioned by user and compare time difference and amount equality.
C. Both A and B valid; B typically more concise and efficient.
D. Use LIKE.
✅ Answer: C
💡 Solution: LAG avoids self-join overhead when duplicates are near each other chronologically.

Q90.

Which SQL returns rows in T with at least one non-NULL among c1,c2,c3?
A. SELECT * FROM T WHERE COALESCE(c1,c2,c3) IS NOT NULL;
B. SELECT * FROM T WHERE c1 IS NOT NULL OR c2 IS NOT NULL OR c3 IS NOT NULL;
C. Both A and B equivalent.
D. SELECT * FROM T WHERE NULLIF(c1,c2,c3) — invalid.
✅ Answer: C
💡 Solution: COALESCE returning non-null indicates at least one non-null; explicit OR is equivalent.

Q91.

Which is true about MERGE and INSERT ... ON CONFLICT semantics?
A. MERGE supports WHEN MATCHED/NOT MATCHED clauses for update/insert; ON CONFLICT (Postgres) is upsert with conflict target.
B. MERGE always faster than ON CONFLICT.
C. MERGE cannot delete.
D. ON CONFLICT not transactional.
✅ Answer: A
💡 Solution: Both provide upsert capabilities; syntax differs across DBs.

Q92.

Which SQL will return customers who have orders totaling strictly more than average order total across all customers?
A. SELECT cust FROM (SELECT cust, SUM(total) s FROM orders GROUP BY cust) t WHERE s > (SELECT AVG(s) FROM (SELECT SUM(total) s FROM orders GROUP BY cust) x);
B. SELECT cust FROM orders GROUP BY cust HAVING SUM(total) > AVG(SUM(total)) — invalid aggregate nesting
C. A is correct.
D. Both A and B correct.
✅ Answer: C
💡 Solution: Use subquery to compute per-customer sums and compare to average of those sums.

Q93.

Which is correct about TEMPORARY vs UNLOGGED tables in Postgres?
A. Temporary tables are session-scoped; unlogged tables are persistent across sessions but not WAL-logged.
B. Unlogged tables are per-session.
C. Temporary tables are always faster.
D. Unlogged tables replicate automatically.
✅ Answer: A
💡 Solution: Unlogged are not WAL-logged (faster but not crash-safe); temporary are session-local.

Q94.

Which SQL returns users who never logged in (no rows in logins table)?
A. SELECT u.* FROM users u LEFT JOIN logins l ON l.user_id=u.id WHERE l.user_id IS NULL;
B. SELECT * FROM users WHERE id NOT IN (SELECT user_id FROM logins); — watch NULLs in subquery
C. SELECT * FROM users WHERE NOT EXISTS (SELECT 1 FROM logins WHERE logins.user_id = users.id);
D. A and C safe; B may be unsafe if logins.user_id contains NULL.
✅ Answer: D
💡 Solution: LEFT JOIN+IS NULL or NOT EXISTS are robust anti-join patterns.

Q95.

Which SQL returns the top seller per month (ties broken by product id) using window functions?
A. SELECT * FROM (SELECT month, product, SUM(qty) total, ROW_NUMBER() OVER (PARTITION BY month ORDER BY SUM(qty) DESC, product ASC) rn FROM sales GROUP BY month, product) t WHERE rn=1;
B. Use RANK() instead of ROW_NUMBER() to include ties.
C. Both A and B depending on tie-handling.
D. Use DISTINCT ON only.
✅ Answer: C (A for single winner per month; B if ties allowed)
💡 Solution: ROW_NUMBER yields one winner; RANK retains ties.

Q96.

Which index supports full-text search queries to_tsvector(content) @@ to_tsquery('foo & bar') in Postgres?
A. CREATE INDEX idx_content ON docs USING GIN (to_tsvector('english', content));
B. B-tree index on content.
C. Hash index on content.
D. No index helps full-text.
✅ Answer: A
💡 Solution: GIN index on tsvector supports fast full-text searches.

Q97.

Which approach can reduce write amplification in high-update tables?
A. Use narrower rows and avoid frequent updates to same row (e.g., append-only + compaction).
B. Add many indexes (increases write amplification).
C. Use large VARCHAR columns updated frequently.
D. Always use triggers to rewrite rows.
✅ Answer: A
💡 Solution: Minimize hot updates and use append/log strategies to lessen write amplification.

Q98.

Which SQL returns all products never ordered? (products(pid), order_items(pid))
A. SELECT p.* FROM products p LEFT JOIN order_items oi ON oi.pid=p.pid WHERE oi.pid IS NULL;
B. SELECT * FROM products WHERE pid NOT IN (SELECT pid FROM order_items); — careful with NULLs in subquery result
C. SELECT * FROM products p WHERE NOT EXISTS (SELECT 1 FROM order_items oi WHERE oi.pid=p.pid);
D. A and C safe; B works if order_items.pid NOT NULL.
✅ Answer: D
💡 Solution: LEFT JOIN/IS NULL or NOT EXISTS are safe anti-join methods.

Q99.

Which SQL returns the mode (most frequent) value of column col?
A. SELECT col FROM t GROUP BY col ORDER BY COUNT(*) DESC LIMIT 1;
B. SELECT col FROM (SELECT col, ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) rn FROM t GROUP BY col) x WHERE rn=1; (invalid window placement)
C. SELECT col FROM t WHERE col IS NOT NULL
D. A is simplest correct.
✅ Answer: A
💡 Solution: GROUP BY with ORDER BY COUNT(*) gives most frequent; LIMIT 1 picks mode (ties require tie-handling).

Q100.

Which method most effectively enforces multi-column uniqueness but allowing one of the columns to be NULL (and treating NULLs as equal) in Postgres?
A. Create functional index replacing NULL with sentinel: CREATE UNIQUE INDEX ix ON t (COALESCE(col1, -1), COALESCE(col2, -1));
B. Use UNIQUE(col1, col2) — this allows multiple NULLs and treats NULL not equal, so not enforce-equal NULLs.
C. Use partial unique indexes per NULL combination.
D. A or C depending on desired semantics; A normalizes NULLs to sentinel to enforce equality.
✅ Answer: D
💡 Solution: COALESCE converts NULL to sentinel value enabling uniqueness; partial unique indexes are alternative.