SOQL is the backbone of data access in Salesforce, but poorly written queries can quickly exhaust your governor limits and degrade performance at scale. Understanding how Salesforce processes queries and optimizes storage is essential for building production-ready applications.
Selective Queries
A query is considered selective when it uses indexed fields in the WHERE clause and returns a small percentage of total records. Non-selective queries perform full table scans and will fail on large orgs (typically 200,000+ records).
Selectivity Rule: A query is selective if it returns fewer than 10% of total records (up to 333,333) OR fewer than 200,000 records from a non-standard object. Always filter on indexed fields.
// ❌ Non-selective — scans every row
SELECT Id, Name, Email
FROM Contact
WHERE Description = 'VIP Customer'
// ✅ Selective — filters on standard indexed field
SELECT Id, Name, Email
FROM Contact
WHERE AccountId = '001XXXXXXXXXXXXXXX'
AND IsDeleted = false
// ✅ Selective — using external ID custom index
SELECT Id, Name
FROM Account
WHERE External_Id__c = 'EXT-12345'
Indexing Strategy
Salesforce automatically indexes certain fields. For custom fields, you can request custom indexes through Salesforce Support, or mark them as External ID or Unique to gain index benefits.
Auto-Indexed Fields
Id, Name, OwnerId, CreatedDate, SystemModstamp, RecordTypeId, and all master-detail/lookup foreign keys are indexed by default.
Custom Index Fields
Fields marked as External ID or Unique automatically get an index. For other critical filter fields, request a custom index from Salesforce Support.
Index Exclusions: Indexes are NOT used when filtering on NULL values (field = null), when using negative operators (!=, NOT IN, NOT LIKE), or when a formula field references a non-indexed field.
Relationship Queries
SOQL supports two types of relationship queries. Use them to reduce the number of total queries and avoid hitting the 100 SOQL limit.
// Fetch Accounts with their related Contacts in one query
SELECT
Id,
Name,
AnnualRevenue,
(SELECT Id, FirstName, LastName, Email
FROM Contacts
WHERE IsDeleted = false
ORDER BY LastName ASC
LIMIT 200)
FROM Account
WHERE Industry = 'Technology'
AND AnnualRevenue > 1000000
// Traverse up to 5 levels of parent relationships
SELECT
Id,
Subject,
Account.Name,
Account.Owner.Name,
Account.Owner.Profile.Name
FROM Case
WHERE Status = 'Open'
AND Account.Industry = 'Healthcare'
// Access sub-query results in Apex
for (Account acc : [SELECT Id, Name, (SELECT Id, Email FROM Contacts) FROM Account]) {
List<Contact> contacts = acc.Contacts;
System.debug(acc.Name + ' has ' + contacts.size() + ' contacts');
}
Aggregate Functions
Use aggregate SOQL to perform calculations database-side instead of pulling thousands of records into Apex heap memory.
// ❌ Inefficient — loads all records into heap
Decimal total = 0;
for (Opportunity opp : [SELECT Amount FROM Opportunity WHERE StageName = 'Closed Won']) {
total += opp.Amount;
}
// ✅ Efficient — calculates at the database level
AggregateResult[] results = [
SELECT
StageName,
SUM(Amount) totalAmount,
COUNT(Id) dealCount,
AVG(Amount) avgDeal,
MAX(Amount) biggestDeal
FROM Opportunity
WHERE CloseDate = THIS_YEAR
GROUP BY StageName
HAVING SUM(Amount) > 50000
ORDER BY SUM(Amount) DESC
];
for (AggregateResult ar : results) {
System.debug(ar.get('StageName') + ': $' + ar.get('totalAmount'));
}
Large Data Volumes
When working with millions of records, standard queries fail. Use LIMIT/OFFSET, last-modified-date filters, or SOQL for loops to process records in batches.
SOQL For Loop: Automatically chunks results into 200-record batches, keeping heap usage low. Always prefer this over storing all query results in a List when processing large datasets.
// ❌ Loads entire result set into memory — dangerous at scale
List<Contact> allContacts = [
SELECT Id, Email FROM Contact WHERE AccountId != null
];
// ✅ SOQL for loop — processes 200 records at a time
for (List<Contact> batch : [
SELECT Id, Email, AccountId
FROM Contact
WHERE AccountId != null
ORDER BY SystemModstamp ASC
]) {
// Each 'batch' is a List<Contact> of up to 200 records
List<Contact> toUpdate = new List<Contact>();
for (Contact c : batch) {
if (c.Email == null) {
c.Email = c.AccountId + '@placeholder.com';
toUpdate.add(c);
}
}
if (!toUpdate.isEmpty()) {
update toUpdate;
}
}
// Sync only recently modified records instead of everything
SELECT Id, Name, Phone, BillingCity
FROM Account
WHERE SystemModstamp > LAST_N_HOURS:24
ORDER BY SystemModstamp ASC
LIMIT 2000
// Common date literals
// TODAY, YESTERDAY, THIS_WEEK, LAST_WEEK
// THIS_MONTH, LAST_MONTH, THIS_YEAR, LAST_YEAR
// LAST_N_DAYS:n, NEXT_N_DAYS:n
// LAST_N_WEEKS:n, LAST_N_HOURS:n
Using the Query Plan Tool
The Query Plan tool in the Developer Console helps you understand how Salesforce will execute your query before running it in production. Enable it via Help > Preferences > Enable Query Plan.