DEFINE INDEX
statement
Just like in other databases, SurrealDB uses indexes to help optimize query performance. An index can consist of one or more fields in a table and can enforce a uniqueness constraint. If you don't intend for your index to have a uniqueness constraint, then the fields you select for your index should have a high degree of cardinality, meaning that there is a high amount of diversity between the data in the indexed table records.
Requirements
- You must be authenticated as a root or namespace user before you can use the
DEFINE INDEX
statement. - You must select your namespace and database before you can use the
DEFINE INDEX
statement.
Statement syntax
SurrealQL SyntaxDEFINE INDEX [ IF NOT EXISTS ] @name ON [ TABLE ] @table [ FIELDS | COLUMNS ] @fields
[ UNIQUE | SEARCH ANALYZER @analyzer [ BM25 [(@k1, @b)] ] [ HIGHLIGHTS ] [ MTREE DIMENSION ] @dimension ] [ COMMENT @string ]
Index Types
SurrealDB offers a range of indexing capabilities designed to optimize data retrieval and search efficiency.
Unique Index
Ensures each value in the index is unique. A unique index helps enforce uniqueness across records by preventing duplicate entries in fields such as user IDs, email addresses, and other unique identifiers.
Let's create a unique index for the email address field on a user table.
-- Makes sure that the email address in the user table is always unique
DEFINE INDEX userEmailIndex ON TABLE user COLUMNS email UNIQUE;
The created index can be tested using the INFO
statement.
INFO FOR TABLE user;
The INFO
statement will help you understand what indexes are defined in your TABLE
.
{
"events": {},
"fields": {},
"indexes": {
"userEmailIndex": "DEFINE INDEX userEmailIndex ON user FIELDS email UNIQUE"
},
"lives": {},
"tables": {}
}
As we defined a UNIQUE
index on the email
column, a duplicate entry for that column or field will throw an error.
-- Create a user record and set an email ID.
CREATE user:1 SET email = 'test@surrealdb.com';
Response[
{
"email": "test@surrealdb.com",
"id": "user:1"
}
]
Creating another record with the same email ID will throw an error.
-- Create another user record and set the same email ID.
CREATE user:2 SET email = 'test@surrealdb.com';
ResponseDatabase index `userEmailIndex` already contains 'test@surrealdb.com',
with record `user:1`
To set the same email for user:2
, the original record must be deleted
DELETE user:1;
CREATE user:2 SET email = 'test@surrealdb.com'
[
{
"email": "test@surrealdb.com",
"id": "user:2"
}
]
Non-Unique Index
This allows for the indexing of attributes that may have non-unique values, facilitating efficient data retrieval. Non-unique indexes help index frequently appearing data in queries that do not require uniqueness, such as categorization tags or status indicators.
Let's create a non-unique index for an age field on a user table.
-- optimise queries looking for users of a given age
DEFINE INDEX userAgeIndex ON TABLE user COLUMNS age;
Composite Index
The composite index spans multiple fields and columns of a table. Similar to a single field index, composite indexes can also be UNIQUE
-- Create an index on the account and email fields of the user table
DEFINE INDEX test ON user FIELDS account, email;
Full-Text Search Index
Enables efficient searching through textual data, supporting advanced text-matching features like proximity searches and keyword highlighting.
The Full-Text search index helps implement comprehensive search functionalities in applications, such as searching through articles, product descriptions, and user-generated content.
Let's create a full-text search index for a name
field on a user
table.
-- Allow full-text search queries on the name of the user
DEFINE INDEX userNameIndex ON TABLE user COLUMNS name SEARCH ANALYZER ascii BM25 HIGHLIGHTS;
SEARCH
: By using theSEARCH
keyword, you enable full-text search on the specified column.ANALYZER ascii
: Uses a custom analyzer calledascii
for tokenizing and analyzing the text.BM25
: Ranking algorithm used for relevance scoring.HIGHLIGHTS
: Allows keyword highlighting in search results output when using thesearch::highlight
function
Vector Search Indexes
Vector search indexes in SurrealDB support efficient k-nearest neighbors (kNN) and Approximate Nearest Neighbor (ANN) operations, which are pivotal in performing similarity searches within complex, high-dimensional datasets and data types.
M-Tree index
The M-Tree index is suitable for the task of finding exact nearest neighbors based on a distance metric (like Euclidean distance). Mtree currently supports Euclidean, Cosine, Manhattan and Minkowski distance functions. In SurrealDB, you can also use an M-Tree based index to perform the search.
Note: When no function is specified it chooses Euclidean for the default distance function.
For example, consider a record with 4 dimensional vectors.
CREATE pts:1 SET point = [1,2,3,4];
CREATE pts:2 SET point = [4,5,6,7];
CREATE pts:3 SET point = [8,9,10,11];
To define an M-tree index on the point
field of the pts
table, you can use the query below:
DEFINE INDEX mt_pt ON pts FIELDS point MTREE DIMENSION 4;
In the above example, the MTREE DIMENSION
clause specifies the dimension of the vector. The point
field is a 4-dimensional vector array, so we set the dimension to 4
. If this is sucessfully created, you can use the index to perform vector searches to find records based on the distance from a given point.
In the example below, the query searches for points closest to the vector [2,3,4,5]
and uses vector functions to calculate the distance between two points, indicated by <|2|>
.
LET $pt = [2,3,4,5];
SELECT id, vector::distance::euclidean(point, $pt) AS dist FROM pts WHERE point <|2|> $pt
Output[
{
"dist": 2,
"id": "pts:1"
},
{
"dist": 4,
"id": "pts:2"
}
]
Verifying Index Utilization in Queries
The EXPLAIN
clause from SurrealQL helps you understand the execution plan of the query and provides transparency around index utilization.
SELECT * FROM user WHERE email='test@surrealdb.com' EXPLAIN FULL;
It also reveals details about which operation
was used by the query planner and how many records matched the search criteria.
[
{
"detail": {
"plan": {
"index": "userEmailIndex",
"operator": "=",
"value": "test@surrealdb.com"
},
"table": "user"
},
"operation": "Iterate Index"
},
{
"detail": {
"count": 1
},
"operation": "Fetch"
}
]
Using IF NOT EXISTS
clause Since 1.3.0
The IF NOT EXISTS
clause can be used to define a index only if it does not already exist. If the index already exists, the DEFINE INDEX
statement will return an error.
-- Create a INDEX if it does not already exist
DEFINE INDEX IF NOT EXISTS example ON example FIELDS example;
Performance Implications
When defining indexes, it's essential to consider the fields most frequently queried or used to optimize performance.
Indexes may improve the performance of SurrealQL statements. This may not be noticeable with small tables but it can be significant for large tables; especially when the indexed fields are used in the WHERE
clause of a SELECT
statement.
Indexes can also impact the performance of write operations (INSERT, UPDATE, DELETE) since the index needs to be updated accordingly. Therefore, it's essential to balance the need for read performance with write performance.