Full-Text Search in Rails Using PostgreSQL
When building web applications, full-text search often becomes a crucial feature. While Elasticsearch has been a go-to solution for many developers, it’s worth exploring the robust full-text search capabilities built into PostgreSQL. This post will show you how to implement full-text search in a modern Ruby on Rails app using only PostgreSQL—no extra dependencies required—and compare its capabilities with Elasticsearch for context.
What is Full-Text Search?
Full-text search enables searching natural-language documents for terms, phrases, or boolean queries, and can even rank results by relevance. It’s more powerful than basic matching, as it handles derived words, ignores stop words (like “the” or “and”), and provides robust ranking algorithms.
For instance, full-text search allows users to search for “running” and find documents containing related words like “run” or “runner.” This magic is made possible by tokenization (splitting text into meaningful units) and normalization (e.g., stemming words to their root forms).
Why PostgreSQL Instead of Elasticsearch?
Elasticsearch is a powerful, distributed search engine, but it adds infrastructure complexity. For many applications, PostgreSQL’s full-text search is sufficient, especially when:
- Your app already uses PostgreSQL.
- Your search needs are limited to a single database instance.
- You want to minimize operational overhead.
That said, Elasticsearch shines for advanced search scenarios like:
- Horizontal scalability across nodes.
- Complex queries (e.g., fuzzy matching or advanced analytics).
- High-speed indexing for large, frequently updated datasets.
For simpler cases, PostgreSQL can do the job admirably—let’s dive into how.
Tokenizing and Searching Text in PostgreSQL
PostgreSQL provides two primary functions for full-text search:
to_tsvector
: Converts text into a searchable document (a tsvector) by tokenizing, removing stop words, and normalizing.to_tsquery
: Converts a query string into a format that can match against tsvectors.
Example: Tokenizing Text
SELECT to_tsvector('english', 'You live in a dream world. You despise the outside and you fear you are the next one');
Output:
'despise':8 'dream':5 'fear':13 'live':2 'next':17 'one':18 'outside':10 'world':6
Words like “you,” “in,” and “the” are ignored as stop words. The remaining terms are indexed for fast searching.
Example: Basic Search
SELECT to_tsvector('english', 'You live in a dream world') @@ to_tsquery('english', 'dream');
Output:
true
This query checks whether the text contains the word “dream” or its derivatives.
Boolean Queries
SELECT to_tsvector('english', 'You live in a dream world') @@ to_tsquery('english', 'dream & world');
Output:
true
You can combine terms with boolean operators like &
(AND), |
(OR), and !
(NOT).
Searching Table Data
In a Rails app, you’ll likely search across table columns. For example, consider a posts
table with title
and body
columns:
SELECT * FROM posts WHERE to_tsvector('english', title || ' ' || body) @@ to_tsquery('english', 'musician');
This finds posts containing “musician,” “music,” or similar terms in either the title
or body
.
Improving Performance with Indexes
Full-text search can become slow on large datasets, but PostgreSQL’s indexes make it blazing fast. Use the GIN
index type for tsvectors:
CREATE INDEX index_posts_on_title_and_body ON posts USING gin(to_tsvector('english', title || ' ' || body));
With this index, search queries become orders of magnitude faster, even for large datasets.
Integrating Full-Text Search in Rails
While PostgreSQL’s SQL functions are powerful, Rails developers can streamline integration using gems like pg_search
and textacular
.
Using pg_search
The pg_search
gem makes PostgreSQL’s full-text search features easy to use in Rails models. Here’s an example:
class Post < ApplicationRecord
include PgSearch::Model
pg_search_scope :search_by_title_and_body,
against: %i[title body],
using: {
tsearch: { prefix: true } # Allows partial matches, e.g., "run" matches "running"
}
end
Now you can search posts like this:
Post.search_by_title_and_body('musician')
Using textacular
The textacular
gem simplifies full-text search by adding scopes to your models:
class Post < ApplicationRecord
extend Textacular
end
Post.basic_search('musician')
Ranking Results by Relevance
PostgreSQL allows ranking results using the ts_rank
function, which calculates a relevance score:
SELECT *, ts_rank(to_tsvector('english', title || ' ' || body), to_tsquery('english', 'musician')) AS rank
FROM posts
WHERE to_tsvector('english', title || ' ' || body) @@ to_tsquery('english', 'musician')
ORDER BY rank DESC;
You can incorporate this ranking into Rails queries for more relevant results.
Limitations and When to Consider Elasticsearch
While PostgreSQL’s full-text search is powerful, it has limitations:
- Scaling: It’s tied to a single database instance and doesn’t handle distributed search well.
- Advanced Features: Fuzzy matching, typo tolerance, and sophisticated analytics are better handled by Elasticsearch.
- Large Data Volumes: Elasticsearch’s inverted indexes are optimized for massive datasets.
If your app grows beyond PostgreSQL’s capabilities, you can transition to Elasticsearch or combine both systems.
Conclusion
PostgreSQL’s full-text search is a robust tool for many applications. With built-in functions, efficient indexing, and Rails-friendly gems like pg_search
, you can deliver fast, accurate search features without adding external dependencies.
By leveraging what PostgreSQL already offers, you keep your stack simple and minimize operational overhead. For most apps, this simplicity is a huge win. Give it a try in your next project—you might be surprised how far it takes you!
Resources
Share on
Twitter Facebook LinkedInHave comments or want to discuss this topic?
Send an email to ~bounga/public-inbox@lists.sr.ht