Full-Text Search in Rails Using PostgreSQL

When building web applications, full-text search often becomes a crucial feature. While Elasticsearch has been a go-to solution for many developers, it’s worth exploring the robust full-text search capabilities built into PostgreSQL. This post will show you how to implement full-text search in a modern Ruby on Rails app using only PostgreSQL—no extra dependencies required—and compare its capabilities with Elasticsearch for context.

Full-text search enables searching natural-language documents for terms, phrases, or boolean queries, and can even rank results by relevance. It’s more powerful than basic matching, as it handles derived words, ignores stop words (like “the” or “and”), and provides robust ranking algorithms.

For instance, full-text search allows users to search for “running” and find documents containing related words like “run” or “runner.” This magic is made possible by tokenization (splitting text into meaningful units) and normalization (e.g., stemming words to their root forms).

Why PostgreSQL Instead of Elasticsearch?

Elasticsearch is a powerful, distributed search engine, but it adds infrastructure complexity. For many applications, PostgreSQL’s full-text search is sufficient, especially when:

  • Your app already uses PostgreSQL.
  • Your search needs are limited to a single database instance.
  • You want to minimize operational overhead.

That said, Elasticsearch shines for advanced search scenarios like:

  • Horizontal scalability across nodes.
  • Complex queries (e.g., fuzzy matching or advanced analytics).
  • High-speed indexing for large, frequently updated datasets.

For simpler cases, PostgreSQL can do the job admirably—let’s dive into how.

Tokenizing and Searching Text in PostgreSQL

PostgreSQL provides two primary functions for full-text search:

  • to_tsvector: Converts text into a searchable document (a tsvector) by tokenizing, removing stop words, and normalizing.
  • to_tsquery: Converts a query string into a format that can match against tsvectors.

Example: Tokenizing Text

SELECT to_tsvector('english', 'You live in a dream world. You despise the outside and you fear you are the next one');

Output:

'despise':8 'dream':5 'fear':13 'live':2 'next':17 'one':18 'outside':10 'world':6

Words like “you,” “in,” and “the” are ignored as stop words. The remaining terms are indexed for fast searching.

SELECT to_tsvector('english', 'You live in a dream world') @@ to_tsquery('english', 'dream');

Output:

 true

This query checks whether the text contains the word “dream” or its derivatives.

Boolean Queries

SELECT to_tsvector('english', 'You live in a dream world') @@ to_tsquery('english', 'dream & world');

Output:

 true

You can combine terms with boolean operators like & (AND), | (OR), and ! (NOT).

Searching Table Data

In a Rails app, you’ll likely search across table columns. For example, consider a posts table with title and body columns:

SELECT * FROM posts WHERE to_tsvector('english', title || ' ' || body) @@ to_tsquery('english', 'musician');

This finds posts containing “musician,” “music,” or similar terms in either the title or body.

Improving Performance with Indexes

Full-text search can become slow on large datasets, but PostgreSQL’s indexes make it blazing fast. Use the GIN index type for tsvectors:

CREATE INDEX index_posts_on_title_and_body ON posts USING gin(to_tsvector('english', title || ' ' || body));

With this index, search queries become orders of magnitude faster, even for large datasets.

Integrating Full-Text Search in Rails

While PostgreSQL’s SQL functions are powerful, Rails developers can streamline integration using gems like pg_search and textacular.

The pg_search gem makes PostgreSQL’s full-text search features easy to use in Rails models. Here’s an example:

class Post < ApplicationRecord
  include PgSearch::Model

  pg_search_scope :search_by_title_and_body,
                  against: %i[title body],
                  using: {
                    tsearch: { prefix: true } # Allows partial matches, e.g., "run" matches "running"
                  }
end

Now you can search posts like this:

Post.search_by_title_and_body('musician')

Using textacular

The textacular gem simplifies full-text search by adding scopes to your models:

class Post < ApplicationRecord
  extend Textacular
end

Post.basic_search('musician')

Ranking Results by Relevance

PostgreSQL allows ranking results using the ts_rank function, which calculates a relevance score:

SELECT *, ts_rank(to_tsvector('english', title || ' ' || body), to_tsquery('english', 'musician')) AS rank
FROM posts
WHERE to_tsvector('english', title || ' ' || body) @@ to_tsquery('english', 'musician')
ORDER BY rank DESC;

You can incorporate this ranking into Rails queries for more relevant results.

Limitations and When to Consider Elasticsearch

While PostgreSQL’s full-text search is powerful, it has limitations:

  • Scaling: It’s tied to a single database instance and doesn’t handle distributed search well.
  • Advanced Features: Fuzzy matching, typo tolerance, and sophisticated analytics are better handled by Elasticsearch.
  • Large Data Volumes: Elasticsearch’s inverted indexes are optimized for massive datasets.

If your app grows beyond PostgreSQL’s capabilities, you can transition to Elasticsearch or combine both systems.

Conclusion

PostgreSQL’s full-text search is a robust tool for many applications. With built-in functions, efficient indexing, and Rails-friendly gems like pg_search, you can deliver fast, accurate search features without adding external dependencies.

By leveraging what PostgreSQL already offers, you keep your stack simple and minimize operational overhead. For most apps, this simplicity is a huge win. Give it a try in your next project—you might be surprised how far it takes you!

Resources

Have comments or want to discuss this topic?

Send an email to ~bounga/public-inbox@lists.sr.ht