PostgreSQL Full Text Search: why search is sooo slow? -
i have small postgresql database (~~3,000 rows).
i'm trying set full text search on 1 of it's text fields ('body').
the problem query extremely slow (35+ seconds!!!).
i suppose problem comes fact db chooses sequential scan mode...
this query:
select ts_rank_cd(to_tsvector('italian', body), query), ts_headline('italian', body, to_tsquery('torino')), title, location, id_author fulltextsearch.documents, to_tsquery('torino') query (body_tsvector @@ query) offset 0
this explain analyze:
query plan ---------------------------------------------------------------------------------------------------------------------------- limit (cost=0.00..1129.81 rows=19 width=468) (actual time=74.059..13630.114 rows=863 loops=1) -> nested loop (cost=0.00..1129.81 rows=19 width=468) (actual time=74.056..13629.342 rows=863 loops=1) join filter: (documents.body_tsvector @@ query.query) -> function scan on to_tsquery query (cost=0.00..0.01 rows=1 width=32) (actual time=4.606..4.608 rows=1 loops=1) -> seq scan on documents (cost=0.00..1082.09 rows=3809 width=591) (actual time=0.045..48.072 rows=3809 loops=1) total runtime: 13630.720 ms
this table:
mydb=# \d+ fulltextsearch.documents; table "fulltextsearch.documents" column | type | modifiers | storage | description ---------------+-------------------+-----------------------------------------------------------------------+----------+------------- id | integer | not null default nextval('fulltextsearch.documents_id_seq'::regclass) | plain | id_author | integer | | plain | body | character varying | | extended | title | character varying | | extended | location | character varying | | extended | date_creation | date | | plain | body_tsvector | tsvector | | extended | indexes: "fulltextsearch_documents_tsvector_idx" gin (to_tsvector('italian'::regconfig, coalesce(body, ''::character varying)::text)) "id_idx" btree (id) triggers: body_tsvectorupdate before insert or update on fulltextsearch.documents each row execute procedure tsvector_update_trigger('body_tsvector', 'pg_catalog.italian', 'body') has oids: no
i'm sure i'm missing obvious....
any clues?
.
.
.
=== update =======================================================================
thanks suggestions, came (better) query:
select ts_rank(body_tsvector, query), ts_headline('italian', body, query), title, location fulltextsearch.documents, to_tsquery('italian', 'torino') query to_tsvector('italian', coalesce(body,'')) @@ query
which quite better, slow (13+ seconds...).
i notice commenting out "ts_headline()" row query lightning-fast.
this explain analyze, uses index, doesn't me much...:
explain analyze select clock_timestamp() - statement_timestamp() elapsed_time, ts_rank(body_tsvector, query), ts_headline('italian', body, query), title, location fulltextsearch.documents, to_tsquery('italian', 'torino') query to_tsvector('italian', coalesce(body,'')) @@ query nested loop (cost=16.15..85.04 rows=19 width=605) (actual time=102.290..13392.161 rows=863 loops=1) -> function scan on query (cost=0.00..0.01 rows=1 width=32) (actual time=0.008..0.009 rows=1 loops=1) -> bitmap heap scan on documents (cost=16.15..84.65 rows=19 width=573) (actual time=0.381..4.236 rows=863 loops=1) recheck cond: (to_tsvector('italian'::regconfig, (coalesce(body, ''::character varying))::text) @@ query.query) -> bitmap index scan on fulltextsearch_documents_tsvector_idx (cost=0.00..16.15 rows=19 width=0) (actual time=0.312..0.312 rows=863 loops=1) index cond: (to_tsvector('italian'::regconfig, (coalesce(body, ''::character varying))::text) @@ query.query) total runtime: 13392.717 ms
you're missing 2 (reasonably obvious) things:
1 you've set 'italian'
in to_tsvector()
aren't specifying in to_tsquery()
keep both consistent.
2 you've indexed coalesce(body, ...)
isn't you're searching against.
the planner isn't magic - can use index if that's you're searching against.
Comments
Post a Comment