Warm tip: This article is reproduced from stackoverflow.com, please click
full-text-search postgresql

Postgres & FULL TEXT SEARCH: What is the correct SQL query to search for a phrase with multiple nega

发布于 2020-03-27 10:15:38

I have a table with a description column.

In this table, I have added an extra column called tsv that is populated with the tsvector of the description.

I'm trying to query the tsv column for a certain phrase but add negate phrases to reduce the number of false positives returned, but I'm not sure what the correct way is to do it as I get a different number of result returned for each way I am trying.

For example, take stream, as in a little river.

I get

WHERE tsv @@ to_tsquery('english', '
        stream
    ')

RETURNS: 26

But in some of the descriptions as I have checked all 26 manually, they only talk about:

...light streaming in from both the window... (There is 1 with this in)

or

...stream of natural light... (There are 2 with this in)

Which is nothing to do with a little stream of running water.

That's a total of 3, so I am expecting 23 to be returned instead.

Here's what I have tried so far, none of which return 23:

WHERE tsv @@ to_tsquery('english', '
        stream
        & ! light<->streaming
        | ! stream<2>natural<->light
    ')

>RETURNS: 261

or

WHERE tsv @@ to_tsquery('english', '
        stream
        & ! light<->streaming
        & ! stream<2>natural<->light
    ')

>RETURNS: 3

or

WHERE tsv @@ to_tsquery('english', '
        stream
        & ! ( 
            light<->streaming
            | stream<2>natural<->light
        )
    ')

>RETURNS: 8

or

WHERE tsv @@ to_tsquery('english', '
        stream
        & ( 
            ! light<->streaming
            | ! stream<2>natural<->light
        )
    ')

>RETURNS: 26

What am I doing wrong? Or is there a completely different way I need to be doing this?

Thanks in advance

ADDITION

Just so I know for sure and for my sanity, I rang this code for each term I'm trying to negate

WHERE tsv @@ to_tsquery('english',
        'light<->streaming'
     )

>RETURNS: 1
WHERE tsv @@ to_tsquery('english',
        'stream<2>natural<->light'
     )

>RETURNS: 2

Both returned the records I want removing from the main query.

Questioner
User13342
Viewed
111
User13342 2019-07-25 22:37

FOLLOW UP

I couldn't solve this doing it the way I was trying to do it e.g. all within one clause.

So to get around the issue, I create separate where clauses for each negate term.

So instead of

...  
WHERE tsv @@ to_tsquery('english',
                        'stream
                         & !(light <-> stream)
                         & !(stream <2> natural <-> light)'
                     );

I did this instead

...
WHERE tsv @@ to_tsquery('english','stream')
AND NOT WHERE tsv @@ to_tsquery('english','light <-> stream')
AND NOT WHERE tsv @@ to_tsquery('english','stream <2> natural <-> light')

Doing it this way produced the results I was expecting.

Hope this helps someone with the same problem.