This article is reproduced from stackoverflow.com.
Tags: postgresql, sequelize.js, tsvector

How to search on a single OR multiple columns with TSVECTOR and TSQUERY

Posted on 2020-03-27 15:39:57

I used some boilerplate code (below) that creates a normalized tsvector column, _search, built from all the columns I specify in searchObjects, i.e. the columns I'd like full-text search on.

For the most part, this is fine. I'm using this in conjunction with Sequelize, so my query looks like:

const articles = await Article.findAndCountAll({
  where: {
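    // Both conditions go in one Op.and array. Repeating the [Op.and] key in the
    // same object literal would let the second entry overwrite the first.
    [Sequelize.Op.and]: [
      Sequelize.fn(
        'article._search @@ plainto_tsquery',
        'english',
        Sequelize.literal(':query')
      ),
      { status: STATUS_TYPE_ACTIVE }
    ]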
  },
  replacements: { query: q }
});

Search index setup:

const vectorName = '_search';

const searchObjects = {
  articles: ['headline', 'cleaned_body', 'summary'],
  brands: ['name', 'cleaned_about'],
  products: ['name', 'cleaned_description']
};

module.exports = {
  up: async queryInterface =>
    await queryInterface.sequelize.transaction(t =>
      Promise.all(
        Object.keys(searchObjects).map(table =>
          queryInterface.sequelize
            .query(
              `
          ALTER TABLE ${table} ADD COLUMN ${vectorName} TSVECTOR;
        `,
              { transaction: t }
            )
            .then(() =>
              queryInterface.sequelize.query(
                `
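                -- coalesce() keeps one NULL column from making the whole concatenation NULL
                UPDATE ${table} SET ${vectorName} = to_tsvector('english', ${searchObjects[
                  table
                ]
                  .map(col => `coalesce(${col}, '')`)
                  .join(" || ' ' || ")});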
              `,
                { transaction: t }
              )
            )
            .then(() =>
              queryInterface.sequelize.query(
                `
                CREATE INDEX ${table}_search ON ${table} USING gin(${vectorName});
              `,
                { transaction: t }
              )
            )
            .then(() =>
              queryInterface.sequelize.query(
                `
                CREATE TRIGGER ${table}_vector_update
                BEFORE INSERT OR UPDATE ON ${table}
                FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger(${vectorName}, 'pg_catalog.english', ${searchObjects[
                  table
                ].join(', ')});
              `,
                { transaction: t }
              )
            )
            // No .error()/.catch() here: a failed query should reject so the transaction rolls back.
        )
      )
    ),

  down: async queryInterface =>
    await queryInterface.sequelize.transaction(t =>
      Promise.all(
        Object.keys(searchObjects).map(table =>
          queryInterface.sequelize
            .query(
              `
          DROP TRIGGER ${table}_vector_update ON ${table};
        `,
              { transaction: t }
            )
            .then(() =>
              queryInterface.sequelize.query(
                `
                DROP INDEX ${table}_search;
              `,
                { transaction: t }
              )
            )
            .then(() =>
              queryInterface.sequelize.query(
                `
                ALTER TABLE ${table} DROP COLUMN ${vectorName};
              `,
                { transaction: t }
              )
            )
        )
      )
    )
};
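
As a rough illustration of what the UPDATE step produces for one table, here is a small standalone sketch that only builds and prints the SQL string. The helper name combinedVectorSql is hypothetical, and the coalesce() wrapping is an added assumption (it guards against NULL columns) rather than something the boilerplate is known to do.

```javascript
const vectorName = '_search';

// Builds the UPDATE statement for one table. Every listed column is folded
// into a single text expression, so the result is one combined tsvector.
function combinedVectorSql(table, cols) {
  const text = cols.map(c => `coalesce(${c}, '')`).join(" || ' ' || ");
  return `UPDATE ${table} SET ${vectorName} = to_tsvector('english', ${text});`;
}

console.log(combinedVectorSql('articles', ['headline', 'cleaned_body', 'summary']));
```

All three columns end up inside one to_tsvector() call, so the stored vector no longer records which column a lexeme came from.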

The problem is that the code concatenates all the columns in each array of searchObjects, so what gets stored is a single combined vector (and index) covering every column in that array.

For example on the articles table: 'headline', 'cleaned_body', 'summary' are all part of that single generated _search vector.

Because of this, I can't really search by ONLY headline or ONLY cleaned_body, etc. I'd like to be able to search each column separately and also together.

The use case: in my search typeahead I only want to search on headline, but on my search results page I want to search on all columns specified in searchObjects.

Can someone give me a hint on what I need to change? Should I create a new tsvector for each column?

Questioner: bob_cobb
Viewed: 503

bob_cobb 2020-01-31 16:09

If anyone is curious, here's how you can create a tsvector for each column:

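module.exports = {
  // Reuses vectorName and searchObjects from the setup above.
  up: async queryInterface => {
    // Unmanaged transaction: commit and rollback are called explicitly.
    const transaction = await queryInterface.sequelize.transaction();
    try {
      for (const table in searchObjects) {
        for (const col of searchObjects[table]) {
          // One tsvector column per source column, e.g. headline_search.
          const vectorCol = col + vectorName;
          await queryInterface.sequelize.query(
            `ALTER TABLE ${table} ADD COLUMN ${vectorCol} TSVECTOR;`,
            { transaction }
          );
          // Backfill the new column for existing rows.
          await queryInterface.sequelize.query(
            `UPDATE ${table} SET ${vectorCol} = to_tsvector('english', ${col});`,
            { transaction }
          );
          // GIN index so @@ lookups on this column can use an index.
          await queryInterface.sequelize.query(
            `CREATE INDEX ${table}_${col}_search ON ${table} USING gin(${vectorCol});`,
            { transaction }
          );
          // Keep the vector current on every insert and update.
          await queryInterface.sequelize.query(
            `CREATE TRIGGER ${table}_${col}_vector_update
            BEFORE INSERT OR UPDATE ON ${table}
            FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger(${vectorCol}, 'pg_catalog.english', ${col});`,
            { transaction }
          );
        }
      }
      await transaction.commit();
    } catch (err) {
      await transaction.rollback();
      throw err;
    }
  }
};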
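
On the query side, a minimal sketch of how the typeahead and the results page might pick between the per-column vectors and the combined one. It assumes both the original _search column and the per-column columns (headline_search and so on) exist on the table. searchPredicate is a hypothetical helper, not part of Sequelize. It only builds a SQL fragment, and the column name is checked against searchObjects so that no user input reaches the SQL string.

```javascript
const vectorName = '_search';

const searchObjects = {
  articles: ['headline', 'cleaned_body', 'summary'],
  brands: ['name', 'cleaned_about'],
  products: ['name', 'cleaned_description']
};

// Returns a SQL predicate for one whitelisted column, or for the combined
// vector when col is omitted. The search text itself stays a named
// replacement (:query), as in the original query.
function searchPredicate(table, col) {
  const columns = searchObjects[table];
  if (!columns) {
    throw new Error(`Unknown table: ${table}`);
  }
  if (col !== undefined && !columns.includes(col)) {
    throw new Error(`Column ${col} is not searchable on ${table}`);
  }
  const vectorCol = col === undefined ? vectorName : `${col}${vectorName}`;
  return `"${vectorCol}" @@ plainto_tsquery('english', :query)`;
}

// Typeahead: headline only.
console.log(searchPredicate('articles', 'headline'));
// Results page: combined vector.
console.log(searchPredicate('articles'));

// Possible use with Sequelize (not executed here):
// const articles = await Article.findAndCountAll({
//   where: {
//     [Sequelize.Op.and]: [
//       Sequelize.literal(searchPredicate('articles', 'headline')),
//       { status: STATUS_TYPE_ACTIVE }
//     ]
//   },
//   replacements: { query: q }
// });
```

The column name is left unqualified here. If the query joins other tables that also have a _search column, it would need a table alias prefix, as in the original article._search.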