Say you have a document. You can assign tags to them, so it makes it beter discoverable. But if a document has a tag, it has the tag or it does not have the tag.

Techniques like SPLADE, doc2query, or Semantic Knowledge Graphs (SKG) can take your document and find related terms for them. But not all related terms are equal, some are more related than others.

If search keyword matches on the stronly related keyword, you want it to weight more than if it matches on a weakly related keyword.

This is possible with Elasticsearch, using nested fields and a function_score query.

Mapping

First we have to setup a mapping. Use the nested field type.

As nested properties we have a name and weight fields.

PUT nested_weights_test
{
  "mappings": {
    "properties": {
      "title": {"type": "text"},
      "tags": {
        "type": "nested",
        "properties": {
          "name": { "type": "text" },
          "weight": { "type": "float" }
        }
      }
    }
  }
}

Using a structure like {"keyword1": 2.0, "keyword 2": 1.0} would lead to mapping explosion, so that’s not a scalable solution.

Query

Add some documents.

PUT nested_weights_test/_doc/1
{
  "title": "foo bar",
  "tags": [
    {"name": "abc", "weight": 2.0},
    {"name": "xyz", "weight": 1.0}
  ]
}
PUT nested_weights_test/_doc/2
{
  "title": "colors",
  "tags": [
    {"name": "blue", "weight": 1.5},
    {"name": "red", "weight": 0.5}
  ]
}

And run the query, for two query keywords xyz and blue.

POST nested_weights_test/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "nested": {
            "path": "tags",
            "query": {
              "function_score": {
                "query": {
                  "match": {
                    "tags.name": "xyz"
                  }
                },
                "field_value_factor": {
                  "field": "tags.weight"
                }
              }
            }
          }
        },
        {
          "nested": {
            "path": "tags",
            "query": {
              "function_score": {
                "query": {
                  "match": {
                    "tags.name": "blue"
                  }
                },
                "field_value_factor": {
                  "field": "tags.weight"
                }
              }
            }
          }
        }
      ]
    }
  },
  "explain": true
}

Reponse

This is what the above query responds:

{
  "hits": {
    // ...
    "max_score": 1.8059592,
    "hits": [
      {
        "_index": "nested_weights_test",
        "_id": "2",
        "_score": 1.8059592,
        "_source": {
          "title": "colors",
          "tags": [
            {
              "name": "blue",
              "weight": 1.5
            },
            {
              "name": "red",
              "weight": 0.5
            }
          ]
        }
      },
      {
        "_index": "nested_weights_test",
        "_id": "1",
        "_score": 1.2039728,
        "_source": {
          "title": "a",
          "tags": [
            {
              "name": "abc",
              "weight": 2
            },
            {
              "name": "xyz",
              "weight": 1
            }
          ]
        }
      }
    ]
  }
}

Both the BM25 scores (the match) scores are 1.2, but the match on blue is multiplied by 1.5, so resulting in a final score of 1.8. The match on xyz is multiplied by 1, so stays 1.2.

Conclusion

With this technique with the nested field and query in combination with the function_score query we can weight the related keyword. e.g. keywords from a model like SPLADE, semantic knowledge graphs, or something else that also has a relatedness weight.