I was just looking for something like this yesterday! Do you know how this compares to generating potential questions from your documents in advance, then doing a similarity search against those questions?
It sounds like you need to use a specialized model (contriever) to generate the embedding of the hypothetical document. Can you not just use the same model that was used to generate the embeddings for the chunks of the original data?
I was just looking for something like this yesterday! Do you know how this compares to generating potential questions from your documents in advance, then doing a similarity search against those questions?
It is pretty much analogous. You can do both :)
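For illustration, here is a rough sketch of the two approaches side by side. It is not a real pipeline: `embed` is a toy bag-of-words stand-in for an actual embedding model, and the pre-generated questions and the hypothetical answer are hard-coded where an LLM would normally write them. All names (`chunks`, `questions`, `search_by_questions`, `search_by_hyde`) are made up for this example.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy stand-in for a real embedding model: bag-of-words token counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = {
    "c1": "The warranty covers battery defects for two years.",
    "c2": "Shipping takes five business days within Europe.",
}

# Approach A: questions generated in advance for each chunk
# (hard-coded here; assumed to come from an LLM in practice).
questions = {
    "c1": ["How long is the battery warranty?"],
    "c2": ["How long does shipping take?"],
}

def search_by_questions(query):
    # Compare the user query against the stored questions.
    q = embed(query)
    scores = {cid: max(cosine(q, embed(x)) for x in qs)
              for cid, qs in questions.items()}
    return max(scores, key=scores.get)

def search_by_hyde(hypothetical_answer):
    # Compare a hypothetical answer against the chunks themselves.
    h = embed(hypothetical_answer)
    scores = {cid: cosine(h, embed(text)) for cid, text in chunks.items()}
    return max(scores, key=scores.get)

query = "How long is the warranty on the battery?"
# Assumed to be produced by an LLM at query time; hard-coded for the sketch.
hypothetical = "The battery warranty lasts two years and covers defects."

print(search_by_questions(query))
print(search_by_hyde(hypothetical))
```

The trade-off in this sketch is where the generation cost lands: the question index pays it once per chunk at indexing time, while the hypothetical document pays it once per query.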
It sounds like you need to use a specialized model (contriever) to generate the embedding of the hypothetical document. Can you not just use the same model that was used to generate the embeddings for the chunks of the original data?
Sure. You must always use the same embedding model for every text that shares an embedding space, so the hypothetical document and the original chunks should go through the same model.
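A small sketch of why the model has to match, under loud assumptions: `make_toy_model` builds a fake "model" by hashing tokens into a fixed-size vector, with the model name mixed into the hash so that two models place the same token in different dimensions. Real embedding models are nothing like this, but the effect is similar in spirit: vectors from different models do not live in the same space.

```python
import hashlib
import math

DIM = 64  # arbitrary toy dimensionality

def make_toy_model(name):
    # Hypothetical stand-in for an embedding model. The model name is mixed
    # into the hash, so each "model" maps tokens to its own dimensions.
    def embed(text):
        vec = [0.0] * DIM
        for token in text.lower().split():
            digest = hashlib.sha256((name + ":" + token).encode()).digest()
            vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
        return vec
    return embed

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

chunk_model = make_toy_model("chunk-model")
other_model = make_toy_model("other-model")

chunk = "the warranty covers battery defects for two years"
hypothetical = "the warranty covers battery defects for two years"

chunk_vec = chunk_model(chunk)

# Same model on both sides: identical text gives identical vectors.
same_space = cosine(chunk_model(hypothetical), chunk_vec)
# Different model for the hypothetical document: the score is no longer meaningful.
mixed_space = cosine(other_model(hypothetical), chunk_vec)

print(same_space, mixed_space)
```

Even with word-for-word identical text, only the same-model comparison is guaranteed to score as a perfect match; the mixed comparison depends on how the two hash layouts happen to overlap.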