"train the query encoder and doc encoder to be closer in their embedding space using click data" <- Any papers/resources you know where I can learn more about this process?
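Triplet loss takes an anchor, a positive, and a negative. In this case the anchor is your query, the positive is a similar doc (with click data, typically one that was clicked for that query), and the negative is a dissimilar doc. The loss pulls the query embedding toward the positive and pushes it away from the negative until they are separated by some margin. When you train, backpropagate the loss through both the doc encoder and the query encoder.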
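To make that concrete, here is a minimal PyTorch sketch of one training step. Everything specific in it is an assumption for illustration: the encoders are toy linear layers standing in for real text models, the dimensions and learning rate are arbitrary, the inputs are random tensors rather than real query/doc features, and it assumes clicked docs serve as positives and unclicked docs as negatives.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy encoders; real ones would be text models.
QUERY_DIM, DOC_DIM, EMBED_DIM = 16, 32, 8
query_encoder = nn.Linear(QUERY_DIM, EMBED_DIM)
doc_encoder = nn.Linear(DOC_DIM, EMBED_DIM)

# One optimizer over both parameter sets, so each step updates both encoders.
optimizer = torch.optim.SGD(
    list(query_encoder.parameters()) + list(doc_encoder.parameters()), lr=0.1
)

# Built-in triplet loss; uses Euclidean distance (p=2) by default.
triplet_loss = nn.TripletMarginLoss(margin=1.0)


def train_step(queries, clicked_docs, unclicked_docs):
    """Run one gradient step on a batch of (query, clicked, unclicked) triplets."""
    anchor = query_encoder(queries)          # anchor = query embedding
    positive = doc_encoder(clicked_docs)     # positive = clicked doc embedding
    negative = doc_encoder(unclicked_docs)   # negative = unclicked doc embedding

    loss = triplet_loss(anchor, positive, negative)

    optimizer.zero_grad()
    loss.backward()   # gradients flow into both encoders
    optimizer.step()
    return loss.item()


# Random stand-in features for a batch of 4 triplets (assumed, not real data).
queries = torch.randn(4, QUERY_DIM)
clicked_docs = torch.randn(4, DOC_DIM)
unclicked_docs = torch.randn(4, DOC_DIM)

loss_value = train_step(queries, clicked_docs, unclicked_docs)
print(f"triplet loss: {loss_value:.4f}")
```

Note that the same `doc_encoder` embeds both the positive and the negative, so there are only two models being trained, not three.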