Thanks for the answers, Benno! That was really helpful! (Sorry for replying so late; I hadn't logged in here for a while.)
Posts by Dan Oneață
1 month ago
2. The variable j seems to index two different things:
a. token position (in r_j^{(l)}, Sec. 3.2);
b. descriptions (in d_j and r_j, Sec. 3.1; and d_j, Sec. 3.2).
I don't know if I'm missing something obvious, or this is just ambiguous.
2 months ago
Neat work! Congratulations!
I am a bit confused by some notation from the paper:
1. In Sec. 3.1 each description is associated with a single vector, but in Sec. 3.2 it seems that each description is associated with a set of vectors (contextualized representations at each position and layer).
2 months ago