85. Bi-directional RNN and Multi-layer RNN #85

Open
neutron0831 opened this issue Feb 14, 2023 · 0 comments
neutron0831 commented Feb 14, 2023

85. Bi-directional RNN and Multi-layer RNN

Encode the input text using both forward and backward RNNs and train the model.

$$
\begin{aligned}
\overleftarrow{h}_{T+1} &= 0, \\
\overleftarrow{h}_t &= {\rm \overleftarrow{RNN}}(\mathrm{emb}(x_t), \overleftarrow{h}_{t+1}), \\
y &= {\rm softmax}(W^{(yh)} [\overrightarrow{h}_T; \overleftarrow{h}_1] + b^{(y)})
\end{aligned}
$$

Here, $\overrightarrow{h}_t \in \mathbb{R}^{d_h}$ and $\overleftarrow{h}_t \in \mathbb{R}^{d_h}$ are the hidden state vectors at time $t$ obtained by the forward and backward RNNs, respectively. ${\rm \overleftarrow{RNN}}(x,h)$ is the RNN unit that computes the state at the preceding time step from the input $x$ and the hidden state $h$ of the following time step, $W^{(yh)} \in \mathbb{R}^{L \times 2d_h}$ is a matrix for predicting categories from the hidden state vectors, and $b^{(y)} \in \mathbb{R}^{L}$ is the bias term. $[a; b]$ denotes the concatenation of two vectors $a$ and $b$.
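
To make the indexing concrete, here is a minimal pure-Python sketch of the forward pass defined above. Several things are assumptions, not part of the task statement: the RNN unit is taken to be a simple Elman cell, $\tanh(W_x x + W_h h + b)$; the forward RNN is assumed to start from a zero state; and all names (`init_params`, `predict`, the sizes, the token IDs) are hypothetical. Training is not shown.

```python
import math
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rnn_step(x, h, W_x, W_h, b):
    """Assumed Elman unit: tanh(W_x x + W_h h + b)."""
    return [math.tanh(p + q + r)
            for p, q, r in zip(matvec(W_x, x), matvec(W_h, h), b)]


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def init_params(vocab_size, d_emb, d_h, num_labels, rng):
    """Hypothetical parameter layout; the names are not from the exercise."""
    return {
        "emb": rand_matrix(vocab_size, d_emb, rng),
        # Each direction has its own (W_x, W_h, b).
        "fwd": (rand_matrix(d_h, d_emb, rng), rand_matrix(d_h, d_h, rng), [0.0] * d_h),
        "bwd": (rand_matrix(d_h, d_emb, rng), rand_matrix(d_h, d_h, rng), [0.0] * d_h),
        "W_yh": rand_matrix(num_labels, 2 * d_h, rng),  # L x 2*d_h
        "b_y": [0.0] * num_labels,
    }


def predict(token_ids, params):
    embs = [params["emb"][i] for i in token_ids]
    d_h = len(params["fwd"][2])

    h_fwd = [0.0] * d_h  # assumption: the forward RNN starts from a zero state
    for x in embs:  # t = 1, ..., T
        h_fwd = rnn_step(x, h_fwd, *params["fwd"])

    h_bwd = [0.0] * d_h  # backward initial state h_{T+1} = 0
    for x in reversed(embs):  # t = T, ..., 1
        h_bwd = rnn_step(x, h_bwd, *params["bwd"])

    concat = h_fwd + h_bwd  # [forward h_T; backward h_1], length 2 * d_h
    logits = [s + c for s, c in zip(matvec(params["W_yh"], concat), params["b_y"])]
    return softmax(logits)


rng = random.Random(0)
params = init_params(vocab_size=10, d_emb=5, d_h=3, num_labels=4, rng=rng)
probs = predict([3, 1, 4, 1], params)  # made-up token IDs
print(probs)
```

Note that the classifier reads the forward state after the last token and the backward state after the first token, so each half of the concatenated vector has seen the whole sequence.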

In addition, experiment with multi-layered bidirectional RNNs.
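
The task does not say how the layers should be stacked. One common convention, assumed here, is that layer $k+1$ receives at each time step the concatenation of the forward and backward states of layer $k$, and the classifier uses the top layer's forward state at time $T$ and backward state at time $1$. The sketch below follows that assumption with a plain Elman cell; `run_direction`, `stacked_birnn`, and all sizes are hypothetical names chosen for illustration.

```python
import math
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def run_direction(inputs, W_x, W_h, b, reverse=False):
    """Run one Elman RNN over the sequence; states[t] aligns with inputs[t]."""
    h = [0.0] * len(b)  # zero initial state in both directions
    steps = range(len(inputs))
    states = [None] * len(inputs)
    for t in (reversed(steps) if reverse else steps):
        pre = [p + q + r
               for p, q, r in zip(matvec(W_x, inputs[t]), matvec(W_h, h), b)]
        h = [math.tanh(v) for v in pre]
        states[t] = h
    return states


def stacked_birnn(inputs, layers):
    """Assumes at least one layer. Returns [forward h_T; backward h_1] of the top layer."""
    for layer in layers:
        fwd = run_direction(inputs, *layer["fwd"])
        bwd = run_direction(inputs, *layer["bwd"], reverse=True)
        # Input to the next layer: per-time-step concatenation, size 2 * d_h.
        inputs = [f + b for f, b in zip(fwd, bwd)]
    return fwd[-1] + bwd[0]


def init_direction(d_in, d_h, rng):
    W_x = [[rng.uniform(-0.1, 0.1) for _ in range(d_in)] for _ in range(d_h)]
    W_h = [[rng.uniform(-0.1, 0.1) for _ in range(d_h)] for _ in range(d_h)]
    return W_x, W_h, [0.0] * d_h


rng = random.Random(0)
d_emb, d_h, num_layers, seq_len = 5, 3, 2, 4
layers = []
for k in range(num_layers):
    d_in = d_emb if k == 0 else 2 * d_h  # deeper layers read both directions
    layers.append({"fwd": init_direction(d_in, d_h, rng),
                   "bwd": init_direction(d_in, d_h, rng)})

# Random vectors stand in for the embeddings emb(x_1), ..., emb(x_T).
embedded = [[rng.uniform(-1, 1) for _ in range(d_emb)] for _ in range(seq_len)]
features = stacked_birnn(embedded, layers)
print(len(features))
```

The resulting vector has length $2d_h$, so the same $W^{(yh)} \in \mathbb{R}^{L \times 2d_h}$ can be applied on top. In PyTorch, `torch.nn.RNN` accepts `num_layers` and `bidirectional=True` arguments that would likely be the practical way to run this experiment.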

@neutron0831 neutron0831 added the enhancement New feature or request label Feb 14, 2023
@neutron0831 neutron0831 added this to the Chapter 9: RNN and CNN milestone Feb 14, 2023
@neutron0831 neutron0831 self-assigned this Feb 14, 2023