Reinforcement-Learning-

Conversational chatbot using reinforcement learning (RL) in Python.

Implements an off-policy method in which the behavior policy is greedy.
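
To illustrate what an off-policy update with a greedy behavior policy can look like, here is a minimal sketch. It is not the code in main.py: the README does not name the algorithm, so tabular Q-learning is used as a stand-in, and the toy chain environment, `step`, `greedy_action`, `q_learning`, and all hyperparameters are hypothetical. The behavior policy is greedy with an optional `epsilon` for exploration; setting `epsilon=0.0` makes it purely greedy.

```python
import random

# Hypothetical toy environment: a chain of states 0..4; reaching state 4 pays reward 1.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Return (next_state, reward, done) for the toy chain."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy_action(q, state, rng, epsilon=0.0):
    """Greedy behavior policy; epsilon > 0 adds optional random exploration."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    # Break ties at random so an all-zero table does not always pick action 0.
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def q_learning(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning (assumed stand-in for the repository's method)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            action = greedy_action(q, state, rng, epsilon)
            next_state, reward, done = step(state, action)
            # Off-policy target: the max over next actions, regardless of
            # which action the behavior policy actually takes next.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = q_learning()
```

In a chatbot setting the states and actions would be dialogue contexts and candidate replies rather than positions on a chain; the update rule is the part this sketch is meant to show.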

Uses an LSTM (seq2seq) model and a maximum-likelihood objective.
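
Maximum-likelihood training of a seq2seq model is commonly expressed as minimizing the negative log-likelihood of the reference reply tokens. The sketch below shows only that objective, using the standard library; the logits are made-up numbers standing in for the LSTM decoder's outputs, and `log_softmax` and `sequence_nll` are hypothetical helper names, not functions from main.py.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sequence_nll(step_logits, target_ids):
    """Mean negative log-likelihood of the target tokens, one logit vector per step."""
    if len(step_logits) != len(target_ids):
        raise ValueError("need one logit vector per target token")
    total = 0.0
    for logits, token in zip(step_logits, target_ids):
        total -= log_softmax(logits)[token]
    return total / len(target_ids)


# Made-up decoder outputs for a 3-token reply over a 4-word vocabulary.
logits = [
    [2.0, 0.1, 0.1, 0.1],
    [0.1, 2.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 2.0],
]
good = sequence_nll(logits, [0, 1, 3])  # targets match the highest logits
bad = sequence_nll(logits, [1, 2, 0])   # targets match low logits
print(good < bad)
```

A lower value means the model assigns higher probability to the reference reply, which is what maximum-likelihood training pushes toward.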
