Cross-platform system for GUI automation using machine vision. This repository accompanies the paper 'name and link', released on arXiv; we strongly recommend reading the paper before using this repository.
Here are some of the project's main features:
- Cellular-automaton-inspired interactable detection algorithm
- Selenium-powered interactable detector
- Crawler and analysis tool for Khan Academy
- Backend with MongoDB (you can substitute your own API key)
- Tool for analysing OCR'ed data for page similarity matching
- Voice-powered GUI navigation
- Tools for recording, analysing, and saving screen and keyboard activity
- "Trace" creation tool (as explained in the paper)
- "Trace" replication tool (as explained in the paper)
- "Action matching" tool (as explained in the paper)
This repository uses Poetry for Python package management. See the Poetry documentation at https://python-poetry.org/docs/.
Most of the project's features are accessible through the CLI defined in ./main.py. Example:
python main.py hello-world
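For readers unfamiliar with subcommand-style CLIs, here is a minimal sketch of how an entry point invoked as `python main.py hello-world` could be wired up. It assumes Python's standard `argparse` module; the framework actually used in ./main.py is not stated here and may differ. The `hello_world` handler, its output, and the `run` helper are hypothetical names made up for illustration.

```python
import argparse


def hello_world() -> str:
    """Hypothetical handler; the real hello-world command may behave differently."""
    return "Hello, world!"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per feature (only one shown here)."""
    parser = argparse.ArgumentParser(prog="main.py", description="Illustrative CLI sketch")
    # Default handler is None so that a bare invocation can fall back to the help text.
    parser.set_defaults(handler=None)
    subparsers = parser.add_subparsers(dest="command")
    hello = subparsers.add_parser("hello-world", help="Print a greeting")
    hello.set_defaults(handler=hello_world)
    return parser


def run(argv=None) -> str:
    """Parse argv (defaults to sys.argv[1:]) and dispatch to the chosen handler."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.handler is None:
        # No subcommand given: show usage instead of failing.
        parser.print_help()
        return ""
    return args.handler()


if __name__ == "__main__":
    output = run()
    if output:
        print(output)
```

In a layout like this, each feature would register its own subparser and handler, so adding a new command does not touch the dispatch logic.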
This project is licensed under the MIT License.
This work was a collaboration between Arnas Vyšniauskas and Iason Chaimalas for a Master's thesis project at UCL, supervised by Dr Alejandra Beghelli and advised by Prof Gabriel Brostow. You can contact us through our project supervisor Prof Gabriel Brostow at gabriel.brostow@ucl.ac.uk