Project structure:
GCN_Vietnamese
|__Code
| |__aligment (preprocesses the raw image folder)
| | |__angle_mlp.py (main)
| |__detect_word
| | |__inference.py
| | |__perspective.py
| |__extract_label (splits the raw label file into smaller files)
| | |
| | |__extract_label.py
| |__gcn (creates Graph instances and contains the inference file)
| | |__use
| | |__graph.py (class of Graph)
| | |__prepare (creates instances and preprocesses data)
| | |__train.py
| | |__model.py
| | |__gen_dataVN.py
| | |__test_single.py
| | |__inference.py
| |__material (prepared data files for training the GCN)
| | |__test_data
| | |__train_data
| |__ocr (resources for reading text)
| | |__ocr.py
| |__U2Net (background segmentation)
| | |__u2net_test.py
| |__Vietnam_invoice_data
| |__mcocr2021_raw (raw data with background, not used) - 1934 images in total
| | |__mcocr_train_data
| | |__mcocr_val_data
| | |__test
| |__preprocessed_data (preprocessed data: rotated, background removed; use this) - 1090 images
| | |__images (.jpg)
| | |__label_mcocr2021 (.csv)
|__combineLine.py
|__fullGCN.py
The main file is fullGCN.py. It combines the whole process, which has 5 parts:
- Detect words with CRAFT (calls the detection code).
- Combine the CRAFT output into sentences. CRAFT's sentence detection mode is not accurate even after adjusting its parameters, so I wrote a combining script in combineLine.py.
- OCR with VietOCR. The model is created in the main file and passed into the end2end function, which reads the sentences after the lines are combined.
- Build the graph embedding with a Graph instance (creates the features and embeds each sentence with vietnamese-sbert).
- Run inference with the GCN and save the output image.
- 1: U2Net segmentation
- 2: Text detection with CRAFT
- 3: Combining words into sentences
- 4: GCN
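
The step that combines CRAFT word boxes into sentences can be sketched as below. This is only a minimal illustration, not the actual code in combineLine.py. It assumes each word is an axis-aligned box (x_min, y_min, x_max, y_max), and the names combine_words, min_overlap and max_gap_ratio are hypothetical. Two boxes are joined when their vertical ranges overlap enough and the horizontal gap between them is small compared with the text height.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), axis-aligned.
Box = Tuple[int, int, int, int]


def vertical_overlap_ratio(a: Box, b: Box) -> float:
    """Overlap of the two y-ranges, relative to the height of the shorter box."""
    overlap = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    return max(overlap, 0) / shorter if shorter > 0 else 0.0


def combine_words(
    boxes: List[Box],
    min_overlap: float = 0.5,    # hypothetical threshold: share of height that must overlap
    max_gap_ratio: float = 1.0,  # hypothetical threshold: max gap as a multiple of text height
) -> List[Box]:
    """Greedily merge word boxes into sentence boxes, scanning left to right."""
    sentences: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[0]):
        for i, sent in enumerate(sentences):
            height = min(sent[3] - sent[1], box[3] - box[1])
            gap = box[0] - sent[2]  # negative when the boxes overlap horizontally
            same_line = vertical_overlap_ratio(sent, box) >= min_overlap
            if same_line and gap <= max_gap_ratio * height:
                # Grow the sentence box so it also covers the new word.
                sentences[i] = (
                    min(sent[0], box[0]),
                    min(sent[1], box[1]),
                    max(sent[2], box[2]),
                    max(sent[3], box[3]),
                )
                break
        else:
            # No existing sentence is close enough, so start a new one.
            sentences.append(box)
    # Return in reading order: top to bottom, then left to right.
    return sorted(sentences, key=lambda b: (b[1], b[0]))


words = [(10, 10, 50, 30), (55, 12, 100, 31), (300, 11, 350, 30),
         (12, 50, 60, 70), (64, 51, 120, 72)]
merged = combine_words(words)
```

Here the first two words on the top line are joined, while the word at x=300 stays separate because the gap is much larger than the text height. Both thresholds would need tuning on real invoice images.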