upload code

Commit 9a3a4aa1 authored Jul 23, 2024 by Cao Duc Anh
20 changed files
--- a/.gitignore
+++ b/.gitignore
+weights/transformerocr.pth
\ No newline at end of file
--- a/README.md
+++ b/README.md
 # VIVAS Viet OCR
+## Training
+### 1. Prepare data
+Add your data to the data directory according to the following rules: <br>
+*(See the sample in the data directory)* <br>
+- Images are stored in "data/images".<br>
+- Labels are written to train_annotation.txt (training data) and test_annotation.txt (validation data), split 80/20, with each line in the format: "path_to_image_file   label" (separated by a tab character). <br>
+**path_to_image_file**: the path relative to the data directory. <br>
+**label**: the text that appears in the image. <br>
+Example: 
+```
+images/20140603_0003_BCCTC_tg_0_0.png	Bản chất của thành công
+images/20140603_0003_BCCTC_tg_1.png	Đã bao giờ bạn tự hỏi thành công là gì mà bao kẻ bỏ cả cuộc đời mình theo đuổi? Phải
+images/20140603_0003_BCCTC_tg_12.png	chăng đó là kết quả hoàn hảo trong công việc, sự chính xác đến từng chi tiết? Hay đó
+```
+### 2. Config
+Edit config.yml if needed. <br>
+Settings to pay attention to:
+```
+dataset:
+  image_height: 32
+  image_max_width: 512
+  image_min_width: 32
+device: cuda:0
+trainer:
+  batch_size: 8
+  iters: 20000
+  metrics: 1000
+  print_every: 20
+  valid_every: 20
+```
+metrics: the number of samples used for validation
+### 3. Run training
+```
+docker-compose -f training.docker-compose.yml up --build
+```
+Monitor the results in the terminal. <br>
+The trained model is saved to **"./weights"**
\ No newline at end of file
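The README asks for an 80/20 split between train_annotation.txt and test_annotation.txt but does not show how to produce it. The following is a minimal sketch of one way to do so, not the author's method. The helper names `split_annotations` and `format_annotation`, the fixed seed, and the sample paths are all hypothetical.

```python
import random


def split_annotations(pairs, train_ratio=0.8, seed=0):
    """Shuffle (path, label) pairs and split them into train and validation lists."""
    shuffled = list(pairs)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def format_annotation(pairs):
    """Render pairs as 'path<TAB>label' lines, one sample per line."""
    return "\n".join(f"{path}\t{label}" for path, label in pairs)


# Hypothetical samples; real paths are relative to the data directory.
pairs = [(f"images/sample_{i}.png", f"label {i}") for i in range(10)]
train, valid = split_annotations(pairs)
train_text = format_annotation(train)  # content for train_annotation.txt
valid_text = format_annotation(valid)  # content for test_annotation.txt
```

Writing `train_text` and `valid_text` to the two annotation files with UTF-8 encoding would give the layout the README describes.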
--- a/config.yml
+++ b/config.yml
+aug:
+  image_aug: true
+  masked_language_model: true
+backbone: vgg19_bn
+cnn:
+  hidden: 256
+  ks:
+  - - 2
+    - 2
+  - - 2
+    - 2
+  - - 2
+    - 1
+  - - 2
+    - 1
+  - - 1
+    - 1
+  pretrained: true
+  ss:
+  - - 2
+    - 2
+  - - 2
+    - 2
+  - - 2
+    - 1
+  - - 2
+    - 1
+  - - 1
+    - 1
+dataloader:
+  num_workers: 3
+  pin_memory: true
+dataset:
+  data_root: ./data
+  image_height: 32
+  image_max_width: 512
+  image_min_width: 32
+  name: hw
+  train_annotation: train_annotation.txt
+  valid_annotation: test_annotation.txt
+device: cuda:0
+optimizer:
+  max_lr: 0.0003
+  pct_start: 0.1
+predictor:
+  beamsearch: false
+pretrain: https://vocr.vn/data/vietocr/vgg_transformer.pth
+quiet: false
+seq_modeling: transformer
+trainer:
+  batch_size: 1
+  checkpoint: ./checkpoint/transformerocr_checkpoint.pth
+  export: ./weights/transformerocr.pth
+  iters: 20000
+  log: ./train.log
+  metrics: 2
+  print_every: 20
+  valid_every: 20
+transformer:
+  d_model: 256
+  dim_feedforward: 2048
+  max_seq_length: 1024
+  nhead: 8
+  num_decoder_layers: 6
+  num_encoder_layers: 6
+  pos_dropout: 0.1
+  trans_dropout: 0.1
+vocab: 'aAàÀảẢãÃáÁạẠăĂằẰẳẲẵẴắẮặẶâÂầẦẩẨẫẪấẤậẬbBcCdDđĐeEèÈẻẺẽẼéÉẹẸêÊềỀểỂễỄếẾệỆfFgGhHiIìÌỉỈĩĨíÍịỊjJkKlLmMnNoOòÒỏỎõÕóÓọỌôÔồỒổỔỗỖốỐộỘơƠờỜởỞỡỠớỚợỢpPqQrRsStTuUùÙủỦũŨúÚụỤưƯừỪửỬữỮứỨựỰvVwWxXyYỳỲỷỶỹỸýÝỵỴzZ0123456789!"#$%&''()*+,-./:;<=>?@[\]^_`{|}~ '
+weights: https://vocr.vn/data/vietocr/vgg_transformer.pth
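Because the model can only emit characters listed under `vocab`, it may be worth checking the labels against it before training. The sketch below is an illustration under assumptions: `chars_outside_vocab` is a hypothetical helper, the short vocabulary is a stand-in for the full string above, and how the trainer treats out-of-vocabulary characters is not shown in this commit. Note that in the YAML single-quoted string above, `''` stands for one literal single quote.

```python
def chars_outside_vocab(labels, vocab):
    """Return the sorted characters that occur in labels but not in vocab."""
    allowed = set(vocab)
    return sorted({ch for label in labels for ch in label if ch not in allowed})


# Stand-in vocabulary; in practice, pass the vocab string loaded from config.yml.
sample_vocab = "aAbBcC0123456789 "
sample_labels = ["abc 12", "caé"]
missing = chars_outside_vocab(sample_labels, sample_vocab)
```

An empty result means every label character is covered by the vocabulary.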
--- a/data/images/20140603_0003_BCCTC_tg_0_0.png
+++ b/data/images/20140603_0003_BCCTC_tg_0_0.png
--- a/data/images/20140603_0003_BCCTC_tg_1.png
+++ b/data/images/20140603_0003_BCCTC_tg_1.png
--- a/data/images/20140603_0003_BCCTC_tg_12.png
+++ b/data/images/20140603_0003_BCCTC_tg_12.png
--- a/data/images/20151126_0055_25480_tg_0_5.png
+++ b/data/images/20151126_0055_25480_tg_0_5.png
--- a/data/images/20151208_0064_26558_1_tg_4_3.png
+++ b/data/images/20151208_0064_26558_1_tg_4_3.png
--- a/data/images/20151209_0144_7398_2_tg_1_3.png
+++ b/data/images/20151209_0144_7398_2_tg_1_3.png
--- a/data/images/20160511_0127_9539_2_tg_4_1.png
+++ b/data/images/20160511_0127_9539_2_tg_4_1.png
--- a/data/images/20160523_0165_25464_1_tg_0_1.png
+++ b/data/images/20160523_0165_25464_1_tg_0_1.png
--- a/data/images/20160524_0166_9415_2_tg_0_3.png
+++ b/data/images/20160524_0166_9415_2_tg_0_3.png
--- a/data/images/20160527_0174_25300_tg_1_1.png
+++ b/data/images/20160527_0174_25300_tg_1_1.png
--- a/data/test_annotation.txt
+++ b/data/test_annotation.txt
+images/20140603_0003_BCCTC_tg_0_0.png	Bản chất của thành công
+images/20160527_0174_25300_tg_1_1.png	phó đại diện thường trú của Chương trình Phát triển Liên Hiệp Quốc (UNDP)
\ No newline at end of file
--- a/data/train_annotation.txt
+++ b/data/train_annotation.txt
+images/20140603_0003_BCCTC_tg_1.png	Đã bao giờ bạn tự hỏi thành công là gì mà bao kẻ bỏ cả cuộc đời mình theo đuổi? Phải
+images/20140603_0003_BCCTC_tg_12.png	chăng đó là kết quả hoàn hảo trong công việc, sự chính xác đến từng chi tiết? Hay đó
+images/20151126_0055_25480_tg_0_5.png	niên tên Hà đến gửi 200.000 đ.
+images/20160524_0166_9415_2_tg_0_3.png	thẻ sinh viên của một cô gái, hết hạn năm... 2001. Mọi người cùng cười : " Xui quá hả? Thôi,
+images/20160511_0127_9539_2_tg_4_1.png	xứ này không phải lo khi chết đi để khổ cho con cháu chuyện chôn cất nữa.
+images/20151209_0144_7398_2_tg_1_3.png	xét nghiệm HIV cho kết quả dương tính. Chúng tôi chưa kịp làm được
+images/20151208_0064_26558_1_tg_4_3.png	nước sạch bằng nửa lượng nước hiện có, và đề nghị TP triển khai tiếp dự án
+images/20160523_0165_25464_1_tg_0_1.png	cảm " với dân. " Lần đầu tiên trong một nghị định, những hành vi vi phạm kèm hình
\ No newline at end of file
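Each annotation line above is an image path and a label separated by a tab. As a rough sketch, a line can be validated before training by splitting on the first tab only, so that any later tab stays in the label. The function name `parse_annotation_line` is hypothetical and is not part of vietocr.

```python
def parse_annotation_line(line):
    """Split one annotation line into (image_path, label) on the first tab."""
    path, sep, label = line.rstrip("\n").partition("\t")
    # A missing tab, empty path, or empty label means the line is malformed.
    if not sep or not path or not label:
        raise ValueError(f"expected 'path<TAB>label', got: {line!r}")
    return path, label


parsed = parse_annotation_line(
    "images/20140603_0003_BCCTC_tg_0_0.png\tBản chất của thành công"
)
```

A line that separates the path and label with spaces instead of a tab raises `ValueError`.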
--- a/requirements.txt
+++ b/requirements.txt
+# Based on pytorch/pytorch:2.3.1-cuda11.8-cudnn8-runtime
+vietocr
+albumentations
\ No newline at end of file
--- a/training.Dockerfile
+++ b/training.Dockerfile
+FROM pytorch/pytorch:2.3.1-cuda11.8-cudnn8-runtime
+ENV DEBIAN_FRONTEND=noninteractive
+ENV PYTHONUNBUFFERED=True \
+    PORT=9090
+# Install dependencies
+RUN apt-get update \
+    && apt-get install -y wget libgl1-mesa-glx libglib2.0-0
+WORKDIR /src
+COPY ./requirements.txt /src/requirements.txt
+RUN pip install --no-cache-dir -r requirements.txt
+COPY ./*.py /src/
\ No newline at end of file
--- a/training.docker-compose.yml
+++ b/training.docker-compose.yml
+version: '3.9' 
+services:
+  training-vietocr:
+    build:
+      context: ./
+      dockerfile: training.Dockerfile 
+    volumes:
+      - ./data:/src/data/
+      - ./config.yml:/src/config.yml
+      - ./weights:/src/weights/
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              device_ids: ['0']
+              capabilities: [gpu]
+    command: python training.py --config config.yml
+    # To resume training from a checkpoint, use: python training.py --config config.yml --checkpoint path_to_checkpoint
\ No newline at end of file
--- a/training.py
+++ b/training.py
+import argparse
+from vietocr.model.trainer import Trainer
+from vietocr.tool.config import Cfg
+def main():
+    # Parse the config path and the optional checkpoint path.
+    parser = argparse.ArgumentParser()
+    parser.add_argument('--config', required=True, help='see example at ')
+    parser.add_argument('--checkpoint', required=False, help='your checkpoint')
+    args = parser.parse_args()
+    # Load the YAML config and build the trainer with pretrained weights.
+    config = Cfg.load_config_from_file(args.config)
+    trainer = Trainer(config, pretrained=True)
+    # Resume from a saved checkpoint when one is given.
+    if args.checkpoint:
+        trainer.load_checkpoint(args.checkpoint)
+    trainer.train()
+if __name__ == '__main__':
+    main()
--- a/weights/readme.txt
+++ b/weights/readme.txt
+Trained weights are saved here
\ No newline at end of file