Commit 63ec8d74 authored by Cao Duc Anh

add training tutorial in readme

parent 2c7869af
......@@ -31,6 +31,27 @@ ViHOS chứa 26.476 khoảng thời gian được chú thích bởi con người
}
```
## Model training guide
1. Prepare the data as a CSV file with two columns, "text" and "label". Upload the data files to MinIO (port 9091 on the machine running docker compose), into the "data-annotated" bucket.
2. Call the training API:
```
curl --location '10.3.2.100:8000/start-training' \
--header 'Content-Type: application/json' \
--data '{
"pretrain": ""
}'
```
Default configuration in the file config.yml:
```
training:
  epoch: 200
  batch_size: 8
  load_data_worker: 2
  k_fold: 5
  test_ratio: 0.1
```
3. Monitor the training metrics on the TensorBoard dashboard: http://0.0.0.0:6006/
## Deployment guide with the vivas registry
Minimum system requirements:
- CPU: Intel Core i5
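
Steps 1 and 2 above can be sketched in Python. This is only an illustrative sketch, not the project's own tooling: the helper names `validate_training_csv` and `build_training_request` are hypothetical, the empty-cell check is an assumed sanity rule rather than something the service is known to enforce, and the meaning of the `pretrain` field is not documented here, so it is passed through unchanged. The host, port, path and JSON body mirror the curl command in step 2.

```python
import csv
import io
import json
import urllib.request

# Column names taken from step 1 of the guide.
REQUIRED_COLUMNS = ("text", "label")


def validate_training_csv(stream):
    """Return the number of data rows; raise ValueError if the layout is wrong.

    Hypothetical helper: it only checks the two-column layout described in
    step 1, plus an assumed rule that neither cell may be empty.
    """
    reader = csv.DictReader(stream)
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError("missing column(s): " + ", ".join(missing))
    rows = 0
    for row in reader:
        rows += 1
        for name in REQUIRED_COLUMNS:
            # Short rows yield None for absent cells, so normalise first.
            if not (row.get(name) or "").strip():
                raise ValueError(f"empty '{name}' in data row {rows}")
    return rows


def build_training_request(base_url="http://10.3.2.100:8000", pretrain=""):
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"pretrain": pretrain}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/start-training",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Made-up sample rows; the real label values are not specified here.
    sample = io.StringIO("text,label\nxin chao,0\ntam biet,1\n")
    print("rows:", validate_training_csv(sample))
    request = build_training_request()
    print(request.get_method(), request.full_url)
    # To actually start training, pass `request` to urllib.request.urlopen().
```

Validating the file locally before uploading it to the "data-annotated" bucket may catch layout mistakes earlier than a failed training run would. The request is built without being sent so that the sketch can be run without access to the training server.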
......
......@@ -23,7 +23,7 @@ phobert_base:
max_token_length: 256
training:
-  epoch: 100
+  epoch: 200
batch_size: 8
load_data_worker: 2
k_fold: 5
......
......@@ -2,7 +2,7 @@ version: '3.9'
# Settings and configurations that are common for containers
x-nlpcore-common: &nlpcore-common
-  image: vn-text-moderation:latest
+  image: registry.vivas.vn/vietnam_text_moderation/vn-text-moderation:latest
restart: always
env_file:
- env_file/minio.env
......@@ -26,7 +26,7 @@ services:
- env_file/minio.env
ports:
- "9090:9000"
-  - "9091:9001"
+  - "9091:9001" #UI
volumes:
- ./minio_data:/data
command: server --console-address ":9001" /data
......@@ -50,7 +50,7 @@ services:
- 8080:8080
nlpdata:
-  image: vn-text-moderation-data
+  image: registry.vivas.vn/vietnam_text_moderation/vn-text-moderation-data
restart: always
env_file:
- env_file/sql.env
......
......@@ -7817,3 +7817,202 @@
2024/07/22 03:23:52 [notice] 1#1: start worker process 29
2024/07/22 03:23:52 [notice] 1#1: start worker process 30
2024/07/22 03:23:52 [notice] 1#1: start worker process 31
2024/07/31 04:12:44 [notice] 1#1: using the "epoll" event method
2024/07/31 04:12:44 [notice] 1#1: nginx/1.25.0
2024/07/31 04:12:44 [notice] 1#1: built by gcc 10.2.1 20210110 (Debian 10.2.1-6)
2024/07/31 04:12:44 [notice] 1#1: OS: Linux 6.5.0-45-generic
2024/07/31 04:12:44 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024/07/31 04:12:44 [notice] 1#1: start worker processes
2024/07/31 04:12:44 [notice] 1#1: start worker process 19
2024/07/31 04:12:44 [notice] 1#1: start worker process 20
2024/07/31 04:12:44 [notice] 1#1: start worker process 21
2024/07/31 04:12:44 [notice] 1#1: start worker process 22
2024/07/31 04:12:44 [notice] 1#1: start worker process 23
2024/07/31 04:12:44 [notice] 1#1: start worker process 24
2024/07/31 04:12:44 [notice] 1#1: start worker process 25
2024/07/31 04:12:44 [notice] 1#1: start worker process 26
2024/07/31 04:12:44 [notice] 1#1: start worker process 27
2024/07/31 04:12:44 [notice] 1#1: start worker process 28
2024/07/31 04:12:44 [notice] 1#1: start worker process 29
2024/07/31 04:12:44 [notice] 1#1: start worker process 30
2024/08/09 09:45:11 [notice] 1#1: using the "epoll" event method
2024/08/09 09:45:11 [notice] 1#1: nginx/1.25.0
2024/08/09 09:45:11 [notice] 1#1: built by gcc 10.2.1 20210110 (Debian 10.2.1-6)
2024/08/09 09:45:11 [notice] 1#1: OS: Linux 6.5.0-45-generic
2024/08/09 09:45:11 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024/08/09 09:45:11 [notice] 1#1: start worker processes
2024/08/09 09:45:11 [notice] 1#1: start worker process 20
2024/08/09 09:45:11 [notice] 1#1: start worker process 21
2024/08/09 09:45:11 [notice] 1#1: start worker process 22
2024/08/09 09:45:11 [notice] 1#1: start worker process 23
2024/08/09 09:45:11 [notice] 1#1: start worker process 24
2024/08/09 09:45:11 [notice] 1#1: start worker process 25
2024/08/09 09:45:11 [notice] 1#1: start worker process 26
2024/08/09 09:45:11 [notice] 1#1: start worker process 27
2024/08/09 09:45:11 [notice] 1#1: start worker process 28
2024/08/09 09:45:11 [notice] 1#1: start worker process 29
2024/08/09 09:45:11 [notice] 1#1: start worker process 30
2024/08/09 09:45:11 [notice] 1#1: start worker process 31
2024/08/13 07:43:18 [notice] 1#1: signal 3 (SIGQUIT) received, shutting down
2024/08/13 07:43:18 [notice] 23#23: gracefully shutting down
2024/08/13 07:43:18 [notice] 29#29: gracefully shutting down
2024/08/13 07:43:18 [notice] 30#30: gracefully shutting down
2024/08/13 07:43:18 [notice] 24#24: gracefully shutting down
2024/08/13 07:43:18 [notice] 28#28: gracefully shutting down
2024/08/13 07:43:18 [notice] 25#25: gracefully shutting down
2024/08/13 07:43:18 [notice] 27#27: gracefully shutting down
2024/08/13 07:43:18 [notice] 26#26: gracefully shutting down
2024/08/13 07:43:18 [notice] 31#31: gracefully shutting down
2024/08/13 07:43:18 [notice] 21#21: gracefully shutting down
2024/08/13 07:43:18 [notice] 22#22: gracefully shutting down
2024/08/13 07:43:18 [notice] 20#20: gracefully shutting down
2024/08/13 07:43:18 [notice] 23#23: exiting
2024/08/13 07:43:18 [notice] 31#31: exiting
2024/08/13 07:43:18 [notice] 27#27: exiting
2024/08/13 07:43:18 [notice] 24#24: exiting
2024/08/13 07:43:18 [notice] 30#30: exiting
2024/08/13 07:43:18 [notice] 29#29: exiting
2024/08/13 07:43:18 [notice] 28#28: exiting
2024/08/13 07:43:18 [notice] 25#25: exiting
2024/08/13 07:43:18 [notice] 21#21: exiting
2024/08/13 07:43:18 [notice] 26#26: exiting
2024/08/13 07:43:18 [notice] 22#22: exiting
2024/08/13 07:43:18 [notice] 20#20: exiting
2024/08/13 07:43:18 [notice] 20#20: exit
2024/08/13 07:43:18 [notice] 22#22: exit
2024/08/13 07:43:18 [notice] 27#27: exit
2024/08/13 07:43:18 [notice] 23#23: exit
2024/08/13 07:43:18 [notice] 31#31: exit
2024/08/13 07:43:18 [notice] 28#28: exit
2024/08/13 07:43:18 [notice] 30#30: exit
2024/08/13 07:43:18 [notice] 29#29: exit
2024/08/13 07:43:18 [notice] 26#26: exit
2024/08/13 07:43:18 [notice] 24#24: exit
2024/08/13 07:43:18 [notice] 21#21: exit
2024/08/13 07:43:18 [notice] 25#25: exit
2024/08/13 07:43:19 [notice] 1#1: signal 17 (SIGCHLD) received from 22
2024/08/13 07:43:19 [notice] 1#1: worker process 20 exited with code 0
2024/08/13 07:43:19 [notice] 1#1: worker process 22 exited with code 0
2024/08/13 07:43:19 [notice] 1#1: signal 29 (SIGIO) received
2024/08/13 07:43:19 [notice] 1#1: signal 17 (SIGCHLD) received from 28
2024/08/13 07:43:19 [notice] 1#1: worker process 24 exited with code 0
2024/08/13 07:43:19 [notice] 1#1: worker process 28 exited with code 0
2024/08/13 07:43:19 [notice] 1#1: worker process 30 exited with code 0
2024/08/13 07:43:19 [notice] 1#1: signal 29 (SIGIO) received
2024/08/13 07:43:19 [notice] 1#1: signal 17 (SIGCHLD) received from 30
2024/08/13 07:43:19 [notice] 1#1: signal 17 (SIGCHLD) received from 23
2024/08/13 07:43:19 [notice] 1#1: worker process 23 exited with code 0
2024/08/13 07:43:19 [notice] 1#1: signal 29 (SIGIO) received
2024/08/13 07:43:19 [notice] 1#1: signal 17 (SIGCHLD) received from 31
2024/08/13 07:43:19 [notice] 1#1: worker process 21 exited with code 0
2024/08/13 07:43:19 [notice] 1#1: worker process 26 exited with code 0
2024/08/13 07:43:19 [notice] 1#1: worker process 27 exited with code 0
2024/08/13 07:43:19 [notice] 1#1: worker process 31 exited with code 0
2024/08/13 07:43:19 [notice] 1#1: signal 29 (SIGIO) received
2024/08/13 07:43:19 [notice] 1#1: signal 17 (SIGCHLD) received from 26
2024/08/13 07:43:19 [notice] 1#1: signal 17 (SIGCHLD) received from 29
2024/08/13 07:43:19 [notice] 1#1: worker process 29 exited with code 0
2024/08/13 07:43:19 [notice] 1#1: signal 29 (SIGIO) received
2024/08/13 07:43:19 [notice] 1#1: signal 17 (SIGCHLD) received from 25
2024/08/13 07:43:19 [notice] 1#1: worker process 25 exited with code 0
2024/08/13 07:43:19 [notice] 1#1: exit
2024/08/13 07:44:05 [notice] 1#1: using the "epoll" event method
2024/08/13 07:44:05 [notice] 1#1: nginx/1.25.0
2024/08/13 07:44:05 [notice] 1#1: built by gcc 10.2.1 20210110 (Debian 10.2.1-6)
2024/08/13 07:44:05 [notice] 1#1: OS: Linux 6.5.0-45-generic
2024/08/13 07:44:05 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024/08/13 07:44:05 [notice] 1#1: start worker processes
2024/08/13 07:44:05 [notice] 1#1: start worker process 20
2024/08/13 07:44:05 [notice] 1#1: start worker process 21
2024/08/13 07:44:05 [notice] 1#1: start worker process 22
2024/08/13 07:44:05 [notice] 1#1: start worker process 23
2024/08/13 07:44:05 [notice] 1#1: start worker process 24
2024/08/13 07:44:05 [notice] 1#1: start worker process 25
2024/08/13 07:44:05 [notice] 1#1: start worker process 26
2024/08/13 07:44:05 [notice] 1#1: start worker process 27
2024/08/13 07:44:05 [notice] 1#1: start worker process 28
2024/08/13 07:44:05 [notice] 1#1: start worker process 29
2024/08/13 07:44:05 [notice] 1#1: start worker process 30
2024/08/13 07:44:05 [notice] 1#1: start worker process 31
2024/08/21 08:55:15 [notice] 1#1: signal 3 (SIGQUIT) received, shutting down
2024/08/21 08:55:15 [notice] 26#26: gracefully shutting down
2024/08/21 08:55:15 [notice] 24#24: gracefully shutting down
2024/08/21 08:55:15 [notice] 21#21: gracefully shutting down
2024/08/21 08:55:15 [notice] 29#29: gracefully shutting down
2024/08/21 08:55:15 [notice] 30#30: gracefully shutting down
2024/08/21 08:55:15 [notice] 25#25: gracefully shutting down
2024/08/21 08:55:15 [notice] 23#23: gracefully shutting down
2024/08/21 08:55:15 [notice] 28#28: gracefully shutting down
2024/08/21 08:55:15 [notice] 27#27: gracefully shutting down
2024/08/21 08:55:15 [notice] 31#31: gracefully shutting down
2024/08/21 08:55:15 [notice] 22#22: gracefully shutting down
2024/08/21 08:55:15 [notice] 24#24: exiting
2024/08/21 08:55:15 [notice] 21#21: exiting
2024/08/21 08:55:15 [notice] 29#29: exiting
2024/08/21 08:55:15 [notice] 30#30: exiting
2024/08/21 08:55:15 [notice] 25#25: exiting
2024/08/21 08:55:15 [notice] 26#26: exiting
2024/08/21 08:55:15 [notice] 23#23: exiting
2024/08/21 08:55:15 [notice] 28#28: exiting
2024/08/21 08:55:15 [notice] 27#27: exiting
2024/08/21 08:55:15 [notice] 31#31: exiting
2024/08/21 08:55:15 [notice] 22#22: exiting
2024/08/21 08:55:15 [notice] 20#20: gracefully shutting down
2024/08/21 08:55:15 [notice] 20#20: exiting
2024/08/21 08:55:15 [notice] 30#30: exit
2024/08/21 08:55:15 [notice] 29#29: exit
2024/08/21 08:55:15 [notice] 26#26: exit
2024/08/21 08:55:15 [notice] 21#21: exit
2024/08/21 08:55:15 [notice] 31#31: exit
2024/08/21 08:55:15 [notice] 22#22: exit
2024/08/21 08:55:15 [notice] 27#27: exit
2024/08/21 08:55:15 [notice] 20#20: exit
2024/08/21 08:55:15 [notice] 24#24: exit
2024/08/21 08:55:15 [notice] 25#25: exit
2024/08/21 08:55:15 [notice] 28#28: exit
2024/08/21 08:55:15 [notice] 23#23: exit
2024/08/21 08:55:15 [notice] 1#1: signal 17 (SIGCHLD) received from 26
2024/08/21 08:55:15 [notice] 1#1: worker process 25 exited with code 0
2024/08/21 08:55:15 [notice] 1#1: worker process 26 exited with code 0
2024/08/21 08:55:15 [notice] 1#1: worker process 22 exited with code 0
2024/08/21 08:55:15 [notice] 1#1: worker process 28 exited with code 0
2024/08/21 08:55:15 [notice] 1#1: signal 29 (SIGIO) received
2024/08/21 08:55:15 [notice] 1#1: signal 17 (SIGCHLD) received from 22
2024/08/21 08:55:15 [notice] 1#1: signal 17 (SIGCHLD) received from 23
2024/08/21 08:55:15 [notice] 1#1: worker process 23 exited with code 0
2024/08/21 08:55:15 [notice] 1#1: worker process 30 exited with code 0
2024/08/21 08:55:15 [notice] 1#1: signal 29 (SIGIO) received
2024/08/21 08:55:15 [notice] 1#1: signal 17 (SIGCHLD) received from 30
2024/08/21 08:55:15 [notice] 1#1: worker process 27 exited with code 0
2024/08/21 08:55:15 [notice] 1#1: signal 29 (SIGIO) received
2024/08/21 08:55:15 [notice] 1#1: signal 17 (SIGCHLD) received from 27
2024/08/21 08:55:16 [notice] 1#1: signal 17 (SIGCHLD) received from 21
2024/08/21 08:55:16 [notice] 1#1: worker process 21 exited with code 0
2024/08/21 08:55:16 [notice] 1#1: worker process 29 exited with code 0
2024/08/21 08:55:16 [notice] 1#1: worker process 24 exited with code 0
2024/08/21 08:55:16 [notice] 1#1: signal 29 (SIGIO) received
2024/08/21 08:55:16 [notice] 1#1: signal 17 (SIGCHLD) received from 24
2024/08/21 08:55:16 [notice] 1#1: signal 17 (SIGCHLD) received from 31
2024/08/21 08:55:16 [notice] 1#1: worker process 31 exited with code 0
2024/08/21 08:55:16 [notice] 1#1: signal 29 (SIGIO) received
2024/08/21 08:55:16 [notice] 1#1: signal 17 (SIGCHLD) received from 20
2024/08/21 08:55:16 [notice] 1#1: worker process 20 exited with code 0
2024/08/21 08:55:16 [notice] 1#1: exit
2024/08/21 08:55:37 [notice] 1#1: using the "epoll" event method
2024/08/21 08:55:37 [notice] 1#1: nginx/1.25.0
2024/08/21 08:55:37 [notice] 1#1: built by gcc 10.2.1 20210110 (Debian 10.2.1-6)
2024/08/21 08:55:37 [notice] 1#1: OS: Linux 6.5.0-45-generic
2024/08/21 08:55:37 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024/08/21 08:55:37 [notice] 1#1: start worker processes
2024/08/21 08:55:37 [notice] 1#1: start worker process 20
2024/08/21 08:55:37 [notice] 1#1: start worker process 21
2024/08/21 08:55:37 [notice] 1#1: start worker process 22
2024/08/21 08:55:37 [notice] 1#1: start worker process 23
2024/08/21 08:55:37 [notice] 1#1: start worker process 24
2024/08/21 08:55:37 [notice] 1#1: start worker process 25
2024/08/21 08:55:37 [notice] 1#1: start worker process 26
2024/08/21 08:55:37 [notice] 1#1: start worker process 27
2024/08/21 08:55:37 [notice] 1#1: start worker process 28
2024/08/21 08:55:37 [notice] 1#1: start worker process 29
2024/08/21 08:55:37 [notice] 1#1: start worker process 30
2024/08/21 08:55:37 [notice] 1#1: start worker process 31