[Triton Inference Server] Updating models without a server reload: the --model-control-mode option


What the model-control-mode option does in Triton Inference Server

Until now, whenever I updated a model on a Triton server, I shut the server down and started it again so the new model would load. It turns out there is an option for this: "--model-control-mode".

"--model-control-mode= " option에는 3가지가 있는데 default인 None과 EXPLICIT, 그리고 POLL이 있다.

With poll, Triton periodically checks the model repository, and when a model is added, changed, or updated there, it automatically loads the changed part without a server restart.
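One thing to keep in mind with polling is that Triton can scan the repository while files are still being copied, so it may see a half-written model. A possible way to reduce that risk, sketched below, is to stage the new version directory outside the repository and move it into place with a single rename. The function name publish_model_version and the staging location are assumptions for this example, and the sketch assumes the staging directory and the repository are on the same filesystem so the rename is atomic.

```python
import os
import shutil
import tempfile


def publish_model_version(model_file, repo_dir, model_name, version,
                          staging_root=None):
    """Add <repo_dir>/<model_name>/<version>/ with model_file inside it.

    Hypothetical helper: the file is copied into a staging directory
    first and then renamed into the repository in one step.
    """
    model_dir = os.path.join(repo_dir, model_name)
    if not os.path.isdir(model_dir):
        raise FileNotFoundError(f"model directory not found: {model_dir}")
    final_dir = os.path.join(model_dir, str(version))
    if os.path.exists(final_dir):
        raise FileExistsError(f"version already exists: {final_dir}")

    # Assumption: the parent of the repository is on the same filesystem,
    # and staging there keeps the half-copied files out of Triton's view.
    if staging_root is None:
        staging_root = os.path.dirname(os.path.abspath(repo_dir))
    staging = tempfile.mkdtemp(prefix="triton-staging-", dir=staging_root)
    try:
        shutil.copy2(model_file,
                     os.path.join(staging, os.path.basename(model_file)))
        # mkdtemp creates the directory as 0700; make it readable by the
        # user the server runs as.
        os.chmod(staging, 0o755)
        # os.rename fails across filesystems instead of copying silently.
        os.rename(staging, final_dir)
    except BaseException:
        shutil.rmtree(staging, ignore_errors=True)
        raise
    return final_dir
```

Whether the new version is actually served still depends on the model's version policy and on config.pbtxt, which this sketch does not touch.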

For a more detailed explanation, see the documentation at the link below.

https://github.com/triton-inference-server/server/blob/main/docs/


Running with the poll option

When the Triton server starts, it loads the models that are already in the model repository.

If you then add a new model to the model repository, a "successfully loaded" message appears in the server log.
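Besides watching the log, the result can also be checked over Triton's HTTP endpoint for model readiness, GET v2/models/<name>/ready. The following is a minimal sketch using only the standard library; the function names are made up for this example, and the base URL (for instance http://localhost:8000, Triton's default HTTP port) is an assumption about the deployment.

```python
import urllib.error
import urllib.request
from urllib.parse import quote


def model_ready_url(base_url, model_name, version=None):
    """Build the readiness URL for a model, optionally for one version."""
    url = f"{base_url.rstrip('/')}/v2/models/{quote(model_name, safe='')}"
    if version is not None:
        url += f"/versions/{version}"
    return url + "/ready"


def is_model_ready(base_url, model_name, version=None, timeout=5.0):
    """Return True if the server answers the readiness check with 200.

    Any HTTP error status or connection failure counts as not ready.
    """
    url = model_ready_url(base_url, model_name, version)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

In poll mode a freshly copied model only becomes ready after the next repository scan, so a caller would typically retry is_model_ready for a while before giving up.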

If something about the uploaded model is wrong, the load fails and the log shows an error instead.

The error I ran into was caused by a TensorRT engine that had been converted on a different GPU from the one the server runs on.
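When a load fails like this, the reason is also exposed through Triton's model repository index endpoint (POST v2/repository/index), which lists each model with its state and, on failure, a reason string. The sketch below fetches the index and picks out the entries that are not available. The function names are made up for this example, and treating every UNAVAILABLE entry as a failure is a simplification, since an intentionally unloaded model is reported the same way.

```python
import json
import urllib.request


def fetch_repository_index(base_url, timeout=5.0):
    """POST to the repository index endpoint and return the parsed list."""
    req = urllib.request.Request(
        f"{base_url.rstrip('/')}/v2/repository/index",
        data=b"{}",
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def unavailable_models(index):
    """Return (name, version, reason) for entries in state UNAVAILABLE."""
    return [
        (entry.get("name"), entry.get("version"), entry.get("reason", ""))
        for entry in index
        if entry.get("state") == "UNAVAILABLE"
    ]
```

Calling unavailable_models(fetch_repository_index("http://localhost:8000")) against a running server gives a quick overview without scrolling through the log; the URL is again an assumption about the setup.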
