# API Collection
This project gathers all of the APIs in one place, integrating visual analysis, chat, and speech processing.
## Changelog
- 20250410: Deployed to Docker
- 20250403: Updates made while deploying to 222.186.20.67
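## Project structure
```
API/
├── api/                      # Visual analysis and processing module
│   ├── cpm_analyze.py        # CPM_OCR analysis
│   ├── qwenvl_analyze.py     # QwenVL_OCR analysis
│   ├── cpm_scene.py          # CPM scene analysis
│   ├── qwenvl_scene.py       # QwenVL scene analysis
│   ├── compare.py            # insightface facial feature extraction
│   ├── yolo.py               # YOLO object detection
│   ├── face.py               # Face detection
│   ├── fall.py               # Fall detection
│   ├── pose.py               # Pose estimation
│   ├── media.py              # mediapipe facial feature extraction
│   ├── start_services.sh     # Start all services with one command
│   └── stop.sh               # Stop all services with one command
├── api_chat/                 # Chat and speech processing module
│   ├── chat.py               # Chat
│   ├── tts.py                # Text-to-speech
│   ├── asr.py                # Speech recognition
│   ├── GPT_SoVITS/           # GPT_SoVITS model integration
│   ├── sample/               # OpenBMB model: learns a voice timbre (timbre + text content)
│   ├── tools/                # GPT_SoVITS model: utility functions
│   ├── runtime/              # GPT_SoVITS model: runtime functions
│   ├── docs/                 # GPT_SoVITS model: documentation
│   └── weight.json           # GPT_SoVITS model: weights
│
├── producer_chat/            # Chat producer
├── producer/                 # Algorithm producer, dispatches tasks
└── README                    # Documentation
```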
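## Main features
### Visual analysis module (api/): supports images and video
- Object detection: yolov8x
- Face detection: yolov8n-face
- Facial feature extraction: insightface, mediapipe
- Pose estimation: yolov8x-pose
- Fall detection: yolov8n-fall
- Scene understanding and OCR analysis (based on the MiniCPM-v2.6 and QwenVL-2B models)
### Chat module (api_chat/)
- Text chat (Ollama-qwen2.5:3b)
- Speech recognition (ASR): via the Whisper-large-v3 model
- Text-to-speech (TTS): via the GPT_SoVITS model
- Multi-model support (via Ollama)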
## Usage
### API section http://dev2.obscura.work/v1
1. `producer` directory: the producer, which dispatches tasks
2. Server: 222.186.20.67:8005
3. Kafka configuration: 222.186.20.67:9092
   Topic assignments:
   - yolo: "yolo"
   - pose: "pose"
   - qwenvl: "qwenvl"
   - qwenvl_analyze: "qwenvl_analyze"
   - cpm: "cpm"
   - cpm_analyze: "cpm_analyze"
   - fall: "fall"
   - face: "face"
   - mediapipe: "mediapipe"
   - compare: "compare"
4. Redis configuration: 222.186.20.67:6379
   DB assignments:
   - 4: "yolo"
   - 5: "pose"
   - 9: "qwenvl"
   - 32: "qwenvl_analyze"
   - 8: "cpm"
   - 31: "cpm_analyze"
   - 6: "fall"
   - 7: "face"
   - 10: "mediapipe"
   - 30: "compare"
5. Model configuration:
   - YOLO = "/obscura/models/yolo11n.pt"
   - POSE = "/obscura/models/yolo11n-pose.pt"
   - QWEN = "/obscura/models/qwen/Qwen2.5-VL-7B-Instruct"
   - FALL = "/obscura/models/yolov8n-fall.pt"
   - FACE = "/obscura/models/yolo11n-face.pt"
   - MEDIAPIPE = "/obscura/models/face_landmarker.task"
   - CPM(ollama) = "https://222.186.20.67:11435/api/generate"
6. Directories for uploaded files and saved results:
   - UPLOAD_DIR = "/obscura/task/upload"
   - RESULT_DIR = "/obscura/task/result"
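
The Kafka topic and Redis DB assignments listed above can be kept in a single lookup table so that a producer and its consumer cannot drift apart. The following is only a minimal sketch: `Route`, `ROUTES`, `redis_url`, and `duplicate_dbs` are hypothetical names chosen for illustration and are not taken from the project's `config.py`.

```python
from collections import Counter
from dataclasses import dataclass

# Addresses copied from the list above.
KAFKA_BOOTSTRAP = "222.186.20.67:9092"
REDIS_HOST = "222.186.20.67"
REDIS_PORT = 6379


@dataclass(frozen=True)
class Route:
    """Hypothetical record pairing a Kafka topic with its Redis DB index."""
    topic: str
    redis_db: int


# One entry per service, mirroring the topic and DB assignments above.
ROUTES = {
    "yolo": Route("yolo", 4),
    "pose": Route("pose", 5),
    "fall": Route("fall", 6),
    "face": Route("face", 7),
    "cpm": Route("cpm", 8),
    "qwenvl": Route("qwenvl", 9),
    "mediapipe": Route("mediapipe", 10),
    "compare": Route("compare", 30),
    "cpm_analyze": Route("cpm_analyze", 31),
    "qwenvl_analyze": Route("qwenvl_analyze", 32),
}


def redis_url(service: str) -> str:
    """Build a redis:// URL that selects the DB assigned to a service."""
    route = ROUTES[service]  # raises KeyError for an unknown service
    return f"redis://{REDIS_HOST}:{REDIS_PORT}/{route.redis_db}"


def duplicate_dbs(routes: dict) -> list:
    """Return DB indices that are assigned to more than one service."""
    counts = Counter(route.redis_db for route in routes.values())
    return sorted(db for db, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(redis_url("yolo"))
    print("duplicate DBs:", duplicate_dbs(ROUTES))
```

A stock Redis configuration exposes only 16 logical databases (indices 0 to 15), so indices such as 30 to 32 presumably rely on a raised `databases` setting on this server.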
### API_Chat section http://dev2.obscura.work/v1_chat
1. `producer_chat` directory: the chat producer
2. Server: 222.186.20.67:8008
3. Kafka configuration: 222.186.20.67:9092
   Topic assignments:
   - asr: "asr"
   - chat: "chat"
   - tts: "tts"
4. Redis configuration: 222.186.20.67:6379
   DB assignments:
   - 2: "api key"
   - 3: "API usage"
   - 11: "task records"
   - 12: "asr records"
   - 13: "chat records"
   - 14: "tts records"

   Languages:
   - 63: "session_zh" (Chinese)
   - 62: "session_en" (English)
   - 61: "session_ko" (Korean)

   Voices:
   - 15: "girl"
   - 16: "woman"
   - 17: "man"
   - 18: "leijun"
   - 19: "dufu"
   - 20: "hejiong"
   - 21: "mahuateng"
   - 22: "lidan"
   - 23: "dabing"
   - 24: "luoxiang"
   - 25: "xuzhiyuan"
   - 26: "yuhua"
   - 27: "liuzhenyun"
5. Directory for saved audio files:
   - OUTPUT_PATH=/obscura/task/audio_files # directory where audio files are saved
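
The voice and session-language DB assignments above can likewise be expressed as lookup tables. This is a sketch under the assumption that each voice name and each session language maps directly to the Redis DB index listed; `VOICE_DBS`, `SESSION_DBS`, `voice_db`, and `session_db` are hypothetical names, not identifiers from `producer_chat`.

```python
# Redis DB index per voice, copied from the list above.
VOICE_DBS = {
    "girl": 15, "woman": 16, "man": 17, "leijun": 18, "dufu": 19,
    "hejiong": 20, "mahuateng": 21, "lidan": 22, "dabing": 23,
    "luoxiang": 24, "xuzhiyuan": 25, "yuhua": 26, "liuzhenyun": 27,
}

# Redis DB index per session language, copied from the list above.
SESSION_DBS = {"zh": 63, "en": 62, "ko": 61}


def voice_db(voice: str) -> int:
    """Return the Redis DB index for a voice, or raise ValueError."""
    try:
        return VOICE_DBS[voice]
    except KeyError:
        known = ", ".join(sorted(VOICE_DBS))
        raise ValueError(f"unknown voice {voice!r}; known voices: {known}") from None


def session_db(lang: str) -> int:
    """Return the Redis DB index for a session language, or raise ValueError."""
    try:
        return SESSION_DBS[lang]
    except KeyError:
        known = ", ".join(sorted(SESSION_DBS))
        raise ValueError(f"unknown language {lang!r}; known languages: {known}") from None


if __name__ == "__main__":
    print(voice_db("leijun"), session_db("ko"))
```

Failing loudly on an unknown voice or language, rather than falling back to a default DB, keeps a typo from silently writing into another voice's database.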
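## Notes
- Follow the Redis DB assignments
- Follow the Kafka topic assignments
- Check the environment settings in the producer's config.py
- Check the environment settings in producer_chat's .env
- Make sure the model weight files are configured correctly
- Check the API keys and environment variables
- Keep an eye on resource usage and performance optimization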
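
The weight-file check in the notes above can be automated with a small preflight script that runs before the services start. This is a sketch only: `MODEL_PATHS` repeats the local paths from the model configuration list, and `missing_models` is a hypothetical helper, not part of the project.

```python
from pathlib import Path

# Local model paths copied from the model configuration list.
MODEL_PATHS = {
    "YOLO": "/obscura/models/yolo11n.pt",
    "POSE": "/obscura/models/yolo11n-pose.pt",
    "QWEN": "/obscura/models/qwen/Qwen2.5-VL-7B-Instruct",
    "FALL": "/obscura/models/yolov8n-fall.pt",
    "FACE": "/obscura/models/yolo11n-face.pt",
    "MEDIAPIPE": "/obscura/models/face_landmarker.task",
}


def missing_models(paths: dict) -> list:
    """Return the names whose file or directory does not exist on disk."""
    return [name for name, path in paths.items() if not Path(path).exists()]


if __name__ == "__main__":
    missing = missing_models(MODEL_PATHS)
    if missing:
        # Exit with a non-zero status so a start script can abort early.
        raise SystemExit("missing model files: " + ", ".join(missing))
    print("all model paths found")
```

Such a script could be called at the top of `start_services.sh`, so that a missing weight file stops the launch instead of surfacing later as a failed task.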