2025-01-23 09:05:45 +00:00
parent b2b65e14bb
commit f3a92bb514
60 changed files with 1205 additions and 20735 deletions
+8 -15
@@ -1,24 +1,17 @@
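# Ignore all contents of each folder, but keep the folder itself # Ignore all contents of each folder, but keep the folder itself
# Raw video folder # Raw video folder
recordings/* files/recordings/*
!recordings/.gitkeep !files/recordings/.gitkeep
# Raw image folder # Raw image folder
images/* files/images/*
!images/.gitkeep !files/images/.gitkeep
# Folder of human bodies cropped from raw images # Folder of human bodies cropped from raw images
crop/* files/crop/*
!crop/.gitkeep !files/crop/.gitkeep
# Face comparison results
result/*
!result/.gitkeep
# Face comparison folder # Face comparison folder
data/* files/data/*
!data/.gitkeep !files/data/.gitkeep
# Test dataset
dataset/*
!dataset/.gitkeep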
+96 -53
@@ -1,72 +1,115 @@
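# Surveillance Video Analysis System # Surveillance Video Analysis System
## Project Overview ## Project Overview
This project is a behavior recognition system based on a vision-language model, used to recognize and analyze human behavior This project is a behavior recognition system based on a vision-language model, used to recognize and analyze human behavior in surveillance video. The system comprises several modules, including video capture, face recognition, behavior analysis, and a web front end
## Changelog ## Changelog
- 2025-01-23 Restructured the code
- 2025-01-15 Moved behavior analysis report generation to a background task so it no longer blocks other routes; switched the API from siliconflow to deepseek - 2025-01-15 Moved behavior analysis report generation to a background task so it no longer blocks other routes; switched the API from siliconflow to deepseek
- 2025-01-14 Fixed timeline events being shown incompletely after filtering, updated the timeline color configuration, and refined the action categories - 2025-01-14 Fixed timeline events being shown incompletely after filtering, updated the timeline color configuration, and refined the action categories
- 2025-01-12 Initial commit - 2025-01-12 Initial commit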
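## Folder Structure ### Backend module (app/)
WEB/ - `main.py`: FastAPI web application entry point
├── recordings/ # Raw video file storage - `models.py`: Data models and business logic
├── images/ # Raw image file storage - `config.py`: System configuration file
├── crop/ # Cropped human-body image storage
├── data/ # Face comparison data storage
├── dataset/ # Test dataset, can be ignored
├── test_history/ # Test code, can be ignored
├── web/ # Front-end page history
|# Backend
├── main.py # Backend main program
├── monitor_images.py # Video frame extraction
├── pose_monitor.py # Human body detection
├── face_monitor.py # Face recognition
├── emb.py # Face data registration
├── qwen_monitor.py # Qwen analysis: behavior recognition
├── video_realtime.py # Real-time video monitoring (local)
|# Frontend
├── info.json # Qwen analysis: behavior and environment category configuration
├── cls.js # Web page timeline: behavior categories and color configuration
├── color_test.html # Behavior category color test page, can be ignored
└── web.html # Front-end page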
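## Usage
### Folder Notes
1. Put raw video files into the `recordings` directory
2. The system automatically processes the videos and saves the extracted frames to the `images` directory
3. The system automatically processes the images and saves the cropped human-body images to the `crop` directory
4. Face registration images are stored in the `data` directory as `data/face_name/face_id.jpg`; one person may have several images
5. Front-end page history is kept in the `web` directory
6. `test_history` holds code used during testing and can be ignored
7. `dataset` holds datasets used during testing and can be ignored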
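### File Notes ### Function modules (function/)
1. **Video capture module**
- `rtsp2video.py`: RTSP video stream capture and recording
- `video2image.py`: Video key-frame extraction
#### Front-end display https://beta.obscura.work/web/web.html 2. **Face analysis module**
1. cls.js: web page timeline, behavior categories and color configuration - `face.py`: Face detection and recognition
2. web.html: front-end page - `face-emb.py`: Face feature extraction
3. color_test.html: behavior category color test page, can be ignored - `pose.py`: Human pose detection
#### Backend processing https://dev.obscura.work/web 3. **Behavior analysis module**
- `qwen.py`: Video content analysis based on the Qwen2-VL-7B model
- `info.json`: Behavior and scene category definitions
1. main.py: main program ### Front-end interface (frontend/)
- Server: 222.186.10.253 - `web.html`: Main interface file
- Port: 6005 - `cls.js`: Behavior categories and color configuration
2. info.json: Qwen analysis, behavior and environment category configuration ### Data storage (files/)
3. qwen_monitor.py: behavior recognition; watches the recordings directory for incoming data and saves results to redis - `recordings/`: Video recording directory
4. monitor_images.py: video frame extraction; watches the recordings directory for incoming data and saves extracted frames to the images directory - `images/`: Image storage directory
5. pose_monitor.py: human body detection; uses Yolo-pose to watch the images directory for incoming data, then detects and crops bodies into the crop directory - `crop/`: Cropped human-body image directory
6. face_monitor.py: face recognition; watches the crop directory for incoming data and saves recognition results to redis - `data/`: Face feature data directory
7. emb.py: face data registration; data is saved to redis
8. del.ipynb: some test code, can be ignored ## Main Features
9. (local) video_realtime.py: real-time video monitoring; monitors and captures the RTSP stream locally, saves it to the local recordings directory, and uploads it to the server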
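### 1. Video capture
- Supports multi-channel RTSP video stream capture
- Records video clips on a schedule
- Automatic SFTP upload and backup
### 2. Face recognition
- Real-time face detection
- Face feature extraction and matching
- Identity recognition and logging
### 3. Behavior analysis
- Human pose detection
- Behavior recognition and classification
- Scene understanding
- Abnormal behavior detection
### 4. Web API
- Camera data queries
- Analysis report generation
- Historical data download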
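## Deployment
### Server configuration
- Backend API: https://dev.obscura.work/web
- Front-end page: https://beta.obscura.work/web/web.html
- Server: 222.186.10.253
- Port: 6005
### Redis configuration ### Redis configuration
1. Server: 222.186.10.253 - Server: 222.186.10.253
2. Databases used: cameras: 207-211, face registration data: 212, analysis reports: 213 - Database mapping:
- Camera data: 210, 211
- Face registration data: 212
- Analysis reports: 213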
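### Deployment steps
1. Make sure all parameters in the configuration file are set correctly
2. Start the Redis service
3. Run all services with the startup script:
```bash
chmod +x start.sh stop.sh # grant execute permission on first run
./start.sh # start all services
./stop.sh # stop all services
```
Or start each module manually, in order:
- `python app/function/rtsp2video.py`
- `python app/function/video2image.py`
- `python app/function/pose.py`
- `python app/function/face.py`
- `python app/function/qwen.py`
4. Start the web application: `python app/app/main.py`
### Service management
- The startup script `start.sh` automatically:
- Creates the log directory
- Checks that the required files exist
- Starts all services in order
- Monitors service startup status
- All logs are saved under the `logs/` directory
- The stop script `stop.sh`:
- Stops all Python processes in order
- Checks whether each process stopped successfully
- Lists any processes that failed to stop cleanly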
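## Notes ## Notes
- All directories are kept under version control via `.gitkeep` files - All directories are kept under version control via `.gitkeep` files
- The actual data files in each directory are ignored via `.gitignore` - The actual data files in each directory are ignored via `.gitignore`
- Make sure the system has enough GPU memory for model inference
- Clean up expired data regularly
- Monitor system resource usage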
+90
@@ -0,0 +1,90 @@
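# Behavior Analysis System
## Project Overview
This is a video-analysis-based behavior recognition system that recognizes and analyzes people's behavior in surveillance video and generates analysis reports. The system comprises several modules, including video capture, face recognition, behavior analysis, and a web API.
## System Architecture
### Web application layer (app/)
- `main.py`: FastAPI web application entry point; exposes the REST API
- `models.py`: Data models and business logic
- `config.py`: System configuration file
### Function module layer (function/)
1. **Video capture module**
- `rtsp2video.py`: RTSP video stream capture and recording
- `video2image.py`: Video key-frame extraction
2. **Face analysis module**
- `face.py`: Face detection and recognition
- `face-emb.py`: Face feature extraction
- `pose.py`: Human pose detection
3. **Behavior analysis module**
- `qwen.py`: Video content analysis based on the Qwen2-VL-7B model
- `info.json`: Behavior and scene category definitions
## Main Features
### 1. Video capture
- Supports multi-channel RTSP video stream capture
- Records video clips on a schedule
- Automatic SFTP upload and backup
### 2. Face recognition
- Real-time face detection
- Face feature extraction and matching
- Identity recognition and logging
### 3. Behavior analysis
- Human pose detection
- Behavior recognition and classification
- Scene understanding
- Abnormal behavior detection
### 4. Web API
- Camera data queries
- Analysis report generation
- Historical data download
## Data Storage
- Processing results are stored in Redis
- Multiple Redis databases are used so that data from different cameras stays isolated
- Cached data is given an expiry time
## Configuration
System configuration is centralized in `config.py` and includes:
- Redis connection settings
- Camera-to-database mapping
- Behavior category definitions
- Cache policy settings
- AI model configuration
## Deployment Requirements
- Python 3.8+
- CUDA support
- Redis server
- Enough storage space for video files
## Usage
1. Make sure all parameters in the configuration file are set correctly
2. Start the Redis service
3. Run the video capture module: `python rtsp2video.py`
4. Run the analysis modules:
- `python video2image.py`
- `python pose.py`
- `python face.py`
- `python qwen.py`
5. Start the web service: `python main.py`
## API Endpoints
- GET `/web/face/{camera_id}/data`: Get face recognition data
- GET `/web/{camera_id}/data`: Get behavior analysis data
- GET `/web/report/{date}`: Get the analysis report for a date
- GET `/web/report/download/{date}`: Download the analysis report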
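As a rough illustration of how a client might address these endpoints, the sketch below only builds the request URLs. The helper names `data_url` and `report_url` are hypothetical, and the host is an assumption based on the backend address given in the top-level README. The date formats follow what `main.py` parses: `YYYYMMDD` as a query parameter for the data endpoints and `YYYY-MM-DD` as a path segment for the report endpoints.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode

# Assumption: the API is served from this host and the routes keep their /web prefix.
BASE_URL = "https://dev.obscura.work"


def data_url(camera_id: str, day: Optional[date] = None, face: bool = False) -> str:
    """Build the URL of the behavior (or face) data endpoint for one camera."""
    prefix = "/web/face" if face else "/web"
    url = f"{BASE_URL}{prefix}/{camera_id}/data"
    if day is not None:
        # The data endpoints take the date as a YYYYMMDD query parameter.
        url += "?" + urlencode({"date": day.strftime("%Y%m%d")})
    return url


def report_url(day: date, download: bool = False) -> str:
    """Build the URL of the report (or report download) endpoint."""
    middle = "/download" if download else ""
    # The report endpoints take the date as a YYYY-MM-DD path segment.
    return f"{BASE_URL}/web/report{middle}/{day.isoformat()}"
```

Omitting `day` in `data_url` leaves the choice of date to the server, which falls back to the current date.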
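## Notes
- Make sure the system has enough GPU memory for model inference
- Clean up expired data regularly
- Monitor system resource usage
- Deal with error logs promptly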
+91
@@ -0,0 +1,91 @@
from redis import Redis
from datetime import timedelta
from openai import OpenAI

# Redis configuration
REDIS_CONFIG = {
    "host": "222.186.10.253",
    "port": 6379,
    "password": "Obscura@2024",
    "decode_responses": True
}

# Mapping from camera ID to Redis database number
CAMERA_DB_MAPPING = {
    "A01": 210,
    "B02": 211,
    "C03": 212,
    "report": 213  # analysis reports use database 213
}

# Create one Redis client per entry in CAMERA_DB_MAPPING
def create_redis_connections():
    redis_connections = {}
    for camera_id, db in CAMERA_DB_MAPPING.items():
        redis_connections[camera_id] = Redis(
            **REDIS_CONFIG,
            db=db
        )
    return redis_connections

# AI API configuration (the name dates from SiliconFlow; base_url now points at DeepSeek)
SILICON_FLOW_CONFIG = {
    "base_url": "https://api.deepseek.com/v1",
    "api_key": "sk-3027fb3c810b4e17985fa397d41250b9"
}

# Initialize the OpenAI client
ai_client = OpenAI(
    base_url=SILICON_FLOW_CONFIG["base_url"],
    api_key=SILICON_FLOW_CONFIG["api_key"]
)

# Behavior category configuration
BEHAVIOR_CATEGORIES = {
    "基础动作": [
        "", "站立", "站着",
        "", "走路", "散步", "行走", "徒步",
        "", "奔跑", "慢跑",
        "", "坐下", "坐着",
        "", "蹲下", "蹲着",
        "", "转身", "转头", "回头", "旋转", "转向", "转弯",
        "", "", "", ""
    ],
    "日常生活": [
        "", "食用", "吃饭", "吃零食", "吃东西", "用餐", "咀嚼", "",
        "喝水", "喝牛奶", "喝茶", "饮用", "喝咖啡", "", "饮水",
        "穿衣服", "穿裤子", "穿鞋", "戴帽子", "戴口罩", "戴围巾",
        "", "", "睡觉", "休息", "打哈欠",
        "洗澡", "刷牙", "洗手", "洗涤", "清洁", "擦洗",
        "吃药", "喝药", "服药"
    ],
    "社交活动": [
        "说话", "交流", "演讲", "谈话", "聊天", "采访", "社交",
        "打麻将", "打牌", "玩手机", "玩电脑", "玩游戏", "赌博",
        "", "大笑", "微笑", "哭泣", "咯咯笑", "皱眉"
    ],
    "工作学习": [
        "读书", "阅读", "看书",
        "写作", "写字", "",
        "工作", "学习", "使用电脑", "使用笔记本电脑", "使用手机", "开会", "打字",
        "画画", "绘画", "摄影", "素描"
    ],
    "运动娱乐": [
        "", "跳跃", "跳舞", "游泳", "运动", "健身", "锻炼"
    ],
    "异常行为": [
        '打架', '斗殴', '摔倒', '晕倒', '昏倒', '跌倒', '滑倒',
        '', '', '受伤', '暴力', '攻击', '威胁', '破坏',
        '偷窃', '抢夺', '游荡', '徘徊', '尾随', '骚扰'
    ],
    "其他": ["其他"]
}

# List of abnormal behaviors
ABNORMAL_BEHAVIORS = BEHAVIOR_CATEGORIES["异常行为"]

# Redis cache configuration
REDIS_CACHE_CONFIG = {
    "report_expiry": timedelta(days=30),  # reports are cached for 30 days
    "task_status_expiry": timedelta(hours=1)  # task status is cached for 1 hour
}
+179
@@ -0,0 +1,179 @@
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from datetime import datetime
from typing import Optional
import json
from concurrent.futures import ThreadPoolExecutor

from .config import create_redis_connections
from .models import (
    get_camera_data_by_date,
    background_generate_report
)

app = FastAPI()

# Configure CORS
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Thread pool executor used for background report generation
executor = ThreadPoolExecutor(max_workers=3)

# Tracks the status of report generation tasks
report_tasks = {}

# Create the Redis clients (one per camera/report database)
redis_connections = create_redis_connections()
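
@app.get("/web/face/{camera_id}/data")
async def get_camera_face_data(camera_id: str, date: Optional[str] = None):
    """Get face data for a camera (date is YYYYMMDD, defaults to today)"""
    # Reject unknown cameras up front; indexing the dict directly would raise KeyError
    if camera_id not in redis_connections:
        raise HTTPException(status_code=400, detail="Invalid camera ID")
    if date is None:
        date = datetime.now().strftime("%Y%m%d")
    return await get_camera_data_by_date(camera_id, date, redis_connections[camera_id], is_face=True)

@app.get("/web/{camera_id}/data")
async def get_camera_data(camera_id: str, date: Optional[str] = None):
    """Get behavior data for a camera (date is YYYYMMDD, defaults to today)"""
    if camera_id not in redis_connections:
        raise HTTPException(status_code=400, detail="Invalid camera ID")
    if date is None:
        date = datetime.now().strftime("%Y%m%d")
    return await get_camera_data_by_date(camera_id, date, redis_connections[camera_id])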

@app.get("/web/report/{date}")
async def get_report_by_date(date: str):
    """Get the analysis report for the given date (YYYY-MM-DD)"""
    try:
        print(f"\n=== 开始处理日期: {date} ===")
        try:
            parsed_date = datetime.strptime(date, "%Y-%m-%d")
            date_no_hyphen = parsed_date.strftime("%Y%m%d")
            print(f"转换后的日期格式: {date_no_hyphen}")
        except ValueError:
            raise HTTPException(
                status_code=400,
                detail="Invalid date format. Please use YYYY-MM-DD"
            )

        report_redis = redis_connections["report"]
        report_key = f"report_{date_no_hyphen}"
        task_key = f"task_status_{date_no_hyphen}"
        print(f"缓存键: {report_key}")

        cached_report = report_redis.get(report_key)
        print(f"是否有缓存: {'是' if cached_report else '否'}")
        if cached_report:
            print("返回缓存数据")
            return {
                "message": "success",
                "data": json.loads(cached_report),
                "source": "cache"
            }

        task_status_json = report_redis.get(task_key)
        if task_status_json:
            task_status = json.loads(task_status_json)
            if task_status["status"] == "running":
                return {
                    "message": "processing",
                    "detail": "报告正在生成中,请稍后再试"
                }
            elif task_status["status"] == "completed":
                report_data = report_redis.get(report_key)
                if report_data:
                    return {
                        "message": "success",
                        "data": json.loads(report_data),
                        "source": "new_generation"
                    }
                else:
                    return {
                        "message": "no_data",
                        "detail": "暂无数据"
                    }
            elif task_status["status"] == "failed":
                return {
                    "message": "error",
                    "detail": f"报告生成失败: {task_status.get('error', '未知错误')}"
                }

        print("启动后台任务生成报告...")
        executor.submit(background_generate_report, date_no_hyphen, redis_connections)
        return {
            "message": "processing",
            "detail": "报告正在生成中,请稍后再试"
        }
    except HTTPException:
        # Re-raise HTTP errors (such as the 400 above) instead of turning them into a 500
        raise
    except Exception as e:
        print(f"Error processing report request: {str(e)}")
        import traceback
        print(f"详细错误信息: {traceback.format_exc()}")
        raise HTTPException(
            status_code=500,
            detail=f"处理报告时发生错误: {str(e)}\n错误类型: {type(e)}"
        )

@app.get("/web/report/download/{date}")
async def download_report(date: str):
    """Download the analysis report for the given date (YYYY-MM-DD)"""
    try:
        try:
            parsed_date = datetime.strptime(date, "%Y-%m-%d")
            date_no_hyphen = parsed_date.strftime("%Y%m%d")
        except ValueError:
            error_msg = "日期格式必须为YYYY-MM-DD(例如:2024-12-31)"
            print(f"日期格式错误: {error_msg}")
            raise HTTPException(
                status_code=400,
                detail=error_msg
            )

        report_key = f"report_{date_no_hyphen}"
        report_redis = redis_connections["report"]
        report_data = report_redis.get(report_key)
        if not report_data:
            error_msg = f"未找到 {date} 的报告数据,请先生成报告"
            print(error_msg)
            raise HTTPException(
                status_code=404,
                detail=error_msg
            )

        try:
            data = json.loads(report_data)
            # The per-camera chart data is not part of the downloadable report
            if "hourly_distribution" in data:
                del data["hourly_distribution"]
            print("成功获取报告数据")
            return {
                "message": "success",
                "data": data
            }
        except json.JSONDecodeError as je:
            error_msg = f"报告数据格式错误: {str(je)}"
            print(error_msg)
            raise HTTPException(
                status_code=500,
                detail=error_msg
            )
    except HTTPException as he:
        raise he
    except Exception as e:
        error_msg = f"处理请求时发生错误: {str(e)}"
        print(error_msg)
        raise HTTPException(
            status_code=500,
            detail=error_msg
        )

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=6005)
+34 -424
@@ -1,70 +1,25 @@
from fastapi import FastAPI, HTTPException from datetime import datetime
from fastapi.middleware.cors import CORSMiddleware
from redis import Redis
from openai import OpenAI
import json import json
from datetime import datetime, timedelta
from typing import Dict, List, Optional
import asyncio import asyncio
import threading from typing import Dict
from concurrent.futures import ThreadPoolExecutor from fastapi import HTTPException
from openai import OpenAI
app = FastAPI() from .config import (
CAMERA_DB_MAPPING,
# 配置CORS BEHAVIOR_CATEGORIES,
app.add_middleware( ABNORMAL_BEHAVIORS,
CORSMiddleware, REDIS_CACHE_CONFIG,
allow_origins=["*"], # 在生产环境中应该设置具体的域名 ai_client
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
) )
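# Create the thread pool executor async def get_camera_data_by_date(camera_id: str, date: str, redis_client, is_face: bool = False):
executor = ThreadPoolExecutor(max_workers=3) # limit the maximum number of concurrent tasks """Fetch all data for a camera on a given day"""
# Tracks the status of report generation tasks
report_tasks = {}
# Mapping between cameras and databases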
CAMERA_DB_MAPPING = {
"camera001": 207,
"camera002": 208,
"camera003": 209,
"A01": 210,
"B02": 211,
"C03": 212,
"report": 213 # 分析报告使用213数据库
}
# 创建Redis连接池
redis_connections = {}
for camera_id, db in CAMERA_DB_MAPPING.items():
redis_connections[camera_id] = Redis(
host="222.186.10.253",
port=6379,
password="Obscura@2024",
db=db,
decode_responses=True
)
@app.get("/web/face/{camera_id}/data")
async def get_camera_data(camera_id: str, date: Optional[str] = None):
"""
Fetch all data for a camera on a given day
"""
try: try:
if camera_id not in CAMERA_DB_MAPPING: if camera_id not in CAMERA_DB_MAPPING:
raise HTTPException(status_code=400, detail="Invalid camera ID") raise HTTPException(status_code=400, detail="Invalid camera ID")
# If no date is given, use the current date
if date is None:
date = datetime.now().strftime("%Y%m%d")
redis_client = redis_connections[camera_id]
# Match keys using the new key format # Match keys using the new key format
pattern = f"face_{camera_id}_{date}_*" pattern = f"{'face_' if is_face else ''}{camera_id}_{date}_*"
all_keys = redis_client.keys(pattern) all_keys = redis_client.keys(pattern)
if not all_keys: if not all_keys:
@@ -75,46 +30,6 @@ async def get_camera_data(camera_id: str, date: Optional[str] = None):
for key in all_keys: for key in all_keys:
data = redis_client.get(key) data = redis_client.get(key)
if data: if data:
# 直接解析JSON数据,无需decode
all_data[key] = json.loads(data)
return {
"message": "success",
"data": all_data,
"total_records": len(all_data)
}
except Exception as e:
raise HTTPException(status_code=500, detail=str(e))
@app.get("/web/{camera_id}/data")
async def get_camera_data(camera_id: str, date: Optional[str] = None):
"""
获取摄像头某天的所有数据
"""
try:
if camera_id not in CAMERA_DB_MAPPING:
raise HTTPException(status_code=400, detail="Invalid camera ID")
# 如果没有指定日期,使用当前日期
if date is None:
date = datetime.now().strftime("%Y%m%d")
redis_client = redis_connections[camera_id]
# 使用新的键格式进行模式匹配
pattern = f"{camera_id}_{date}_*"
all_keys = redis_client.keys(pattern)
if not all_keys:
return {"message": "No data found", "data": None}
# 获取所有键的数据并解析
all_data = {}
for key in all_keys:
data = redis_client.get(key)
if data:
# 直接解析JSON数据,无需decode
all_data[key] = json.loads(data) all_data[key] = json.loads(data)
return { return {
@@ -126,49 +41,43 @@ async def get_camera_data(camera_id: str, date: Optional[str] = None):
except Exception as e: except Exception as e:
raise HTTPException(status_code=500, detail=str(e)) raise HTTPException(status_code=500, detail=str(e))
def background_generate_report(date_no_hyphen: str): def background_generate_report(date_no_hyphen: str, redis_connections):
"""Generate the report in the background""" """Generate the report in the background"""
try: try:
print(f"\n=== 后台任务开始生成报告 {date_no_hyphen} ===") print(f"\n=== 后台任务开始生成报告 {date_no_hyphen} ===")
# 获取事件循环
loop = asyncio.new_event_loop() loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop) asyncio.set_event_loop(loop)
# 更新Redis中的任务状态为运行中
report_redis = redis_connections["report"] report_redis = redis_connections["report"]
task_key = f"task_status_{date_no_hyphen}" task_key = f"task_status_{date_no_hyphen}"
report_redis.setex( report_redis.setex(
task_key, task_key,
timedelta(hours=1), # 状态保存1小时 REDIS_CACHE_CONFIG["task_status_expiry"],
json.dumps({"status": "running"}) json.dumps({"status": "running"})
) )
# 运行异步任务
print("开始调用 generate_daily_report...") print("开始调用 generate_daily_report...")
report = loop.run_until_complete(generate_daily_report(date_no_hyphen)) report = loop.run_until_complete(generate_daily_report(date_no_hyphen, redis_connections))
print(f"报告生成结果: {report.get('message', 'unknown')}") print(f"报告生成结果: {report.get('message', 'unknown')}")
# 如果生成成功,保存到Redis
if report.get("message") != "no_data": if report.get("message") != "no_data":
print("报告生成成功,准备保存到Redis...") print("报告生成成功,准备保存到Redis...")
report_key = f"report_{date_no_hyphen}" report_key = f"report_{date_no_hyphen}"
report_redis.setex( report_redis.setex(
report_key, report_key,
timedelta(days=30), REDIS_CACHE_CONFIG["report_expiry"],
json.dumps(report) json.dumps(report)
) )
print(f"报告已保存到Redis,键名: {report_key}") print(f"报告已保存到Redis,键名: {report_key}")
# 更新任务状态为已完成
report_redis.setex( report_redis.setex(
task_key, task_key,
timedelta(hours=1), REDIS_CACHE_CONFIG["task_status_expiry"],
json.dumps({"status": "completed"}) json.dumps({"status": "completed"})
) )
else: else:
# 如果无数据,更新任务状态为已完成,但不保存报告
report_redis.setex( report_redis.setex(
task_key, task_key,
timedelta(hours=1), REDIS_CACHE_CONFIG["task_status_expiry"],
json.dumps({"status": "completed"}) json.dumps({"status": "completed"})
) )
@@ -176,10 +85,9 @@ def background_generate_report(date_no_hyphen: str):
except Exception as e: except Exception as e:
print(f"报告生成失败: {str(e)}") print(f"报告生成失败: {str(e)}")
# 更新Redis中的任务状态为失败
report_redis.setex( report_redis.setex(
task_key, task_key,
timedelta(hours=1), REDIS_CACHE_CONFIG["task_status_expiry"],
json.dumps({ json.dumps({
"status": "failed", "status": "failed",
"error": str(e) "error": str(e)
@@ -190,182 +98,10 @@ def background_generate_report(date_no_hyphen: str):
loop.close() loop.close()
print(f"=== 后台任务结束 {date_no_hyphen} ===\n") print(f"=== 后台任务结束 {date_no_hyphen} ===\n")
@app.get("/web/report/{date}") async def generate_daily_report(date: str, redis_connections) -> Dict:
async def get_report_by_date(date: str): """Generate the daily analysis report"""
"""
Get the analysis report for the given date
:param date: date in YYYY-MM-DD format
"""
try:
print(f"\n=== 开始处理日期: {date} ===")
# 验证日期格式并转换为无连字符格式
try:
parsed_date = datetime.strptime(date, "%Y-%m-%d")
date_no_hyphen = parsed_date.strftime("%Y%m%d")
print(f"转换后的日期格式: {date_no_hyphen}")
except ValueError:
raise HTTPException(
status_code=400,
detail="Invalid date format. Please use YYYY-MM-DD"
)
# 使用report数据库存储报告
report_redis = redis_connections["report"]
report_key = f"report_{date_no_hyphen}"
task_key = f"task_status_{date_no_hyphen}"
print(f"缓存键: {report_key}")
# 尝试获取缓存的报告
cached_report = report_redis.get(report_key)
print(f"是否有缓存: {'' if cached_report else ''}")
if cached_report:
print("返回缓存数据")
return {
"message": "success",
"data": json.loads(cached_report),
"source": "cache"
}
# 检查Redis中的任务状态
task_status_json = report_redis.get(task_key)
if task_status_json:
task_status = json.loads(task_status_json)
if task_status["status"] == "running":
return {
"message": "processing",
"detail": "报告正在生成中,请稍后再试"
}
elif task_status["status"] == "completed":
# 检查是否有报告数据
report_data = report_redis.get(report_key)
if report_data:
return {
"message": "success",
"data": json.loads(report_data),
"source": "new_generation"
}
else:
return {
"message": "no_data",
"detail": "暂无数据"
}
elif task_status["status"] == "failed":
return {
"message": "error",
"detail": f"报告生成失败: {task_status.get('error', '未知错误')}"
}
# 启动后台任务生成报告
print("启动后台任务生成报告...")
# 设置初始任务状态
report_redis.setex(
task_key,
timedelta(hours=1),
json.dumps({"status": "running"})
)
executor.submit(background_generate_report, date_no_hyphen)
return {
"message": "processing",
"detail": "报告正在生成中,请稍后再试"
}
except Exception as e:
print(f"Error processing report request: {str(e)}")
import traceback
print(f"详细错误信息: {traceback.format_exc()}")
raise HTTPException(
status_code=500,
detail=f"处理报告时发生错误: {str(e)}\n错误类型: {type(e)}"
)
# Function that generates the daily analysis report
async def generate_daily_report(date: str) -> Dict:
print(f"\n=== 开始生成日报 {date} ===") print(f"\n=== 开始生成日报 {date} ===")
# 定义异常行为列表
ABNORMAL_BEHAVIORS = [
'打架',
'斗殴',
'摔倒',
'晕倒',
'昏倒',
'跌倒',
'滑倒',
'',
'',
'受伤',
'暴力',
'攻击',
'威胁',
'破坏',
'偷窃',
'抢夺',
'游荡',
'徘徊',
'尾随',
'骚扰'
]
# 定义行为类别
BEHAVIOR_CATEGORIES = {
"基础动作": [
"", "站立", "站着",
"", "走路", "散步", "行走", "徒步",
"", "奔跑", "慢跑",
"", "坐下", "坐着",
"", "蹲下", "蹲着",
"", "转身", "转头", "回头", "旋转", "转向", "转弯",
"", "", "", ""
],
"日常生活": [
"", "食用", "吃饭", "吃零食", "吃东西", "用餐", "咀嚼", "",
"喝水", "喝牛奶", "喝茶", "饮用", "喝咖啡", "", "饮水",
"穿衣服", "穿裤子", "穿鞋", "戴帽子", "戴口罩", "戴围巾",
"", "", "睡觉", "休息", "打哈欠",
"洗澡", "刷牙", "洗手", "洗涤", "清洁", "擦洗",
"吃药", "喝药", "服药"
],
"社交活动": [
"说话", "交流", "演讲", "谈话", "聊天", "采访", "社交",
"打麻将", "打牌", "玩手机", "玩电脑", "玩游戏", "赌博",
"", "大笑", "微笑", "哭泣", "咯咯笑", "皱眉"
],
"工作学习": [
"读书", "阅读", "看书",
"写作", "写字", "",
"工作", "学习", "使用电脑", "使用笔记本电脑", "使用手机", "开会", "打字",
"画画", "绘画", "摄影", "素描"
],
"运动娱乐": [
"", "跳跃", "跳舞", "游泳", "运动", "健身", "锻炼"
],
"异常行为": [
'打架',
'斗殴',
'摔倒',
'晕倒',
'昏倒',
'跌倒',
'滑倒',
'',
'',
'受伤',
'暴力',
'攻击',
'威胁',
'破坏',
'偷窃',
'抢夺',
'游荡',
'徘徊',
'尾随',
'骚扰'
],
"其他": ["其他"]
}
# 初始化数据收集结构 # 初始化数据收集结构
data_collection = { data_collection = {
"date": date, "date": date,
@@ -376,34 +112,10 @@ async def generate_daily_report(date: str) -> Dict:
"behavior_distribution": {}, # 行为分布 "behavior_distribution": {}, # 行为分布
"hourly_stats": {}, # 每小时统计 "hourly_stats": {}, # 每小时统计
"category_stats": { # 各类别行为统计 "category_stats": { # 各类别行为统计
"基础动作": { category: {
"count": 0,
"behaviors": {} # 改为以行为为key的统计
},
"日常生活": {
"count": 0, "count": 0,
"behaviors": {} "behaviors": {}
}, } for category in BEHAVIOR_CATEGORIES.keys()
"社交活动": {
"count": 0,
"behaviors": {}
},
"工作学习": {
"count": 0,
"behaviors": {}
},
"运动娱乐": {
"count": 0,
"behaviors": {}
},
"异常行为": {
"count": 0,
"behaviors": {}
},
"其他": {
"count": 0,
"behaviors": {}
}
}, },
"abnormal_stats": { "abnormal_stats": {
"behaviors": [], "behaviors": [],
@@ -451,28 +163,21 @@ async def generate_daily_report(date: str) -> Dict:
if "video_analysis" in video_data: if "video_analysis" in video_data:
analysis = video_data["video_analysis"]["qwen-7B"]["extracted_info"] analysis = video_data["video_analysis"]["qwen-7B"]["extracted_info"]
# 处理行为数据
behaviors = analysis.get("actions", []) behaviors = analysis.get("actions", [])
camera_event_count += len(behaviors) camera_event_count += len(behaviors)
data_collection["total_events"] += len(behaviors) data_collection["total_events"] += len(behaviors)
# 处理环境数据
environment = analysis.get("environment", "") environment = analysis.get("environment", "")
if environment: if environment:
# 如果 environment 是列表,我们需要分别处理每个环境
if isinstance(environment, list): if isinstance(environment, list):
for env in environment: for env in environment:
if isinstance(env, str): # 确保是字符串 if isinstance(env, str):
data_collection["activity_areas"][env] = \ data_collection["activity_areas"][env] = \
data_collection["activity_areas"].get(env, 0) + 1 data_collection["activity_areas"].get(env, 0) + 1
else: elif isinstance(environment, str):
# 如果是字符串,直接处理 data_collection["activity_areas"][environment] = \
if isinstance(environment, str): data_collection["activity_areas"].get(environment, 0) + 1
data_collection["activity_areas"][environment] = \
data_collection["activity_areas"].get(environment, 0) + 1
# 更新每小时统计
if hour_str not in data_collection["hourly_stats"]: if hour_str not in data_collection["hourly_stats"]:
data_collection["hourly_stats"][hour_str] = { data_collection["hourly_stats"][hour_str] = {
"event_count": 0, "event_count": 0,
@@ -482,13 +187,10 @@ async def generate_daily_report(date: str) -> Dict:
data_collection["hourly_stats"][hour_str]["event_count"] += len(behaviors) data_collection["hourly_stats"][hour_str]["event_count"] += len(behaviors)
camera_hourly_counts[camera_id][hour_str] += len(behaviors) camera_hourly_counts[camera_id][hour_str] += len(behaviors)
# 处理每个行为
for behavior in behaviors: for behavior in behaviors:
# 更新行为分布
data_collection["behavior_distribution"][behavior] = \ data_collection["behavior_distribution"][behavior] = \
data_collection["behavior_distribution"].get(behavior, 0) + 1 data_collection["behavior_distribution"].get(behavior, 0) + 1
# 分类统计
behavior_categorized = False behavior_categorized = False
for category, keywords in BEHAVIOR_CATEGORIES.items(): for category, keywords in BEHAVIOR_CATEGORIES.items():
if any(keyword in behavior for keyword in keywords): if any(keyword in behavior for keyword in keywords):
@@ -497,10 +199,9 @@ async def generate_daily_report(date: str) -> Dict:
if behavior not in data_collection["category_stats"][category]["behaviors"]: if behavior not in data_collection["category_stats"][category]["behaviors"]:
data_collection["category_stats"][category]["behaviors"][behavior] = { data_collection["category_stats"][category]["behaviors"][behavior] = {
"count": 0, "count": 0,
"occurrences": {} # 改为使用字典,键为"camera_time"组合 "occurrences": {}
} }
# 使用camera_id和hour_str组合作为唯一键
occurrence_key = f"{camera_id}_{hour_str}" occurrence_key = f"{camera_id}_{hour_str}"
if occurrence_key not in data_collection["category_stats"][category]["behaviors"][behavior]["occurrences"]: if occurrence_key not in data_collection["category_stats"][category]["behaviors"][behavior]["occurrences"]:
data_collection["category_stats"][category]["behaviors"][behavior]["count"] += 1 data_collection["category_stats"][category]["behaviors"][behavior]["count"] += 1
@@ -531,7 +232,6 @@ async def generate_daily_report(date: str) -> Dict:
data_collection["hourly_stats"][hour_str]["categories"]["其他"] += 1 data_collection["hourly_stats"][hour_str]["categories"]["其他"] += 1
# 异常行为检测
if any(abnormal in behavior for abnormal in ABNORMAL_BEHAVIORS): if any(abnormal in behavior for abnormal in ABNORMAL_BEHAVIORS):
occurrence_key = f"{camera_id}_{hour_str}" occurrence_key = f"{camera_id}_{hour_str}"
abnormal_key = f"{behavior}_{occurrence_key}" abnormal_key = f"{behavior}_{occurrence_key}"
@@ -549,12 +249,6 @@ async def generate_daily_report(date: str) -> Dict:
if camera_event_count > 0: if camera_event_count > 0:
data_collection["camera_num"].add(camera_id) data_collection["camera_num"].add(camera_id)
print(f"\n=== 数据收集完成 ===")
print(f"has_any_data: {has_any_data}")
print(f"total_events: {data_collection['total_events']}")
print(f"活跃摄像头: {list(data_collection['camera_num'])}")
# 如果没有数据,提前返回
if len(data_collection["camera_num"]) == 0: if len(data_collection["camera_num"]) == 0:
print("判定为无数据,返回") print("判定为无数据,返回")
return { return {
@@ -563,12 +257,8 @@ async def generate_daily_report(date: str) -> Dict:
"detail": "暂无数据" "detail": "暂无数据"
} }
print("\n开始生成报告...")
# 将摄像头集合转换为列表
data_collection["camera_num"] = list(data_collection["camera_num"]) data_collection["camera_num"] = list(data_collection["camera_num"])
# 计算高峰时段
sorted_hours = sorted( sorted_hours = sorted(
data_collection["hourly_stats"].items(), data_collection["hourly_stats"].items(),
key=lambda x: x[1]["event_count"], key=lambda x: x[1]["event_count"],
@@ -576,7 +266,6 @@ async def generate_daily_report(date: str) -> Dict:
) )
data_collection["peak_hours"] = [hour for hour, _ in sorted_hours[:3]] data_collection["peak_hours"] = [hour for hour, _ in sorted_hours[:3]]
# 准备发送给AI分析的数据
preprocessed_data = { preprocessed_data = {
"日期": data_collection["date"], "日期": data_collection["date"],
"摄像头数量": len(data_collection["camera_num"]), "摄像头数量": len(data_collection["camera_num"]),
@@ -588,10 +277,9 @@ async def generate_daily_report(date: str) -> Dict:
"异常行为统计": data_collection["abnormal_stats"], "异常行为统计": data_collection["abnormal_stats"],
"每小时行为统计": data_collection["hourly_stats"] "每小时行为统计": data_collection["hourly_stats"]
} }
# 调用AI分析
ai_analysis = await analyze_experiment_data(preprocessed_data) ai_analysis = await analyze_experiment_data(preprocessed_data)
# 构造最终报告
final_report = { final_report = {
"整体活动趋势": ai_analysis["整体活动趋势"], "整体活动趋势": ai_analysis["整体活动趋势"],
"高峰时段分析": ai_analysis["高峰时段分析"], "高峰时段分析": ai_analysis["高峰时段分析"],
@@ -600,12 +288,10 @@ async def generate_daily_report(date: str) -> Dict:
"建议": ai_analysis["建议"] "建议": ai_analysis["建议"]
} }
# 添加每个摄像头的详细图表数据
final_report["hourly_distribution"] = [] final_report["hourly_distribution"] = []
for camera_id in camera_hourly_counts: for camera_id in camera_hourly_counts:
# 检查该摄像头是否有活动数据
total_events = sum(camera_hourly_counts[camera_id].values()) total_events = sum(camera_hourly_counts[camera_id].values())
if total_events > 0: # 只添加有活动的摄像头 if total_events > 0:
camera_data = { camera_data = {
"camera_id": camera_id, "camera_id": camera_id,
"data": [] "data": []
@@ -622,13 +308,8 @@ async def generate_daily_report(date: str) -> Dict:
return final_report return final_report
# SiliconFlow API Configuration
client = OpenAI(
base_url="https://api.deepseek.com/v1",
api_key="sk-3027fb3c810b4e17985fa397d41250b9"
)
# Function definition changed to async
async def analyze_experiment_data(report_info): async def analyze_experiment_data(report_info):
"""Analyze the experiment data with AI"""
system_prompt = """ system_prompt = """
You are an AI assistant tasked with analyzing data. You are an AI assistant tasked with analyzing data.
Generate a comprehensive analysis report in JSON format. Generate a comprehensive analysis report in JSON format.
@@ -702,7 +383,7 @@ async def analyze_experiment_data(report_info):
""" """
try: try:
response = client.chat.completions.create( response = ai_client.chat.completions.create(
model="deepseek-chat", model="deepseek-chat",
messages=[ messages=[
{"role": "system", "content": system_prompt}, {"role": "system", "content": system_prompt},
@@ -713,81 +394,10 @@ async def analyze_experiment_data(report_info):
response_format={'type': 'json_object'} response_format={'type': 'json_object'}
) )
# 解析AI响应 return json.loads(response.choices[0].message.content)
ai_response = json.loads(response.choices[0].message.content)
return ai_response
except Exception as e: except Exception as e:
raise HTTPException( raise HTTPException(
status_code=500, status_code=500,
detail="AI分析服务暂时不可用,请稍后重试" detail="AI分析服务暂时不可用,请稍后重试"
) )
@app.get("/web/report/download/{date}")
async def download_report(date: str):
"""
下载指定日期的分析报告
:param date: 日期格式为YYYY-MM-DD
"""
try:
# 验证日期格式并转换为无连字符格式
try:
parsed_date = datetime.strptime(date, "%Y-%m-%d")
date_no_hyphen = parsed_date.strftime("%Y%m%d")
except ValueError:
error_msg = "日期格式必须为YYYY-MM-DD(例如:2024-12-31"
print(f"日期格式错误: {error_msg}")
raise HTTPException(
status_code=400,
detail=error_msg
)
report_key = f"report_{date_no_hyphen}"
# 使用report数据库
report_redis = redis_connections["report"]
# 获取报告数据
report_data = report_redis.get(report_key)
if not report_data:
error_msg = f"未找到 {date} 的报告数据,请先生成报告"
print(error_msg)
raise HTTPException(
status_code=404,
detail=error_msg
)
# 解析数据并移除 hourly_distribution 字段
try:
data = json.loads(report_data)
if "hourly_distribution" in data:
del data["hourly_distribution"]
print("成功获取报告数据")
return {
"message": "success",
"data": data
}
except json.JSONDecodeError as je:
error_msg = f"报告数据格式错误: {str(je)}"
print(error_msg)
raise HTTPException(
status_code=500,
detail=error_msg
)
except HTTPException as he:
raise he
except Exception as e:
error_msg = f"处理请求时发生错误: {str(e)}"
print(error_msg)
raise HTTPException(
status_code=500,
detail=error_msg
)
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=6005)
+34
@@ -0,0 +1,34 @@
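# Configuration file (config.py)
## Description
This file holds the configuration the web application needs, including the Redis database connection, behavior category definitions, and AI model configuration.
## Main configuration items
### Redis configuration
- `REDIS_CONFIG`: Redis server connection settings
- `CAMERA_DB_MAPPING`: Mapping from cameras to Redis databases
- `create_redis_connections()`: Function that creates one Redis client per mapped database
### AI model configuration
- `SILICON_FLOW_CONFIG`: API configuration (the name dates from SiliconFlow; per the changelog, the API was switched to DeepSeek)
- `ai_client`: OpenAI client instance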
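One way to keep the API key out of the source file is to read it from the environment. This is only a sketch of that idea, not how `config.py` currently works: the function name `load_ai_config` and the variable names `DEEPSEEK_API_KEY` and `DEEPSEEK_BASE_URL` are hypothetical.

```python
import os
from typing import Mapping, Optional


def load_ai_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return base_url/api_key for an OpenAI-compatible client.

    DEEPSEEK_API_KEY and DEEPSEEK_BASE_URL are hypothetical variable names
    chosen for this sketch; the project itself does not define them.
    """
    env = os.environ if env is None else env
    api_key = env.get("DEEPSEEK_API_KEY")
    if not api_key:
        # Fail early rather than creating a client that cannot authenticate.
        raise RuntimeError("DEEPSEEK_API_KEY is not set")
    return {
        "base_url": env.get("DEEPSEEK_BASE_URL", "https://api.deepseek.com/v1"),
        "api_key": api_key,
    }
```

The returned dictionary has the same two keys as `SILICON_FLOW_CONFIG`, so it could be passed to the client constructor in the same way.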
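### Behavior category configuration
- `BEHAVIOR_CATEGORIES`: Behavior category definitions, covering:
- Basic actions (基础动作)
- Daily life (日常生活)
- Social activities (社交活动)
- Work and study (工作学习)
- Sports and entertainment (运动娱乐)
- Abnormal behavior (异常行为)
- Other (其他)
### Cache configuration
- `REDIS_CACHE_CONFIG`: Redis cache settings
- report_expiry: how long a report stays cached (30 days)
- task_status_expiry: how long a task status stays cached (1 hour)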
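To make the expiry values concrete, the sketch below converts them to seconds and builds the key, TTL, and JSON value that a task status entry would carry. The two `timedelta` values and the `task_status_` key prefix come from the code in this commit; the helper names `ttl_seconds` and `task_status_entry` are hypothetical.

```python
import json
from datetime import timedelta
from typing import Optional, Tuple

REDIS_CACHE_CONFIG = {
    "report_expiry": timedelta(days=30),
    "task_status_expiry": timedelta(hours=1),
}


def ttl_seconds(name: str) -> int:
    """Return the configured expiry as a whole number of seconds."""
    return int(REDIS_CACHE_CONFIG[name].total_seconds())


def task_status_entry(date_no_hyphen: str, status: str,
                      error: Optional[str] = None) -> Tuple[str, int, str]:
    """Build (key, ttl, value) for a task status record; nothing is written to Redis here."""
    payload = {"status": status}
    if error is not None:
        payload["error"] = error
    return f"task_status_{date_no_hyphen}", ttl_seconds("task_status_expiry"), json.dumps(payload)
```

redis-py's `setex` accepts either an integer number of seconds or a `timedelta`, so the project can pass the `timedelta` objects directly.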
## Dependencies
- Redis
- OpenAI
+42
@@ -0,0 +1,42 @@
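# Web application entry point (main.py)
## Description
The main program of the FastAPI web application. It exposes API endpoints for querying camera data and generating analysis reports.
## Main components
### FastAPI application instance
- CORS middleware configuration
- Thread pool executor configuration
- Redis connection initialization
### API endpoints
#### Face data endpoint
- `GET /web/face/{camera_id}/data`: Get face data for a camera
- Parameters: camera_id, date (optional, `YYYYMMDD`; defaults to today)
- Returns: face recognition data for the given date
#### Behavior data endpoint
- `GET /web/{camera_id}/data`: Get behavior data for a camera
- Parameters: camera_id, date (optional, `YYYYMMDD`; defaults to today)
- Returns: behavior analysis data for the given date
#### Report endpoints
- `GET /web/report/{date}`: Get the analysis report
- Parameters: date (`YYYY-MM-DD`)
- Returns: the analysis report for the given date, or a `processing` status while it is being generated
- `GET /web/report/download/{date}`: Download the analysis report
- Parameters: date (`YYYY-MM-DD`)
- Returns: the report data for download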
## 工作流程
1. 接收API请求
2. 验证请求参数
3. 从Redis获取数据或生成新报告
4. 返回JSON格式的响应
## 依赖
- FastAPI
- Redis
- ThreadPoolExecutor
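## Example: validating the date and building the report key
The "validate the request parameters" step for the report endpoints can be sketched as follows. The key pattern `report_<YYYYMMDD>` follows the `report_key` construction in `main.py`; the helper name `report_key_for` is hypothetical, and the sketch raises a plain `ValueError` where the application raises an `HTTPException` with status 400.

```python
from datetime import datetime


def report_key_for(date: str) -> str:
    """Validate a YYYY-MM-DD date string and build the Redis report key."""
    try:
        parsed = datetime.strptime(date, "%Y-%m-%d")
    except ValueError as exc:
        raise ValueError(
            "date must be in YYYY-MM-DD format, e.g. 2024-12-31"
        ) from exc
    # The report database stores one entry per day, keyed without hyphens.
    return f"report_{parsed.strftime('%Y%m%d')}"
```

Keeping the parsing in one helper means the report and download endpoints reject malformed dates in the same way before touching Redis.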
+38
@@ -0,0 +1,38 @@
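# Data Models and Business Logic (models.py)
## Description
Contains the core business logic of the web application, mainly data retrieval and report generation.
## Main Functions
### Data Retrieval
- `get_camera_data_by_date()`: get the data for a given camera and date
  - Supports both face data and behavior data
  - Handles Redis key matching and data parsing
### Report Generation
- `background_generate_report()`: generate the analysis report in the background
  - Processes report generation tasks asynchronously
  - Manages task status and caching
- `generate_daily_report()`: generate the daily analysis report
  - Collects and processes camera data
  - Counts behaviors and abnormal events
  - Produces a structured report
### AI Analysis
- `analyze_experiment_data()`: analyze experiment data with AI
  - Calls the AI model to analyze the data
  - Produces an analysis report and recommendations
## Data Structures
- Behavior statistics
- Time distribution
- Abnormal event records
- Category statistics
- Activity area statistics
## Dependencies
- Redis
- OpenAI
- FastAPI
- asyncio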
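## Example: background report generation with task status
The background report flow described above might look roughly like this. It is a sketch under stated assumptions: the function bodies are placeholders rather than the real implementations, the status strings are invented, and an in-memory dict stands in for the Redis task-status cache.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# In-memory stand-in for the Redis task-status cache (an assumption for this sketch).
task_status: dict[str, str] = {}
executor = ThreadPoolExecutor(max_workers=2)


def generate_daily_report(date: str) -> dict:
    """Placeholder for the blocking report builder; returns a dummy structure."""
    return {"date": date, "behavior_stats": {}, "abnormal_events": []}


async def background_generate_report(date: str) -> dict:
    """Run the blocking builder in a worker thread and track its status."""
    task_status[date] = "processing"
    loop = asyncio.get_running_loop()
    try:
        report = await loop.run_in_executor(executor, generate_daily_report, date)
    except Exception:
        task_status[date] = "failed"
        raise
    task_status[date] = "completed"
    return report
```

Running the builder through `run_in_executor` keeps the event loop free, so other routes stay responsive while a report is being produced.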
+33
@@ -0,0 +1,33 @@
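# Configuration File (config.py)
## Description
This file holds the configuration the system needs, including the Redis databases, file paths, RTSP streams, and SFTP.
## Main Configuration Items
### Redis Configuration
- `REDIS_CONFIG`: connection details for the Redis server
- `REDIS_DB`: mapping from each function to its Redis database
- `REDIS_CLIENTS`: Redis client instances for the cameras
- `REDIS_IDENTITY`: client for the identity database
### File Path Configuration
- `PATH_CONFIG`: file paths used by the system
  - base_dir: base directory
  - recordings: recorded video directory
  - images: image directory
  - crop: cropped image directory
  - data: data directory
### RTSP Stream Configuration
- `RTSP_CONFIG`: camera RTSP stream configuration
  - A01: configuration for camera A01
  - B02: configuration for camera B02
### SFTP Configuration
- `SFTP_CONFIG`: SFTP server connection configuration
### Model Path Configuration
- `MODEL_CONFIG`: AI model paths
  - qwen_path: path to the Qwen model
  - yolo_pose_path: path to the YOLO pose detection model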
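## Example: building an RTSP URL with encoded credentials
RTSP URLs embed the user name and password before the host, so a password containing characters such as `@` can make the URL ambiguous. The sketch below shows one way to assemble such a URL with percent-encoded credentials. The helper name `build_rtsp_url` and the sample values are hypothetical, and whether a given camera or decoder accepts percent-encoded credentials is an assumption that should be checked against the actual device.

```python
from urllib.parse import quote


def build_rtsp_url(user: str, password: str, host: str, port: int, path: str) -> str:
    """Assemble an RTSP URL, percent-encoding the credentials."""
    # safe="" makes quote() encode reserved characters such as '@', ':' and '/'.
    userinfo = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"rtsp://{userinfo}@{host}:{port}/{path.lstrip('/')}"
```

For example, `build_rtsp_url("admin", "p@ss", "192.0.2.10", 554, "h264/ch1/main/av_stream")` encodes the `@` in the password as `%40`.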
+31
@@ -0,0 +1,31 @@
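# Face Feature Extraction (face-emb.py)
## Description
This module extracts face feature vectors from images and saves them to the Redis database for later face recognition.
## Main Components
### FaceFeatureExtractor Class
The main class that handles face feature extraction.
#### Main Methods
- `__init__()`: initialize the feature extractor and the Redis connection
- `get_feature()`: extract the face feature vector from a single image
- `process_dataset()`: process the whole dataset and save the features to Redis
## Workflow
1. Initialize the feature extractor and the Redis connection
2. Scan the dataset directory
3. Run face detection and feature extraction on each image
4. Save the extracted feature vectors to the Redis database
## Usage
1. Make sure the dataset directory is laid out correctly (each person's photos go in a subdirectory named after that person)
2. Run the script to start processing the dataset
3. The feature vectors are stored in Redis as JSON, keyed by the person's name
## Dependencies
- DeepFace
- Redis
- PIL
- tqdm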
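## Example: scanning the dataset directory
The "scan the dataset directory" step, with one subdirectory per person, can be sketched as below. The helper name `collect_dataset` and the set of accepted file extensions are assumptions; the real script may filter files differently.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of accepted extensions


def collect_dataset(dataset_dir: str) -> dict[str, list[Path]]:
    """Map each person's name (subdirectory name) to that person's image files."""
    people: dict[str, list[Path]] = {}
    for person_dir in sorted(Path(dataset_dir).iterdir()):
        if not person_dir.is_dir():
            continue  # ignore stray files at the top level
        images = sorted(
            p for p in person_dir.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
        if images:
            people[person_dir.name] = images
    return people
```

Each key of the returned mapping is a candidate Redis key, and each image list is what a feature extractor would iterate over for that person.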
+39
@@ -0,0 +1,39 @@
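# Face Analysis System (face.py)
## Description
This module implements a real-time face analysis system that processes faces in surveillance images for identity recognition and feature analysis.
## Main Components
### FaceAnalysisSystem Class
The core class that handles face analysis.
#### Main Methods
- `get_face_embedding()`: get the face feature vector
- `find_identity()`: look up the matching identity in the identity database
- `process_new_image()`: process a new image
### ImageMonitor Class
The class that monitors the image directories.
#### Main Methods
- `monitor_directories()`: watch the directories for changes
- `process_new_image()`: process a new image
- `_get_redis_key()`: build the Redis key
## Workflow
1. Watch the configured directories for new images
2. Run face detection and feature extraction on each new image
3. Look up the matching identity in the identity database
4. Save the analysis results to the Redis database
## Usage
1. Make sure the paths and the Redis settings in the configuration file are correct
2. Run the script to start monitoring the image directories
3. The analysis results are stored in Redis as JSON
## Dependencies
- DeepFace
- Redis
- OpenCV
- NumPy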
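## Example: matching an embedding against known identities
The identity lookup step could be implemented as a nearest-neighbor search over stored embeddings. The sketch below uses cosine similarity with a fixed threshold; both the metric and the threshold value of 0.6 are assumptions, since this document does not say how `find_identity()` compares embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_identity(embedding, identities, threshold=0.6):
    """Return the best-matching name, or None if no score reaches the threshold."""
    best_name, best_score = None, threshold
    for name, reference in identities.items():
        score = cosine_similarity(embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

Returning `None` for weak matches lets the caller record an unknown person instead of forcing a wrong label.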
+25
@@ -0,0 +1,25 @@
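# Information Configuration File (info.json)
## Description
This configuration file contains the predefined lists the system uses for behavior analysis and scene recognition.
## Main Configuration Items
### actions
The list of human behaviors the system can recognize, for example:
- Basic actions (walking, running, jumping, etc.)
- Daily activities (eating, drinking water, etc.)
- Social behaviors (talking, shaking hands, etc.)
- Work behaviors (writing, making phone calls, etc.)
### environments
The list of environments and scenes the system can recognize, for example:
- Indoor places (office, classroom, etc.)
- Outdoor places (park, street, etc.)
- Public places (shopping mall, hospital, etc.)
- Specific scenes (laboratory, meeting room, etc.)
## Usage
- This configuration file is used by qwen.py and other modules
- It helps the AI model recognize behaviors and scenes
- The lists can be extended or modified as needed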
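## Example: loading and checking info.json
A loader for this file might validate the two documented keys before the lists are used. This is a sketch only: the helper name `load_info` and the validation rules are assumptions, not the behavior of the project's `load_config()`.

```python
import json

REQUIRED_KEYS = ("actions", "environments")


def load_info(path):
    """Load info.json and check that both required keys hold lists."""
    with open(path, "r", encoding="utf-8") as f:
        config = json.load(f)
    if not isinstance(config, dict):
        raise ValueError("info.json must contain a JSON object")
    for key in REQUIRED_KEYS:
        if not isinstance(config.get(key), list):
            raise ValueError(f"info.json is missing a list under '{key}'")
    return config
```

Failing early on a malformed file avoids a harder-to-diagnose `KeyError` later, when the analysis code looks up `actions` or `environments`.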
+31
@@ -0,0 +1,31 @@
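# Pose Detection Module (pose.py)
## Description
This module uses a YOLO model for human pose detection, then crops and saves each detected person region.
## Main Components
### PoseMonitor Class
The main class that handles pose detection and image cropping.
#### Main Methods
- `__init__()`: initialize the YOLO model and the path configuration
- `_process_image()`: process a single image, detect people, and save the crops
- `monitor_directories()`: watch the directories for changes
## Workflow
1. Watch the configured directories for new images
2. Detect people in each image with the YOLO model
3. Crop each detected person region
4. Save the cropped images to the configured directory
## Usage
1. Make sure the YOLO model path is configured correctly
2. Run the script to start monitoring the image directories
3. The cropped images are saved under the crop directory
## Dependencies
- YOLO
- PIL
- OpenCV
- Ultralytics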
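## Example: clamping a detection box before cropping
Detector boxes can extend slightly past the image border, so the cropping step usually clamps them first. The helper below is a sketch of that idea; the name `clamp_box` and the `(x1, y1, x2, y2)` box layout are assumptions, not taken from `pose.py`.

```python
def clamp_box(box, width, height):
    """Clamp an (x1, y1, x2, y2) box to the image; return None if nothing is left."""
    x1, y1, x2, y2 = box
    left = max(0, int(x1))
    top = max(0, int(y1))
    right = min(width, int(x2))
    bottom = min(height, int(y2))
    if right <= left or bottom <= top:
        return None  # the box lies entirely outside the image
    return (left, top, right, bottom)
```

The returned tuple has the `(left, upper, right, lower)` layout that PIL's `Image.crop` expects, so a non-`None` result can be passed to it directly.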
+39
@@ -0,0 +1,39 @@
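# Qwen Visual Analysis Module (qwen.py)
## Description
This module uses the Qwen2-VL-7B model for multimodal analysis of videos and images, including scene understanding, behavior recognition, and anomaly detection.
## Main Components
### MediaAnalysisSystem Class
The core class that handles media analysis.
#### Main Methods
- `process_with_qwen()`: process media with the Qwen model
- `extract_info()`: extract structured information from the model output
- `_get_analysis_prompt()`: build the analysis prompt
### VideoMonitor Class
The class that monitors video files.
#### Main Methods
- `monitor_directories()`: watch the directories for changes
- `process_new_video()`: process a new video
- `_get_redis_key()`: build the Redis key
## Workflow
1. Watch the configured directories for new videos
2. Analyze the video content with the Qwen model
3. Extract structured information
4. Save the analysis results to the Redis database
## Usage
1. Make sure the Qwen model path is configured correctly
2. Run the script to start monitoring the video directories
3. The analysis results are stored in Redis as JSON
## Dependencies
- Transformers
- Torch
- Redis
- Decord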
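## Example: extracting known actions from model output
One simple way to turn free-text model output into structured data is to look for entries of the predefined action list inside the text. The sketch below does plain substring matching; the helper name `extract_actions` and the matching strategy are assumptions, and the real `extract_info()` may parse the output differently.

```python
def extract_actions(text, known_actions):
    """Return the known actions that occur in text, ordered by first appearance."""
    found = [
        (text.find(action), action)
        for action in known_actions
        if action and action in text
    ]
    return [action for _, action in sorted(found)]
```

Substring matching is cheap, but it also reports every shorter action that is contained in a longer one, so the result may need de-duplication before it is stored.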
+37
@@ -0,0 +1,37 @@
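# RTSP Stream Recording Module (rtsp2video.py)
## Description
This module records video clips from RTSP streams and uploads them to a remote server over SFTP.
## Main Components
### SFTPClient Class
The class that handles SFTP file uploads.
#### Main Methods
- `connect()`: connect to the SFTP server
- `upload_file()`: upload a file to the server
### Main Functions
- `record_rtsp_stream()`: record an RTSP stream as video clips
## Workflow
1. Connect to the RTSP stream
2. Record a video clip at a fixed interval
3. Save the recorded clip locally
4. Periodically upload the clips to the server over SFTP
## Usage
1. Make sure the RTSP stream and SFTP settings are correct
2. Run the script to start recording
3. Clips are named by timestamp and saved
## Configuration Items
- Recording duration: 10 seconds
- Recording interval: 120 seconds
- Upload interval: 600 seconds
## Dependencies
- OpenCV
- Paramiko
- Threading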
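## Example: recording schedule
The timing settings above imply a repeating schedule of short recording windows. The sketch below computes those windows for a camera with a given start delay. It assumes the 120-second interval is measured from the start of one clip to the start of the next, which this document does not state, and the helper name `recording_windows` is hypothetical.

```python
RECORD_SECONDS = 10     # length of each clip
RECORD_INTERVAL = 120   # assumed start-to-start spacing between clips


def recording_windows(start_delay, count):
    """Return (start, end) offsets in seconds for the first `count` clips."""
    windows = []
    for i in range(count):
        start = start_delay + i * RECORD_INTERVAL
        windows.append((start, start + RECORD_SECONDS))
    return windows
```

Giving two cameras different start delays, for example 0 and 60 seconds, staggers their clips so the recordings do not overlap.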
+29
@@ -0,0 +1,29 @@
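# Video Frame Extraction Module (video2image.py)
## Description
This module extracts the first frame of each video file and saves it as an image.
## Main Components
### VideoMonitor Class
The main class that handles video frame extraction.
#### Main Methods
- `__init__()`: initialize the path configuration
- `_save_first_frame()`: extract and save the first frame of a video
- `monitor_directories()`: watch the directories for changes
## Workflow
1. Watch the configured directories for new video files
2. Extract the first frame of each new video
3. Save the extracted frame in JPG format
4. Keep the directory structure identical to that of the video directory
## Usage
1. Make sure the paths are configured correctly
2. Run the script to start monitoring the video directories
3. The extracted frames are saved under the images directory
## Dependencies
- PIL
- Decord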
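## Example: mirroring the video path under the images directory
Keeping the image tree identical to the video tree comes down to a path mapping like the one below. The helper name `image_path_for` and the sample directory layout are assumptions made for illustration.

```python
from pathlib import Path


def image_path_for(video_path, recordings_root, images_root):
    """Map a video under recordings_root to the matching .jpg under images_root."""
    relative = Path(video_path).relative_to(recordings_root)
    return Path(images_root) / relative.with_suffix(".jpg")
```

`relative_to` raises `ValueError` when the video is not inside the recordings directory, which guards against writing frames to unexpected locations.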
+74
@@ -0,0 +1,74 @@
import redis
# Redis configuration
REDIS_CONFIG = {
    'host': "222.186.10.253",
    'port': 6379,
    'password': "Obscura@2024"
}
# Redis database mapping
REDIS_DB = {
    'camera_A01': 210,  # Database for camera A01
    'camera_B02': 211,  # Database for camera B02
    'identity': 212,    # Identity database
}
# Initialize the Redis clients
REDIS_CLIENTS = {
    'A01': redis.Redis(
        host=REDIS_CONFIG['host'],
        port=REDIS_CONFIG['port'],
        password=REDIS_CONFIG['password'],
        db=REDIS_DB['camera_A01']
    ),
    'B02': redis.Redis(
        host=REDIS_CONFIG['host'],
        port=REDIS_CONFIG['port'],
        password=REDIS_CONFIG['password'],
        db=REDIS_DB['camera_B02']
    )
}
# Identity database client
REDIS_IDENTITY = redis.Redis(
    host=REDIS_CONFIG['host'],
    port=REDIS_CONFIG['port'],
    password=REDIS_CONFIG['password'],
    db=REDIS_DB['identity']
)
# File path configuration
PATH_CONFIG = {
    'base_dir': "files",
    'recordings': "files/recordings",  # Recorded video directory
    'images': "files/images",          # Image directory
    'crop': "files/crop",              # Cropped image directory
    'data': "files/data"               # Data directory
}
# RTSP stream configuration
RTSP_CONFIG = {
    'A01': {
        'url': "rtsp://admin:Obscura@2024@192.168.31.196:554//h264/ch1/main/av_stream",
        'name': "A01"
    },
    'B02': {
        'url': "rtsp://admin:AWOLDS@192.168.31.181:554//h264/ch1/main/av_stream",
        'name': "B02"
    }
}
# SFTP configuration
SFTP_CONFIG = {
    'hostname': "222.186.10.253",
    'username': "zydi",
    'password': "Obscura@2024",
    'remote_base_path': "files/recordings"
}
# Model path configuration
MODEL_CONFIG = {
    'qwen_path': "/obscura/models/qwen/Qwen2-VL-7B-Instruct",
    'yolo_pose_path': '/home/zydi/models/yolo11n-pose.pt'
}
+3 -7
@@ -5,18 +5,14 @@ from tqdm import tqdm
 import json
 from deepface import DeepFace
 import redis
+from config import REDIS_IDENTITY, PATH_CONFIG
 class FaceFeatureExtractor:
     def __init__(self):
         """
         Initialize the feature extractor and the Redis connection
         """
-        self.redis_client = redis.Redis(
-            host="222.186.10.253",
-            port=6379,
-            password="Obscura@2024",
-            db=212
-        )
+        self.redis_client = REDIS_IDENTITY
     def get_feature(self, img_path):
         """
@@ -105,7 +101,7 @@ class FaceFeatureExtractor:
 def main():
     # Set the path
-    DATASET_DIR = "/home/zydi/VLM/data"
+    DATASET_DIR = PATH_CONFIG['data']
     # Create the feature extractor
     extractor = FaceFeatureExtractor()
+4 -21
@@ -10,31 +10,14 @@ from deepface import DeepFace
 import numpy as np
 import gc
 import re
+from config import REDIS_CLIENTS, REDIS_IDENTITY, PATH_CONFIG
 class FaceAnalysisSystem:
     def __init__(self):
         # Redis configuration
-        self.redis_clients = {
-            'A01': redis.Redis(
-                host="222.186.10.253",
-                port=6379,
-                password="Obscura@2024",
-                db=210
-            ),
-            'B02': redis.Redis(
-                host="222.186.10.253",
-                port=6379,
-                password="Obscura@2024",
-                db=211
-            )
-        }
+        self.redis_clients = REDIS_CLIENTS
         # Identity database
-        self.identity_db = redis.Redis(
-            host="222.186.10.253",
-            port=6379,
-            password="Obscura@2024",
-            db=212
-        )
+        self.identity_db = REDIS_IDENTITY
     def get_face_embedding(self, img_path):
         """Get the face embedding"""
@@ -359,7 +342,7 @@ class ImageMonitor:
 def main():
     try:
-        images_path = "crop"  # Path to the crop directory
+        images_path = "files/crop"  # Path to the crop directory
         monitor = ImageMonitor(images_path)
         monitor.monitor_directories()
+4 -3
@@ -3,12 +3,13 @@ import time
 from PIL import Image
 import torch
 from ultralytics import YOLO
+from config import PATH_CONFIG, MODEL_CONFIG
 class PoseMonitor:
     def __init__(self, images_path):
         self.images_path = images_path
-        self.crop_path = "crop"
-        self.model = YOLO('/home/zydi/models/yolo11n-pose.pt')  # Load the YOLOv8-pose model
+        self.crop_path = PATH_CONFIG['crop']
+        self.model = YOLO(MODEL_CONFIG['yolo_pose_path'])
         # Make sure the crop directory exists
         if not os.path.exists(self.crop_path):
@@ -100,7 +101,7 @@ class PoseMonitor:
 def main():
     try:
-        images_path = "images"
+        images_path = PATH_CONFIG['images']
         monitor = PoseMonitor(images_path)
         monitor.monitor_directories()
+13 -19
@@ -11,9 +11,10 @@ from qwen_vl_utils import process_vision_info
 import redis
 import time
 import gc
+from config import REDIS_CLIENTS, PATH_CONFIG, MODEL_CONFIG
 # Configuration
-QWEN_MODEL_PATH = "/obscura/models/qwen/Qwen2-VL-7B-Instruct"
+QWEN_MODEL_PATH = MODEL_CONFIG['qwen_path']
 # Initialize the Qwen model (on cuda:0)
 print("Initializing the Qwen model (cuda:0)...")
@@ -35,7 +36,11 @@ processor = AutoProcessor.from_pretrained(
 def load_config():
     """Load the configuration file"""
     try:
-        with open('info.json', 'r', encoding='utf-8') as f:
+        # Use os.path to get the absolute path of info.json
+        current_dir = os.path.dirname(os.path.abspath(__file__))
+        config_path = os.path.join(current_dir, 'info.json')
+        with open(config_path, 'r', encoding='utf-8') as f:
             config = json.load(f)
         return config
     except Exception as e:
@@ -51,6 +56,8 @@ class MediaAnalysisSystem:
         self.device = "cuda:0"
         self.qwen_model = model
         self.qwen_processor = processor
+        # Redis configuration
+        self.redis_clients = REDIS_CLIENTS
         # Use the loaded configuration
         self.environments = CONFIG["environments"]
         self.actions = CONFIG["actions"]
@@ -290,20 +297,8 @@ class VideoMonitor:
     def __init__(self, recordings_path, system):
         self.recordings_path = recordings_path
         self.system = system
-        self.redis_clients = {
-            'A01': redis.Redis(
-                host="222.186.10.253",
-                port=6379,
-                password="Obscura@2024",
-                db=210
-            ),
-            'B02': redis.Redis(
-                host="222.186.10.253",
-                port=6379,
-                password="Obscura@2024",
-                db=211
-            )
-        }
+        # Use the redis_clients from system
+        self.redis_clients = system.redis_clients
         # New: load the record of already-processed videos at initialization
         self.processed_videos = self._load_processed_videos()
         # New: record of abnormal videos
@@ -626,7 +621,7 @@ class VideoMonitor:
 def main():
     try:
         system = MediaAnalysisSystem()
-        recordings_path = "recordings"  # Path to the recordings directory
+        recordings_path = PATH_CONFIG['recordings']  # Use the path from the configuration
         # Create and start the monitor
         monitor = VideoMonitor(recordings_path, system)
@@ -634,11 +629,10 @@ def main():
     except Exception as e:
         print(f"\nUnexpected error: {str(e)}")
-        # Added: save the error log on abnormal termination
         if 'monitor' in locals():
             monitor._save_error_log()
             print("Error log saved")
-        raise  # Re-raise the exception to preserve the original traceback
+        raise
 if __name__ == "__main__":
     main()
@@ -3,13 +3,14 @@ import time
 from datetime import datetime
 import os
 import paramiko
+from config import SFTP_CONFIG, RTSP_CONFIG
 class SFTPClient:
     def __init__(self):
-        self.hostname = "222.186.10.253"
-        self.username = "zydi"
-        self.password = "Obscura@2024"
-        self.remote_base_path = "/home/zydi/WEB/recordings"
+        self.hostname = SFTP_CONFIG['hostname']
+        self.username = SFTP_CONFIG['username']
+        self.password = SFTP_CONFIG['password']
+        self.remote_base_path = SFTP_CONFIG['remote_base_path']
         self.sftp = None
         self.transport = None
         self.connect()
@@ -126,14 +127,14 @@ def record_rtsp_stream(rtsp_url, camera_name, start_delay=0):
         time.sleep(5)
 if __name__ == "__main__":
-    rtsp_url1 = "rtsp://admin:Obscura@2024@192.168.31.196:554//h264/ch1/main/av_stream"
-    rtsp_url2 = "rtsp://admin:AWOLDS@192.168.31.181:554//h264/ch1/main/av_stream"
+    rtsp_url1 = RTSP_CONFIG['A01']['url']
+    rtsp_url2 = RTSP_CONFIG['B02']['url']
     from threading import Thread
     # Create two threads with fixed camera names
-    t1 = Thread(target=record_rtsp_stream, args=(rtsp_url1, "A01", 0))
-    t2 = Thread(target=record_rtsp_stream, args=(rtsp_url2, "B02", 60))
+    t1 = Thread(target=record_rtsp_stream, args=(rtsp_url1, RTSP_CONFIG['A01']['name'], 0))
+    t2 = Thread(target=record_rtsp_stream, args=(rtsp_url2, RTSP_CONFIG['B02']['name'], 60))
     t1.start()
     t2.start()
@@ -1,11 +1,12 @@
 import os
 from PIL import Image
 from decord import VideoReader
+from config import PATH_CONFIG
 class VideoMonitor:
     def __init__(self, recordings_path):
         self.recordings_path = recordings_path
-        self.images_path = "/home/zydi/WEB/images"
+        self.images_path = PATH_CONFIG['images']
         # Make sure the images directory exists
         if not os.path.exists(self.images_path):
@@ -79,7 +80,7 @@ class VideoMonitor:
 def main():
     try:
-        recordings_path = "recordings"
+        recordings_path = PATH_CONFIG['recordings']
         monitor = VideoMonitor(recordings_path)
         monitor.monitor_directories()
-106
@@ -1,106 +0,0 @@
<!DOCTYPE html>
<html lang="zh-CN">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>动作颜色可视化 - 按类别分组</title>
<style>
body {
font-family: sans-serif;
display: flex;
flex-wrap: wrap;
justify-content: center;
}
.category {
margin: 20px;
border: 1px solid #ddd;
padding: 10px;
border-radius: 5px;
box-shadow: 2px 2px 5px rgba(0,0,0,0.1);
}
.category-title {
font-weight: bold;
margin-bottom: 10px;
text-align: center;
}
.color-box {
width: 120px;
height: 80px;
margin: 5px;
display: flex;
flex-direction: column;
justify-content: center;
align-items: center;
border-radius: 5px;
font-size: 12px;
color: white;
overflow: hidden; /* Added to prevent text overflow */
word-break: break-all; /* Added to break long words */
}
@media (max-width: 768px) {
.color-box {
width: 100px;
font-size: 10px;
}
}
</style>
</head>
<body>
<script>
const actionColors = {
"站立类": "#2962FF", "行走类": "#3F51B5", "奔跑类": "#138FFF", "坐卧类": "#82B1FF", "蹲类": "#42A5F5", "转动类": "#90CAF9", "感知类": "#4DD0E1",
"饮食类": "#009688", "喝水": "#CDDC39", "穿戴类": "#8BC34A", "休息类": "#81C784", "清洁类": "#A5D6A7", "医疗类": "#C8E6C9",
"交际类": "#FFA500", "娱乐类": "#F57C00", "情感表达": "#FFD145",
"阅读类": "#EE7AE9", "写作类": "#F635D1", "工作类": "#FF7CB4", "创作类": "#FFB6C1",
"运动类": "#9966FF",
"异常行为": "#EE0000"
};
const categories = {
"基础动作": ["站立类", "行走类", "奔跑类", "坐卧类", "蹲类", "转动类", "感知类"],
"日常生活": ["饮食类", "喝水", "穿戴类", "休息类", "清洁类", "医疗类"],
"社交活动": ["交际类", "娱乐类", "情感表达"],
"工作学习": ["阅读类", "写作类", "工作类", "创作类"],
"运动娱乐": ["运动类"],
"异常行为": ["异常行为"]
};
const fallbackColors = ['#F6DE00', '#8470FF', '#00BCD4', '#F4A460', '#607D8B', '#4CAF50'];
// Add random category
const randomCategoryName = "随机动作";
categories[randomCategoryName] = [];
for (let i = 0; i < fallbackColors.length; i++) {
categories[randomCategoryName].push(`随机动作${i+1}`);
}
for (const category in categories) {
const categoryDiv = document.createElement('div');
categoryDiv.className = 'category';
const titleDiv = document.createElement('div');
titleDiv.className = 'category-title';
titleDiv.textContent = category;
categoryDiv.appendChild(titleDiv);
categories[category].forEach((action, index) => { // Add index to forEach
const colorBox = document.createElement('div');
colorBox.className = 'color-box';
const color = actionColors[action] || (category === randomCategoryName ? fallbackColors[index] : fallbackColors[Math.floor(Math.random() * fallbackColors.length)]); // Use index for fallbackColors
colorBox.style.backgroundColor = color;
colorBox.textContent = `${action} ${color}`;
categoryDiv.appendChild(colorBox);
});
document.body.appendChild(categoryDiv);
}
</script>
</body>
</html>
-975
@@ -1,975 +0,0 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"# 读取 JSON 文件\n",
"def remove_original_answer(input_file, output_file):\n",
" # 读取 JSON 文件\n",
" with open(input_file, 'r', encoding='utf-8') as f:\n",
" data = json.load(f)\n",
" \n",
" # 遍历所有视频片段\n",
" for video_key in data:\n",
" # 如果存在 original_answer 字段,则删除它\n",
" if 'original_answer' in data[video_key]:\n",
" del data[video_key]['original_answer']\n",
" \n",
" # 将修改后的数据写入新文件\n",
" with open(output_file, 'w', encoding='utf-8') as f:\n",
" json.dump(data, f, ensure_ascii=False, indent=2)\n",
"\n",
"# 使用示例\n",
"input_file = '球机沙发正面.json' # 输入文件名\n",
"output_file = '球机沙发正面_cleaned.json' # 输出文件名\n",
"\n",
"remove_original_answer(input_file, output_file)\n",
"print(f\"已成功删除所有 original_answer 字段,并保存到 {output_file}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"# 需要处理的文件列表\n",
"files = [\n",
" \"result/qwen/球机沙发正面_20241231_0200.json\",\n",
" \"result/qwen/左侧吃饭2_20241231_1612.json\",\n",
" \"result/qwen/左侧吃饭1_20241231_1423.json\",\n",
" \"result/qwen/室内右上角全景_20241231_2300.json\",\n",
" \"result/qwen/室内右上角全景_20241231_2010.json\",\n",
" \"result/qwen/球机椅子左侧面_20241231_0923.json\",\n",
" \"result/qwen/右上角吃饭_20241231_1232.json\"\n",
"]\n",
"\n",
"def remove_actions(data):\n",
" \"\"\"递归删除字典中的 'actions' 字段\"\"\"\n",
" if isinstance(data, dict):\n",
" if 'actions' in data:\n",
" data['actions'] = [] # 清空 actions 列表\n",
" for value in data.values():\n",
" remove_actions(value)\n",
" elif isinstance(data, list):\n",
" for item in data:\n",
" remove_actions(item)\n",
"\n",
"# 处理每个文件\n",
"for file_path in files:\n",
" try:\n",
" # 读取 JSON 文件\n",
" with open(file_path, 'r', encoding='utf-8') as f:\n",
" data = json.load(f)\n",
" \n",
" # 删除 actions 字段内容\n",
" remove_actions(data)\n",
" \n",
" # 写回文件\n",
" with open(file_path, 'w', encoding='utf-8') as f:\n",
" json.dump(data, f, ensure_ascii=False, indent=2)\n",
" \n",
" print(f\"Successfully processed: {file_path}\")\n",
" except Exception as e:\n",
" print(f\"Error processing {file_path}: {str(e)}\")"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Loaded 404 actions from info.json\n",
"\n",
"Processing result/qwen/球机沙发正面_20241231_0200.json...\n",
"Found actions in 20241230_124643.avi: ['写作', '看', '写字', '阅读', '写', '坐', '起身']\n",
"Found actions in 20241230_124656.avi: ['看', '整理', '关闭']\n",
"Found actions in 20241230_124646.avi: ['躺', '阅读', '看']\n",
"Found actions in 20241230_124647.avi: ['坐', '躺', '休息', '看']\n",
"Found actions in 20241230_124645.avi: ['坐', '写', '站', '看']\n",
"Found actions in 20241230_124651.avi: ['关闭', '看', '睡觉', '睡', '休息', '躺']\n",
"Found actions in 20241230_124640.avi: ['坐', '休息', '阅读', '看']\n",
"Found actions in 20241230_124649.avi: ['整理', '看', '睡觉', '休息', '睡', '躺']\n",
"Found actions in 20241230_124650.avi: ['看', '睡觉', '睡', '休息', '躺']\n",
"Found actions in 20241230_124644.avi: ['坐', '工作', '操作', '看']\n",
"Found actions in 20241230_124653.avi: ['喝水', '看', '休息', '坐', '喝']\n",
"Found actions in 20241230_124654.avi: ['坐', '走', '站', '看']\n",
"Found actions in 20241230_124637.avi: ['坐', '休息', '使用手机', '看']\n",
"Found actions in 20241230_124638.avi: ['站', '看', '开始', '阅读', '玩手机', '坐']\n",
"Found actions in 20241230_124639.avi: ['看', '打开', '休息', '阅读', '坐']\n",
"Found actions in 20241230_124642.avi: ['写作', '看', '打开', '看书', '阅读', '学习', '工作', '写', '坐']\n",
"Found actions in 20241230_124636.avi: ['整理', '进入', '看', '关闭', '使用手机', '坐']\n",
"Found actions in 20241230_124648.avi: ['睡觉', '睡', '休息', '躺']\n",
"Found actions in 20241230_124652.avi: ['看', '睡觉', '睡', '休息', '躺']\n",
"Found actions in 20241230_124641.avi: ['写作', '吹', '打开', '看', '操作', '学习', '工作', '写', '使用笔记本电脑', '坐']\n",
"Updated result/qwen/球机沙发正面_20241231_0200.json\n",
"\n",
"Processing result/qwen/左侧吃饭2_20241231_1612.json...\n",
"Found actions in 20241230_130857.avi: ['工作', '咀嚼', '看']\n",
"Found actions in 20241230_130853.avi: ['喝饮料', '看', '工作', '坐', '喝']\n",
"Found actions in 20241230_130840.avi: ['吃东西', '看', '操作', '使用电脑', '工作', '动手', '坐', '吃']\n",
"Found actions in 20241230_130844.avi: ['吃东西', '看', '咬', '工作', '坐', '吃']\n",
"Found actions in 20241230_130848.avi: ['吃东西', '喝水', '看', '坐', '喝', '吃']\n",
"Found actions in 20241230_130835.avi: ['吃东西', '看', '戴眼镜', '操作', '使用电脑', '工作', '坐', '吃']\n",
"Updated result/qwen/左侧吃饭2_20241231_1612.json\n",
"\n",
"Processing result/qwen/左侧吃饭1_20241231_1423.json...\n",
"Found actions in 20241230_130812.avi: ['吃东西', '看', '戴眼镜', '工作', '使用手机', '吃']\n",
"Found actions in 20241230_130759.avi: ['站', '喝水', '站立', '走', '休息', '使用电脑', '工作', '坐', '喝', '起身']\n",
"Found actions in 20241230_130830.avi: ['吃东西', '看', '操作', '使用电脑', '工作', '吃']\n",
"Found actions in 20241230_130804.avi: ['坐下', '进入', '看', '打开', '开始', '操作', '使用电脑', '工作', '坐']\n",
"Found actions in 20241230_130826.avi: ['吃东西', '看', '休息', '工作', '坐', '咀嚼', '吃']\n",
"Found actions in 20241230_130835.avi: ['吃东西', '看', '戴眼镜', '使用电脑', '工作', '坐', '吃']\n",
"Found actions in 20241230_130808.avi: ['看', '打开', '走', '使用电脑', '工作']\n",
"Found actions in 20241230_130821.avi: ['看', '学习', '工作', '用餐', '坐', '咀嚼']\n",
"Found actions in 20241230_130817.avi: ['坐', '使用手机', '操作']\n",
"Updated result/qwen/左侧吃饭1_20241231_1423.json\n",
"\n",
"Processing result/qwen/室内右上角全景_20241231_2300.json...\n",
"Found actions in 20241230_124417.avi: ['进入', '整理', '阅读', '使用电脑', '工作', '坐']\n",
"Found actions in 20241230_123344.avi: ['看']\n",
"Found actions in 20241230_123057.avi: ['行走', '坐下', '站', '站立', '走', '学习', '工作', '坐']\n",
"Found actions in 20241230_123606.avi: ['坐', '闻', '使用手机', '看']\n",
"Found actions in 20241230_123518.avi: ['行走', '站', '看', '打开', '站立', '走', '工作', '转', '转身']\n",
"Found actions in 20241230_124047.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_123734.avi: ['坐', '闻', '使用手机']\n",
"Found actions in 20241230_123316.avi: ['看']\n",
"Found actions in 20241230_124519.avi: ['坐', '工作', '阅读', '看']\n",
"Found actions in 20241230_123219.avi: ['行走', '站', '交谈', '站立', '走', '交流']\n",
"Found actions in 20241230_124201.avi: ['坐', '工作', '使用电脑', '关闭']\n",
"Found actions in 20241230_124247.avi: ['站', '站立', '进入', '弯腰']\n",
"Found actions in 20241230_124106.avi: ['看', '关闭', '使用电脑', '工作', '坐']\n",
"Found actions in 20241230_123803.avi: ['坐', '看', '使用电脑']\n",
"Found actions in 20241230_124315.avi: ['整理', '弯腰']\n",
"Found actions in 20241230_123200.avi: ['写', '站', '整理', '看']\n",
"Found actions in 20241230_123402.avi: ['进入', '走', '关闭']\n",
"Found actions in 20241230_124528.avi: ['坐下', '使用电脑', '工作', '坐', '起身']\n",
"Found actions in 20241230_123528.avi: ['站', '看', '站立', '转', '转身']\n",
"Found actions in 20241230_123048.avi: ['坐下', '站', '喝水', '看', '走', '工作', '闻', '坐', '喝']\n",
"Found actions in 20241230_123039.avi: ['站', '喝水', '开始', '站立', '喝']\n",
"Found actions in 20241230_123537.avi: ['坐下', '站', '看', '站立', '走', '使用电脑', '工作', '坐', '起身']\n",
"Found actions in 20241230_124408.avi: ['行走', '坐下', '站', '站立', '走', '工作', '坐']\n",
"Found actions in 20241230_123325.avi: ['工作', '看']\n",
"Found actions in 20241230_123411.avi: ['站', '进入', '看', '开始', '走', '工作']\n",
"Found actions in 20241230_123901.avi: ['坐', '闻', '看', '使用电脑']\n",
"Found actions in 20241230_124027.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_123948.avi: ['坐', '看', '使用电脑']\n",
"Found actions in 20241230_124426.avi: ['整理', '看', '使用电脑', '工作', '坐']\n",
"Found actions in 20241230_123725.avi: ['坐', '使用手机', '工作']\n",
"Found actions in 20241230_124116.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_122946.avi: ['进入', '关闭']\n",
"Found actions in 20241230_123124.avi: ['行走', '进入', '看', '打开', '走', '开门']\n",
"Found actions in 20241230_124341.avi: ['坐', '阅读', '看']\n",
"Found actions in 20241230_124056.avi: ['坐', '工作', '使用电脑']\n",
"Found actions in 20241230_124210.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_124554.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_123151.avi: ['走', '整理', '弯腰']\n",
"Found actions in 20241230_123557.avi: ['坐下', '站', '关闭', '开始', '站立', '使用电脑', '坐']\n",
"Found actions in 20241230_123929.avi: ['坐', '闻', '使用电脑']\n",
"Found actions in 20241230_123142.avi: ['站', '进入', '看', '打开', '使用电脑', '工作']\n",
"Found actions in 20241230_124008.avi: ['坐', '闻', '使用电脑']\n",
"Found actions in 20241230_123508.avi: ['行走', '站', '进入', '关闭', '走', '闻']\n",
"Found actions in 20241230_124125.avi: ['坐', '工作', '操作', '看']\n",
"Found actions in 20241230_123851.avi: ['坐', '看', '使用电脑']\n",
"Found actions in 20241230_123635.avi: ['坐', '看', '使用电脑']\n",
"Found actions in 20241230_123229.avi: ['行走', '整理', '看', '走', '写']\n",
"Found actions in 20241230_124435.avi: ['坐', '工作', '阅读', '看']\n",
"Found actions in 20241230_123133.avi: ['进入', '看', '打开', '关闭', '工作', '写']\n",
"Found actions in 20241230_124332.avi: ['坐', '看', '阅读', '关闭']\n",
"Found actions in 20241230_123420.avi: ['行走', '站', '整理', '看', '测量', '走', '站立', '工作']\n",
"Found actions in 20241230_123615.avi: ['坐', '闻', '使用电脑']\n",
"Found actions in 20241230_123353.avi: ['看']\n",
"Found actions in 20241230_124306.avi: ['行走', '蹲下', '走', '操作', '工作', '蹲']\n",
"Found actions in 20241230_123257.avi: ['打开', '关闭']\n",
"Found actions in 20241230_123115.avi: ['坐', '工作', '起身', '使用电脑']\n",
"Found actions in 20241230_123753.avi: ['坐', '闻', '看', '使用电脑']\n",
"Found actions in 20241230_124037.avi: ['坐', '工作', '使用电脑']\n",
"Found actions in 20241230_123106.avi: ['行走', '关闭', '打开', '看', '走', '写']\n",
"Found actions in 20241230_122909.avi: ['进入', '转', '转身']\n",
"Updated result/qwen/室内右上角全景_20241231_2300.json\n",
"\n",
"Processing result/qwen/室内右上角全景_20241231_2010.json...\n",
"Found actions in 20241230_123744.avi: ['坐', '使用手机', '看']\n",
"Found actions in 20241230_124143.avi: ['看', '关闭', '使用电脑', '工作', '坐']\n",
"Found actions in 20241230_123004.avi: ['行走', '站', '整理', '看', '关闭', '走']\n",
"Found actions in 20241230_124501.avi: ['进入', '看', '操作', '工作', '坐']\n",
"Found actions in 20241230_124219.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_123021.avi: ['行走', '走', '整理', '看']\n",
"Found actions in 20241230_124152.avi: ['进入', '使用电脑', '工作', '闻', '坐']\n",
"Found actions in 20241230_123655.avi: ['坐', '使用手机', '工作']\n",
"Found actions in 20241230_124452.avi: ['坐', '使用手机', '看']\n",
"Found actions in 20241230_122936.avi: ['行走', '进入', '关闭', '看', '走', '转', '转身']\n",
"Found actions in 20241230_123705.avi: ['坐', '使用手机', '工作']\n",
"Found actions in 20241230_123247.avi: ['工作', '走', '站', '看']\n",
"Found actions in 20241230_123449.avi: ['走', '整理', '起身', '弯腰']\n",
"Found actions in 20241230_122918.avi: ['搬运', '写', '看']\n",
"Found actions in 20241230_124227.avi: ['行走', '站', '站立', '走', '使用电脑', '坐', '起身']\n",
"Found actions in 20241230_124017.avi: ['站', '站立', '使用电脑', '工作', '坐']\n",
"Found actions in 20241230_122927.avi: ['看', '休息', '操作', '工作', '写', '弯腰', '起身']\n",
"Found actions in 20241230_123920.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_123832.avi: ['坐', '休息', '工作', '使用电脑']\n",
"Found actions in 20241230_124546.avi: ['站', '看', '站立', '使用电脑', '工作', '坐', '起身']\n",
"Found actions in 20241230_124537.avi: ['看', '使用电脑', '工作', '闻', '坐']\n",
"Found actions in 20241230_123813.avi: ['坐', '看', '使用电脑']\n",
"Found actions in 20241230_123440.avi: ['进入', '走', '转', '转身']\n",
"Found actions in 20241230_123210.avi: ['整理', '看', '打开', '关闭', '工作', '清理']\n",
"Found actions in 20241230_123306.avi: ['工作', '看']\n",
"Found actions in 20241230_124443.avi: ['坐', '弹奏', '看']\n",
"Found actions in 20241230_122859.avi: ['站', '看', '走', '操作', '工作', '交流']\n",
"Found actions in 20241230_123938.avi: ['坐', '工作', '使用电脑']\n",
"Found actions in 20241230_124359.avi: ['整理', '弯腰']\n",
"Found actions in 20241230_123334.avi: ['工作', '看']\n",
"Found actions in 20241230_124603.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_124510.avi: ['进入', '整理', '看', '阅读', '工作', '坐']\n",
"Found actions in 20241230_123822.avi: ['坐', '使用手机']\n",
"Found actions in 20241230_123012.avi: ['站', '看', '使用电脑', '工作', '坐']\n",
"Found actions in 20241230_123715.avi: ['坐', '使用手机', '看']\n",
"Found actions in 20241230_122955.avi: ['工作', '开始', '整理', '使用电脑']\n",
"Found actions in 20241230_124134.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_123842.avi: ['坐', '看', '使用电脑']\n",
"Found actions in 20241230_124238.avi: ['讨论', '整理', '进入', '走', '工作', '交流']\n",
"Found actions in 20241230_123030.avi: ['进入', '看', '走', '闻', '交流']\n",
"Found actions in 20241230_123644.avi: ['坐', '使用手机', '关闭']\n",
"Found actions in 20241230_123911.avi: ['进入', '看', '使用电脑', '工作', '坐']\n",
"Found actions in 20241230_123547.avi: ['讨论', '站', '整理', '看', '交谈', '工作']\n",
"Found actions in 20241230_124350.avi: ['整理', '弯腰']\n",
"Found actions in 20241230_123625.avi: ['坐', '使用手机', '关闭']\n",
"Found actions in 20241230_123958.avi: ['坐', '闻', '使用电脑']\n",
"Found actions in 20241230_124256.avi: ['看', '走', '工作', '弯腰', '起身']\n",
"Found actions in 20241230_124323.avi: ['休息', '整理', '弯腰', '看']\n",
"Found actions in 20241230_123430.avi: ['行走', '坐下', '休息', '走', '工作', '坐']\n",
"Found actions in 20241230_123459.avi: ['行走', '站', '看', '走', '写']\n",
"Found actions in 20241230_123238.avi: ['行走', '站', '看', '走', '操作', '工作']\n",
"Updated result/qwen/室内右上角全景_20241231_2010.json\n",
"\n",
"Processing result/qwen/球机椅子左侧面_20241231_0923.json...\n",
"Found actions in 20241230_124611.avi: ['看', '休息', '戴眼镜', '工作', '使用手机', '写', '坐']\n",
"Found actions in 20241230_124626.avi: ['行走', '走', '工作', '写', '转', '转身']\n",
"Found actions in 20241230_124635.avi: ['站', '看', '站立', '走', '使用电脑', '工作']\n",
"Found actions in 20241230_124618.avi: ['看', '戴眼镜', '阅读', '学习', '工作']\n",
"Found actions in 20241230_124630.avi: ['站', '看', '笑', '工作', '转', '转身']\n",
"Found actions in 20241230_124627.avi: ['手写', '站', '站立', '写', '转', '转身']\n",
"Found actions in 20241230_124617.avi: ['看', '戴眼镜', '阅读', '学习', '工作', '闻', '坐']\n",
"Found actions in 20241230_124613.avi: ['工作', '使用手机', '戴眼镜', '看']\n",
"Found actions in 20241230_124623.avi: ['站', '喝水', '看', '站立', '戴眼镜', '休息', '工作', '喝', '转', '转身']\n",
"Found actions in 20241230_124612.avi: ['看', '操作', '学习', '工作', '使用手机', '坐']\n",
"Found actions in 20241230_124620.avi: ['看', '转头', '使用电脑', '学习', '工作', '转向', '坐', '转']\n",
"Found actions in 20241230_124610.avi: ['坐', '使用手机', '工作', '看']\n",
"Found actions in 20241230_124629.avi: ['行走', '进入', '看', '走', '工作', '交流']\n",
"Found actions in 20241230_124631.avi: ['行走', '看', '走', '工作', '转', '转身']\n",
"Found actions in 20241230_124632.avi: ['站', '看', '站立', '操作', '转', '转身']\n",
"Found actions in 20241230_124622.avi: ['喝水', '看', '休息', '使用电脑', '工作', '坐', '喝']\n",
"Found actions in 20241230_124614.avi: ['看', '开始', '阅读', '学习', '工作', '使用手机', '坐']\n",
"Found actions in 20241230_124616.avi: ['看', '转头', '阅读', '学习', '工作', '转向', '坐', '转', '转身']\n",
"Found actions in 20241230_124634.avi: ['坐', '工作', '写', '看']\n",
"Found actions in 20241230_124621.avi: ['喝水', '看', '休息', '操作', '使用电脑', '工作', '动手', '坐', '喝']\n",
"Found actions in 20241230_124628.avi: ['行走', '站', '看', '走', '操作', '工作', '转向', '转', '转身']\n",
"Found actions in 20241230_124633.avi: ['站', '看', '站立', '学习', '工作', '弯腰']\n",
"Found actions in 20241230_124624.avi: ['站', '看', '走', '坐', '起身']\n",
"Found actions in 20241230_124615.avi: ['坐', '阅读', '看']\n",
"Found actions in 20241230_124619.avi: ['站', '看', '站立', '使用电脑', '转', '起身', '转身']\n",
"Updated result/qwen/球机椅子左侧面_20241231_0923.json\n",
"\n",
"Processing result/qwen/右上角吃饭_20241231_1232.json...\n",
"Found actions in 20241230_130634.avi: ['坐', '工作', '使用电脑']\n",
"Found actions in 20241230_130731.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_130653.avi: ['坐', '工作', '使用电脑']\n",
"Found actions in 20241230_130658.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_130620.avi: ['坐', '工作']\n",
"Found actions in 20241230_130712.avi: ['坐', '工作', '使用电脑']\n",
"Found actions in 20241230_130745.avi: ['坐', '工作', '使用电脑']\n",
"Found actions in 20241230_130726.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_130551.avi: ['行走', '进入', '休息', '走', '操作', '工作', '服务']\n",
"Found actions in 20241230_130741.avi: ['坐', '组装', '工作', '使用电脑']\n",
"Found actions in 20241230_130639.avi: ['休息', '使用电脑', '工作', '饮用', '坐']\n",
"Found actions in 20241230_130644.avi: ['坐', '工作', '使用电脑']\n",
"Found actions in 20241230_130707.avi: ['坐', '闻', '服务', '使用电脑']\n",
"Found actions in 20241230_130555.avi: ['看', '交谈', '休息', '走', '使用电脑', '工作', '交流', '坐']\n",
"Found actions in 20241230_130629.avi: ['坐', '工作', '使用电脑']\n",
"Found actions in 20241230_130624.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_130605.avi: ['看', '操作', '使用电脑', '工作', '坐']\n",
"Found actions in 20241230_130610.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_130703.avi: ['坐', '工作', '服务', '使用电脑']\n",
"Found actions in 20241230_130736.avi: ['吃东西', '使用电脑', '吃零食', '工作', '坐', '吃']\n",
"Found actions in 20241230_130648.avi: ['喝水', '看', '休息', '使用电脑', '工作', '坐', '喝']\n",
"Found actions in 20241230_130755.avi: ['坐', '组装', '工作', '使用电脑']\n",
"Found actions in 20241230_130750.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_130546.avi: ['坐', '工作', '起身', '看']\n",
"Found actions in 20241230_130717.avi: ['坐', '工作', '使用电脑']\n",
"Found actions in 20241230_130615.avi: ['使用电脑', '学习', '工作', '闻', '坐']\n",
"Found actions in 20241230_130722.avi: ['看', '坐', '工作', '使用电脑']\n",
"Found actions in 20241230_130600.avi: ['坐下', '站', '看', '站立', '工作', '坐']\n",
"Updated result/qwen/右上角吃饭_20241231_1232.json\n"
]
}
],
"source": [
"import json\n",
"\n",
"# Load the list of known actions\n",
"def load_actions():\n",
"    with open('info.json', 'r', encoding='utf-8') as f:\n",
"        data = json.load(f)\n",
"    return set(data['actions'])  # a set removes duplicates and gives fast lookups\n",
"\n",
"def find_actions(text, action_set):\n",
"    \"\"\"Find every known action that appears in the text.\"\"\"\n",
"    found_actions = set()\n",
"    for action in action_set:\n",
"        # Plain substring matching\n",
"        if action in text:\n",
"            found_actions.add(action)\n",
"    return list(found_actions)\n",
"\n",
"def process_file(file_path, action_set):\n",
"    \"\"\"Process a single result file.\"\"\"\n",
"    modified = False\n",
"    with open(file_path, 'r', encoding='utf-8') as f:\n",
"        data = json.load(f)\n",
"\n",
"    # Walk through every video analysis entry\n",
"    for video_key, video_data in data.items():\n",
"        if 'video_analysis' in video_data:\n",
"            analysis = video_data['video_analysis']\n",
"            if 'qwen-7B' in analysis:\n",
"                qwen_data = analysis['qwen-7B']\n",
"                if 'original_answer' in qwen_data and 'extracted_info' in qwen_data:\n",
"                    # Look for actions in the raw model answer\n",
"                    found_actions = find_actions(qwen_data['original_answer'], action_set)\n",
"                    if found_actions:  # update only when something was found\n",
"                        qwen_data['extracted_info']['actions'] = found_actions\n",
"                        modified = True\n",
"                        print(f\"Found actions in {video_key}: {found_actions}\")\n",
"\n",
"    # Write the file back only if at least one entry changed\n",
"    if modified:\n",
"        with open(file_path, 'w', encoding='utf-8') as f:\n",
"            json.dump(data, f, ensure_ascii=False, indent=2)\n",
"        print(f\"Updated {file_path}\")\n",
"    else:\n",
"        print(f\"No actions found in {file_path}\")\n",
"\n",
"# Entry point\n",
"def main():\n",
"    # Result files to process\n",
"    files = [\n",
"        \"result/qwen/球机沙发正面_20241231_0200.json\",\n",
"        \"result/qwen/左侧吃饭2_20241231_1612.json\",\n",
"        \"result/qwen/左侧吃饭1_20241231_1423.json\",\n",
"        \"result/qwen/室内右上角全景_20241231_2300.json\",\n",
"        \"result/qwen/室内右上角全景_20241231_2010.json\",\n",
"        \"result/qwen/球机椅子左侧面_20241231_0923.json\",\n",
"        \"result/qwen/右上角吃饭_20241231_1232.json\"\n",
"    ]\n",
"\n",
"    # Load the action list\n",
"    action_set = load_actions()\n",
"    print(f\"Loaded {len(action_set)} actions from info.json\")\n",
"\n",
"    # Process each file\n",
"    for file_path in files:\n",
"        try:\n",
"            print(f\"\\nProcessing {file_path}...\")\n",
"            process_file(file_path, action_set)\n",
"        except Exception as e:\n",
"            print(f\"Error processing {file_path}: {str(e)}\")\n",
"\n",
"if __name__ == \"__main__\":\n",
"    main()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import os\n",
"from datetime import datetime\n",
"\n",
"def calculate_accuracy(predicted_file, ground_truth_file):\n",
"    \"\"\"Compute how closely the model predictions match the manual annotations.\"\"\"\n",
"    accuracy_metrics = {\n",
"        \"environment\": 0,\n",
"        \"num_people\": 0,\n",
"        \"actions\": 0,\n",
"        \"objects\": 0,\n",
"        \"furniture\": 0,\n",
"        \"emotions\": 0,\n",
"        \"features\": 0\n",
"    }\n",
"    total_files = 0\n",
"\n",
"    # Load both files\n",
"    try:\n",
"        with open(predicted_file, 'r', encoding='utf-8') as f:\n",
"            predicted_data = json.load(f)\n",
"        with open(ground_truth_file, 'r', encoding='utf-8') as f:\n",
"            ground_truth_data = json.load(f)\n",
"    except Exception as e:\n",
"        print(f\"读取文件时出错: {str(e)}\")\n",
"        return accuracy_metrics\n",
"\n",
"    print(f\"\\n开始计算准确率...\")\n",
"    print(f\"预测结果包含 {len(predicted_data)} 个文件\")\n",
"    print(f\"标注数据包含 {len(ground_truth_data)} 个文件\")\n",
"\n",
"    # Evaluate each prediction\n",
"    for video_name, pred in predicted_data.items():\n",
"        print(f\"\\n处理文件: {video_name}\")\n",
"\n",
"        # Skip videos that have no annotation\n",
"        if video_name not in ground_truth_data:\n",
"            print(f\"标注数据中未找到: {video_name}\")\n",
"            continue\n",
"\n",
"        try:\n",
"            # Fetch the predicted and the annotated fields\n",
"            pred_info = pred[\"video_analysis\"][\"qwen\"][\"extracted_info\"]\n",
"            gt_info = ground_truth_data[video_name][\"extracted_info\"]\n",
"\n",
"            total_files += 1\n",
"            print(f\"成功匹配文件: {video_name}\")\n",
"\n",
"            # Environment: exact match\n",
"            if pred_info[\"environment\"] == gt_info[\"environment\"]:\n",
"                accuracy_metrics[\"environment\"] += 1\n",
"                print(f\"环境匹配成功: {pred_info['environment']}\")\n",
"\n",
"            # Number of people: exact match\n",
"            if pred_info[\"num_people\"] == gt_info[\"num_people\"]:\n",
"                accuracy_metrics[\"num_people\"] += 1\n",
"                print(f\"人数匹配成功: {pred_info['num_people']}\")\n",
"\n",
"            # List-valued fields are scored with Jaccard similarity\n",
"            field_mapping = {\n",
"                \"actions\": \"actions\",\n",
"                \"objects\": \"objects\",\n",
"                \"furniture\": \"furniture\",\n",
"                \"emotions\": \"emotions\",\n",
"                \"features\": \"feature \"  # note: the annotation key \"feature \" has a trailing space\n",
"            }\n",
"\n",
"            for pred_field, gt_field in field_mapping.items():\n",
"                pred_set = set(pred_info[pred_field]) if pred_info[pred_field] else set()\n",
"                gt_set = set(gt_info[gt_field]) if gt_field in gt_info and gt_info[gt_field] else set()\n",
"\n",
"                if pred_set or gt_set:  # avoid division by zero\n",
"                    intersection = len(pred_set & gt_set)\n",
"                    union = len(pred_set | gt_set)\n",
"                    jaccard = intersection / union\n",
"                    accuracy_metrics[pred_field] += jaccard\n",
"                    print(f\"{pred_field} 匹配度: {jaccard:.2%}\")\n",
"                    print(f\"预测集合: {pred_set}\")\n",
"                    print(f\"标注集合: {gt_set}\")\n",
"                    print(f\"交集数量: {intersection}\")\n",
"                    print(f\"并集数量: {union}\")\n",
"\n",
"        except Exception as e:\n",
"            print(f\"处理 {video_name} 时出错: {str(e)}\")\n",
"            continue\n",
"\n",
"    # Average over the compared files and convert to percentages\n",
"    if total_files > 0:\n",
"        print(f\"\\n共成功比较 {total_files} 个文件\")\n",
"        for key in accuracy_metrics:\n",
"            accuracy_metrics[key] = round(accuracy_metrics[key] / total_files * 100, 2)\n",
"    else:\n",
"        print(\"\\n警告:没有成功比较任何文件\")\n",
"\n",
"    return accuracy_metrics\n",
"\n",
"def main():\n",
"    try:\n",
"        # Ask for the input and output paths\n",
"        predicted_file = input(\"请输入预测结果文件路径: \").strip()\n",
"        ground_truth_file = input(\"请输入标注文件路径: \").strip()\n",
"        output_path = input(\"请输入结果保存路径 (直接回车使用当前目录): \").strip()\n",
"\n",
"        # Make sure both input files exist\n",
"        if not os.path.exists(predicted_file):\n",
"            raise Exception(f\"错误:预测结果文件 '{predicted_file}' 不存在\")\n",
"        if not os.path.exists(ground_truth_file):\n",
"            raise Exception(f\"错误:标注文件 '{ground_truth_file}' 不存在\")\n",
"\n",
"        # Default the output directory to the current directory\n",
"        output_path = output_path if output_path else os.getcwd()\n",
"        if not os.path.exists(output_path):\n",
"            os.makedirs(output_path)\n",
"\n",
"        # Compute the accuracy\n",
"        accuracy = calculate_accuracy(predicted_file, ground_truth_file)\n",
"\n",
"        # Print the results\n",
"        print(\"\\n准确率评估结果:\")\n",
"        for metric, value in accuracy.items():\n",
"            print(f\"{metric}: {value}%\")\n",
"\n",
"        # Save the results to a timestamped file\n",
"        timestamp = datetime.now().strftime(\"%Y%m%d_%H%M%S\")\n",
"        accuracy_file = os.path.join(output_path, f\"accuracy_results_{timestamp}.json\")\n",
"        with open(accuracy_file, 'w', encoding='utf-8') as f:\n",
"            json.dump(accuracy, f, ensure_ascii=False, indent=2)\n",
"        print(f\"\\n准确率评估结果已保存到: {accuracy_file}\")\n",
"\n",
"    except Exception as e:\n",
"        print(f\"\\n错误: {str(e)}\")\n",
"\n",
"if __name__ == \"__main__\":\n",
"    main()"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"def merge_json_files(json_files):\n",
"    # Result dictionary\n",
"    merged_result = {}\n",
"\n",
"    # Iterate over each loaded JSON document\n",
"    for file_data in json_files:\n",
"        # Iterate over each video\n",
"        for video_name, video_data in file_data.items():\n",
"            if video_name not in merged_result:\n",
"                merged_result[video_name] = {}\n",
"\n",
"            # Collect original_answer under each model name\n",
"            if \"video_analysis\" in video_data:\n",
"                for model_name, model_data in video_data[\"video_analysis\"].items():\n",
"                    if \"original_answer\" in model_data:\n",
"                        merged_result[video_name][model_name] = model_data[\"original_answer\"]\n",
"\n",
"    return merged_result\n",
"\n",
"# Paths of the JSON files to merge\n",
"json_files = [\n",
"    \"/home/zydi/VLM/result/analysis_results_室内右上角全景筛选_20250102_084327.json\",\n",
"    \"/home/zydi/VLM/result/analysis_results_室内右上角全景筛选_20250102_084600.json\"\n",
"]\n",
"\n",
"# Load the content of every file\n",
"json_contents = []\n",
"for file_path in json_files:\n",
"    with open(file_path, 'r', encoding='utf-8') as f:\n",
"        json_contents.append(json.load(f))\n",
"\n",
"# Merge the documents\n",
"result = merge_json_files(json_contents)\n",
"\n",
"# Write the result to a new file\n",
"with open('qwen_prompt.json', 'w', encoding='utf-8') as f:\n",
"    json.dump(result, f, ensure_ascii=False, indent=2)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"def clean_places365_names():\n",
"    with open('Places365.txt', 'r') as file:\n",
"        lines = file.readlines()\n",
"\n",
"    cleaned_lines = []\n",
"    for line in lines:\n",
"        # Strip the leading /x/ prefix\n",
"        if line.strip():  # skip blank lines\n",
"            # Position just after the second '/'\n",
"            start_pos = line.find('/', 1) + 1\n",
"            # Keep the name and drop the trailing number\n",
"            name = line[start_pos:].rsplit(' ', 1)[0]\n",
"            cleaned_lines.append(name)\n",
"\n",
"    # Overwrite the original file\n",
"    with open('Places365.txt', 'w') as file:\n",
"        file.write('\\n'.join(cleaned_lines))\n",
"\n",
"# Run the cleanup\n",
"clean_places365_names()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"def merge_json_files(json_files, json_contents):\n",
"    # Result dictionary\n",
"    merged_result = {}\n",
"\n",
"    # Iterate over each file path together with its loaded content\n",
"    for file_path, file_data in zip(json_files, json_contents):\n",
"        # Use the last underscore-separated part of the file name as the timestamp key\n",
"        file_timestamp = None\n",
"        if \"_20250102_\" in file_path:\n",
"            file_timestamp = file_path.split(\"_\")[-1].replace(\".json\", \"\")\n",
"\n",
"        # Iterate over each video\n",
"        for video_name, video_data in file_data.items():\n",
"            if video_name not in merged_result:\n",
"                merged_result[video_name] = {\n",
"                    \"video_analysis\": {\n",
"                        \"qwen-7B\": {\n",
"                            \"original_answers\": {}\n",
"                        }\n",
"                    }\n",
"                }\n",
"\n",
"            # Store original_answer under this file's timestamp\n",
"            if \"video_analysis\" in video_data and \"qwen-7B\" in video_data[\"video_analysis\"]:\n",
"                model_data = video_data[\"video_analysis\"][\"qwen-7B\"]\n",
"                if \"original_answer\" in model_data:\n",
"                    merged_result[video_name][\"video_analysis\"][\"qwen-7B\"][\"original_answers\"][file_timestamp] = model_data[\"original_answer\"]\n",
"\n",
"    return merged_result\n",
"\n",
"# Paths of the JSON files to merge\n",
"json_files = [\n",
"    \"/home/zydi/VLM/result/室内右上角全景筛选_20250102_065735.json\",\n",
"    \"/home/zydi/VLM/result/室内右上角全景筛选_20250102_072352.json\",\n",
"    \"/home/zydi/VLM/result/室内右上角全景筛选_20250102_072724.json\",\n",
"    \"/home/zydi/VLM/result/室内右上角全景筛选_20250102_075545.json\"\n",
"]\n",
"\n",
"# Load the content of every file\n",
"json_contents = []\n",
"for file_path in json_files:\n",
"    with open(file_path, 'r', encoding='utf-8') as f:\n",
"        json_contents.append(json.load(f))\n",
"\n",
"# Merge the documents (paths and contents are passed explicitly)\n",
"result = merge_json_files(json_files, json_contents)\n",
"\n",
"# Write the result to a new file\n",
"with open('qwen_prompt.json', 'w', encoding='utf-8') as f:\n",
"    json.dump(result, f, ensure_ascii=False, indent=2)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"处理完成!输出文件:formatted_qwen_prompt.json\n"
]
}
],
"source": [
"import json\n",
"import re\n",
"\n",
"def format_answer(text):\n",
"    # Put level-3 and level-4 headings on their own line; the lookbehind keeps\n",
"    # '####' from being split into '#' followed by '###'\n",
"    text = re.sub(r'(?<!#)(#{3,4})\\s+', r'\\n\\1 ', text)\n",
"\n",
"    # List items\n",
"    text = re.sub(r'(?m)^-\\s+', '\\n- ', text)\n",
"    text = re.sub(r'(?m)^•\\s+', '\\n• ', text)\n",
"\n",
"    # Bold labels\n",
"    text = re.sub(r'\\*\\*([^*]+)\\*\\*:', '\\n**\\\\1**:', text)\n",
"\n",
"    # Collapse runs of blank lines\n",
"    text = re.sub(r'\\n{3,}', '\\n\\n', text)\n",
"\n",
"    return text.strip()\n",
"\n",
"def process_json_file(input_file, output_file):\n",
"    try:\n",
"        with open(input_file, 'r', encoding='utf-8') as f:\n",
"            data = json.load(f)\n",
"\n",
"        # Recursively format every original_answer / original_answers value\n",
"        def process_dict(d):\n",
"            for k, v in d.items():\n",
"                if k == 'original_answers' or k == 'original_answer':\n",
"                    if isinstance(v, dict):\n",
"                        for sub_k, sub_v in v.items():\n",
"                            v[sub_k] = format_answer(sub_v)\n",
"                    else:\n",
"                        d[k] = format_answer(v)\n",
"                elif isinstance(v, dict):\n",
"                    process_dict(v)\n",
"                elif isinstance(v, list):\n",
"                    for item in v:\n",
"                        if isinstance(item, dict):\n",
"                            process_dict(item)\n",
"\n",
"        process_dict(data)\n",
"\n",
"        with open(output_file, 'w', encoding='utf-8') as f:\n",
"            json.dump(data, f, ensure_ascii=False, indent=2)\n",
"\n",
"        print(f\"处理完成!输出文件:{output_file}\")\n",
"\n",
"    except Exception as e:\n",
"        print(f\"处理文件时出错:{str(e)}\")\n",
"\n",
"if __name__ == \"__main__\":\n",
"    # Example usage\n",
"    input_file = \"qwen_prompt.json\"  # input file path\n",
"    output_file = \"formatted_qwen_prompt.json\"  # output file path\n",
"    process_json_file(input_file, output_file)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"from datetime import datetime, timedelta\n",
"\n",
"def update_timestamps(json_data):\n",
"    # Base time: 2024-12-31 16:12:00 (matches the 1612 suffix of the file processed below)\n",
"    base_time = datetime(2024, 12, 31, 16, 12, 0)\n",
"\n",
"    # Iterate over every video entry\n",
"    for i, (video_key, video_data) in enumerate(json_data.items()):\n",
"        # New timestamp: entries are spaced one minute apart\n",
"        new_time = base_time + timedelta(minutes=i)\n",
"\n",
"        # Update the timestamp\n",
"        video_data['timestamp'] = new_time.strftime('%Y-%m-%d %H:%M:%S')\n",
"\n",
"    return json_data\n",
"\n",
"# Example usage\n",
"with open('/home/zydi/VLM/result/qwen/左侧吃饭2_20241231_1612.json', 'r', encoding='utf-8') as f:\n",
"    data = json.load(f)\n",
"\n",
"# Update the timestamps\n",
"updated_data = update_timestamps(data)\n",
"\n",
"# Save the updated file\n",
"with open('/home/zydi/VLM/result/qwen/左侧吃饭2_20241231_1612.json', 'w', encoding='utf-8') as f:\n",
"    json.dump(updated_data, f, ensure_ascii=False, indent=2)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"成功保存: camera001_20241231_1612\n",
"成功保存: camera001_20241231_1423\n",
"成功保存: camera001_20241231_1232\n",
"成功保存: camera001_20241231_0200\n",
"成功保存: camera001_20241231_2300\n",
"成功保存: camera001_20241231_2010\n",
"成功保存: camera001_20241231_0923\n"
]
}
],
"source": [
"import json\n",
"import redis\n",
"import os\n",
"import re\n",
"\n",
"def save_json_to_redis(file_path):\n",
"    try:\n",
"        # Connect to Redis\n",
"        r = redis.Redis(\n",
"            host=\"222.186.10.253\",\n",
"            port=6379,\n",
"            password=\"Obscura@2024\",\n",
"            db=207\n",
"        )\n",
"\n",
"        # Read the JSON file\n",
"        with open(file_path, 'r', encoding='utf-8') as f:\n",
"            content = json.load(f)\n",
"\n",
"        # File name without the directory part\n",
"        file_name = os.path.basename(file_path)\n",
"\n",
"        # Extract the timestamp (YYYYMMDD_HHMM) from the file name\n",
"        timestamp_match = re.search(r'_(\\d{8}_\\d{4})', file_name)\n",
"        if timestamp_match:\n",
"            timestamp = timestamp_match.group(1)\n",
"            # Build the Redis key\n",
"            new_key = f\"camera001_{timestamp}\"\n",
"\n",
"            # Serialize the content to a JSON string\n",
"            json_str = json.dumps(content, ensure_ascii=False)\n",
"\n",
"            # Store it in Redis\n",
"            r.set(new_key, json_str)\n",
"            print(f\"成功保存: {new_key}\")\n",
"        else:\n",
"            print(f\"无法从文件名提取时间戳: {file_name}\")\n",
"\n",
"        # Close the Redis connection\n",
"        r.close()\n",
"\n",
"    except Exception as e:\n",
"        print(f\"处理文件 {file_path} 时出错: {str(e)}\")\n",
"\n",
"# Files to process\n",
"files_to_process = [\n",
"    \"result/qwen/左侧吃饭2_20241231_1612.json\",\n",
"    \"result/qwen/左侧吃饭1_20241231_1423.json\",\n",
"    \"result/qwen/右上角吃饭_20241231_1232.json\",\n",
"    \"result/qwen/球机沙发正面_20241231_0200.json\",\n",
"    \"result/qwen/室内右上角全景_20241231_2300.json\",\n",
"    \"result/qwen/室内右上角全景_20241231_2010.json\",\n",
"    \"result/qwen/球机椅子左侧面_20241231_0923.json\"\n",
"]\n",
"\n",
"# Process each file\n",
"for file_path in files_to_process:\n",
"    save_json_to_redis(file_path)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"成功获取key: A01_20250107_1600\n",
"\n",
"数据已保存到: .//redis_data_20250107_085228.json\n",
"共处理 1 个key\n"
]
}
],
"source": [
"import json\n",
"import redis\n",
"from datetime import datetime\n",
"\n",
"def fetch_from_redis_and_save(keys=None, output_dir=\"./\"):\n",
"    \"\"\"\n",
"    Fetch the data stored under the given keys from Redis and save it as a JSON file.\n",
"\n",
"    Parameters:\n",
"    - keys: list of keys to fetch; if None, every key starting with camera001_ is fetched\n",
"    - output_dir: directory the output file is written to\n",
"    \"\"\"\n",
"    try:\n",
"        # Connect to Redis\n",
"        r = redis.Redis(\n",
"            host=\"222.186.10.253\",\n",
"            port=6379,\n",
"            password=\"Obscura@2024\",\n",
"            db=210\n",
"        )\n",
"\n",
"        # If no keys were given, fetch all matching keys\n",
"        if keys is None:\n",
"            keys = r.keys(\"camera001_*\")  # every key that starts with camera001_\n",
"            keys = [key.decode('utf-8') for key in keys]  # bytes -> str\n",
"\n",
"        # Dictionary that collects all fetched data\n",
"        all_data = {}\n",
"\n",
"        # Fetch the data for each key\n",
"        for key in keys:\n",
"            try:\n",
"                value = r.get(key)\n",
"                if value:\n",
"                    # Parse the JSON string stored in Redis into a Python object\n",
"                    data = json.loads(value)\n",
"                    all_data[key] = data\n",
"                    print(f\"成功获取key: {key}\")\n",
"            except Exception as e:\n",
"                print(f\"处理key {key} 时出错: {str(e)}\")\n",
"\n",
"        # Output file name, based on the current time\n",
"        timestamp = datetime.now().strftime(\"%Y%m%d_%H%M%S\")\n",
"        output_file = f\"{output_dir}/redis_data_{timestamp}.json\"\n",
"\n",
"        # Save as a JSON file\n",
"        with open(output_file, 'w', encoding='utf-8') as f:\n",
"            json.dump(all_data, f, ensure_ascii=False, indent=2)\n",
"\n",
"        print(f\"\\n数据已保存到: {output_file}\")\n",
"        print(f\"共处理 {len(all_data)} 个key\")\n",
"\n",
"        # Close the Redis connection\n",
"        r.close()\n",
"\n",
"    except Exception as e:\n",
"        print(f\"发生错误: {str(e)}\")\n",
"\n",
"# Example usage\n",
"if __name__ == \"__main__\":\n",
"    # Specific keys can be listed here\n",
"    specific_keys = [\n",
"        \"A01_20250107_1600\"\n",
"        # \"camera001_20241231_1423\",\n",
"        # \"camera001_20241231_1232\"\n",
"    ]\n",
"\n",
"    # Fetch the listed keys\n",
"    fetch_from_redis_and_save(keys=specific_keys)\n",
"\n",
"    # Or fetch every camera001_ key\n",
"    # fetch_from_redis_and_save()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "kafka",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
+48
@@ -0,0 +1,48 @@
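# Data Directory Structure
## Overview
This directory holds the data files the system needs at runtime: recorded video, extracted frames, cropped person images, and face reference data. Each subdirectory has one specific purpose.
## Directory Layout
### recordings/
Video recordings: clips captured from the RTSP streams.
- One subdirectory per camera ID (e.g. A01, B02)
- Video file naming format: `{camera_id}_{date}_{time}.avi`
- Example: `A01_20240315_143000.avi`
### images/
Key frames: images extracted from the recorded videos.
- Mirrors the directory structure of recordings/
- Each image is named after its source video
- Example: `A01_20240315_143000.jpg`
### crop/
Person crops: body regions cut out of the original images by person detection.
- Mirrors the directory structure of images/
- File naming format: `{original_name}_{index}.jpg`
- Example: `A01_20240315_143000_0.jpg`
### data/
Face reference data used by face recognition.
- One subdirectory per person name or ID
- Each subdirectory contains that person's face images
- Used for face feature extraction and identity matching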
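The naming conventions above can be sketched in a few lines of Python. This is only an illustration: `derived_paths` and its parameters are hypothetical names, the `files` root is an assumption taken from the repository's `.gitignore`, and the per-camera subdirectory under `images/` and `crop/` follows from the statement that they mirror `recordings/`.

```python
from pathlib import Path


def derived_paths(root, camera_id, date, time, crop_index=0):
    """Build the recording, frame and crop paths for one video clip.

    Hypothetical helper: it only encodes the naming rules described above.
    """
    stem = f"{camera_id}_{date}_{time}"  # e.g. A01_20240315_143000
    root = Path(root)
    return {
        "recording": root / "recordings" / camera_id / f"{stem}.avi",
        "image": root / "images" / camera_id / f"{stem}.jpg",
        "crop": root / "crop" / camera_id / f"{stem}_{crop_index}.jpg",
    }


# "files" as the data root is an assumption, not something this README states.
paths = derived_paths("files", "A01", "20240315", "143000")
for kind, path in paths.items():
    print(kind, path)
```

Deriving all three paths from one stem keeps a recording, its key frame and its crops traceable to each other by name alone.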
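## Notes
1. All directories are created automatically; there is no need to create them by hand
2. Clean up expired data regularly so that storage does not fill up
3. Configure a backup strategy for the data
4. Make sure the directories have appropriate read/write permissions
## Data Retention Policy
- recordings/: keep for 30 days (recommended)
- images/: keep in sync with the videos
- crop/: can be cleaned up once analysis is complete
- data/: keep long term and update periodically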
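One possible way to apply the 30-day recommendation for `recordings/` is sketched below. `find_expired` is a hypothetical helper rather than part of the system. It judges age by file modification time, which is an assumption, and it skips the `.gitkeep` placeholders that keep empty directories under version control.

```python
import time
from pathlib import Path


def find_expired(directory, max_age_days=30, now=None):
    """Return files under `directory` last modified more than `max_age_days` ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    expired = []
    for path in Path(directory).rglob("*"):
        # Skip directories and the .gitkeep placeholders
        if path.is_file() and path.name != ".gitkeep" and path.stat().st_mtime < cutoff:
            expired.append(path)
    return sorted(expired)
```

The function only lists candidates, so the result can be reviewed first, and deletion stays an explicit step, for example `for path in find_expired("files/recordings"): path.unlink()`.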
## Storage Recommendations
- Use high-performance storage devices
- Monitor disk usage
- Apply a data compression strategy
- Schedule regular backups
+93
@@ -0,0 +1,93 @@
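# Front-End Interface Guide
## Overview
This directory contains the web front end of the intelligent video analysis system. It provides data visualization and interactive controls.
## File Structure
### web.html
The main web page. It contains:
- The complete HTML structure
- Inline CSS styles
- JavaScript interaction logic
### cls.js
JavaScript classes and functions for behavior analysis:
- Data processing logic
- Chart configuration
- Event handling
## Main Features
### 1. Camera Management
- Switching between multiple cameras
- Camera status display
- Real-time data updates
### 2. Data Display
- Behavior statistics cards
- Timeline visualization
- Chart analysis
  - Behavior distribution chart
  - Activity-by-time-period chart
  - Abnormal behavior statistics
### 3. Interaction
- Date selection
- Keyword search
- Data filtering
- Timeline zoom
- Auto refresh
### 4. Reports
- Daily analysis report
- Data export
- History lookup
## UI Components
### Sidebar
- System logo
- Camera list
- Navigation menu
### Main Content Area
- Toolbar
  - Search box
  - Date picker
  - Refresh button
  - Filter button
- Statistics cards
- Timeline panel
- Chart area
- Event list
## Styling and Theme
- Theme colors defined with CSS variables
- Responsive layout
- Modern UI design
- Material Icons
## Usage
### Development Environment
1. Make sure the API base URL is configured correctly
2. Use a modern browser for development and testing
3. An editor with HTML/CSS/JS support, such as VS Code, is recommended
### Deployment
1. Deploy all files in the frontend directory to a web server
2. Make sure web.html can reach the back-end API
3. Check that all static assets load correctly
### Notes
- Stay compatible with the back-end API version
- Clear the browser cache regularly
- Monitor front-end performance
- Keep an eye on the data refresh rate
## Development Tips
1. Debug with the browser developer tools
2. Follow the code commenting conventions
3. Test every feature regularly
4. Keep performance optimization in mind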
+45
@@ -0,0 +1,45 @@
#!/bin/bash
# Activate the Python virtual environment (if there is one)
# source venv/bin/activate
# Log directory
LOG_DIR="logs"
mkdir -p "$LOG_DIR"
echo "启动监控视频分析系统..."
# Make sure the required Python files exist.
# Paths use app/function/, matching the launch commands below and the stop script.
required_files=("app/function/rtsp2video.py" "app/function/video2image.py" "app/function/pose.py" "app/function/face.py" "app/function/qwen.py")
for file in "${required_files[@]}"; do
    if [ ! -f "$file" ]; then
        echo "错误: 找不到 $file"
        exit 1
    fi
done
# Start the video capture module
echo "启动视频采集模块..."
python app/function/rtsp2video.py > "$LOG_DIR/rtsp2video.log" 2>&1 &
RTSP_PID=$!
sleep 5  # give the capture module time to initialize
# Check that rtsp2video.py is still running
if ! ps -p $RTSP_PID > /dev/null; then
    echo "错误: 视频采集模块启动失败"
    exit 1
fi
# Start the analysis modules
echo "启动分析模块..."
python app/function/video2image.py > "$LOG_DIR/video2image.log" 2>&1 &
python app/function/pose.py > "$LOG_DIR/pose.log" 2>&1 &
python app/function/face.py > "$LOG_DIR/face.log" 2>&1 &
python app/function/qwen.py > "$LOG_DIR/qwen.log" 2>&1 &
# Wait for the other services to finish starting
sleep 10
echo "所有服务已启动!"
echo "查看日志文件夹 '$LOG_DIR' 以获取详细信息"
echo "使用 'ps aux | grep python' 查看运行状态"
+33
@@ -0,0 +1,33 @@
#!/bin/bash
echo "停止所有Python进程..."
# Processes to stop
processes=("app/function/rtsp2video.py"
           "app/function/video2image.py"
           "app/function/pose.py"
           "app/function/face.py"
           "app/function/qwen.py")
# Stop each process in turn
for process in "${processes[@]}"; do
    pid=$(pgrep -f "python $process")
    if [ -n "$pid" ]; then
        echo "停止 $process (PID: $pid)..."
        pkill -f "python $process"
        sleep 1
        if pgrep -f "python $process" > /dev/null; then
            echo "警告: $process 可能未正常停止"
        fi
    else
        echo "$process 未在运行"
    fi
done
# Check whether any Python script is still running
if pgrep -f "python.*\.py" > /dev/null; then
    echo "警告: 仍有Python进程在运行,请手动检查"
    ps aux | grep "python.*\.py" | grep -v grep
else
    echo "所有服务已成功停止!"
fi
-339
@@ -1,339 +0,0 @@
clapping
praying
dropping
burying
covering
flooding
leaping
drinking
slapping
cuddling
sleeping
preaching
raining
stitching
spraying
twisting
coaching
submerging
breaking
tuning
boarding
running
destroying
competing
giggling
shoveling
chasing
flicking
pouring
buttoning
hammering
carrying
surfing
pulling
squatting
aiming
crouching
tapping
skipping
washing
winking
queuing
locking
stopping
sneezing
flipping
sewing
clipping
working
rocking
asking
playing+fun
camping
plugging
pedaling
constructing
slipping
sweeping
screwing
shrugging
hitchhiking
cracking
scratching
trimming
selling
marching
stirring
kissing
jumping
starting
clinging
socializing
picking
splashing
licking
kicking
sliding
filming
driving
handwriting
steering
filling
crashing
stealing
pressing
shouting
hiking
vacuuming
pointing
giving
diving
hugging
building
swerving
dining
floating
cheerleading
leaning
sailing
singing
playing
hitting
bubbling
joining
bathing
raising
sitting
drawing
protesting
rinsing
coughing
smashing
slicing
balancing
rafting
kneeling
dunking
brushing
crushing
rubbing
punting
watering
playing+music
removing
tearing
imitating
teaching
cooking
reaching
studying
serving
bulldozing
shaking
discussing
dragging
gardening
performing
officiating
photographing
sowing
dripping
writing
clawing
bending
boxing
mopping
gripping
flowing
digging
tripping
cheering
buying
bicycling
feeding
emptying
unpacking
sketching
standing
weeding
stacking
drying
crying
spinning
frying
cutting
paying
eating
lecturing
dancing
adult+female+speaking
boiling
peeling
wrapping
wetting
attacking
welding
putting
swinging
carving
walking
dressing
inflating
climbing
shredding
reading
sanding
frowning
closing
hunting
clearing
launching
packaging
fishing
spilling
leaking
knitting
boating
sprinkling
baptizing
playing+sports
rolling
spitting
dipping
riding
chopping
extinguishing
applauding
calling
talking
adult+male+speaking
snowing
shaving
marrying
rising
laughing
crawling
flying
assembling
injecting
landing
operating
packing
descending
falling
entering
pushing
sawing
smelling
overflowing
fighting
waking
barbecuing
skating
painting
drilling
punching
tying
manicuring
plunging
grilling
pitching
towing
telephoning
crafting
knocking
playing+videogames
storming
placing
turning
barking
child+singing
opening
waxing
juggling
mowing
shooting
sniffing
interviewing
stomping
chewing
arresting
grooming
rowing
bowing
gambling
saluting
fueling
autographing
throwing
drenching
waving
signing
repairing
baking
smoking
skiing
drumming
child+speaking
blowing
cleaning
combing
spreading
racing
combusting
adult+female+singing
fencing
swimming
adult+male+singing
snuggling
shopping
bouncing
dusting
stroking
snapping
biting
roaring
guarding
unloading
lifting
instructing
folding
measuring
whistling
exiting
stretching
taping
squinting
catching
draining
massaging
scrubbing
handcuffing
celebrating
jogging
colliding
bowling
resting
blocking
smiling
tattooing
erupting
howling
parading
grinning
sprinting
hanging
planting
speaking
ascending
yawning
cramming
burning
wrestling
poking
tickling
exercising
loading
piloting
typing
-365
@@ -1,365 +0,0 @@
airfield
airplane_cabin
airport_terminal
alcove
alley
amphitheater
amusement_arcade
amusement_park
apartment_building/outdoor
aquarium
aqueduct
arcade
arch
archaelogical_excavation
archive
arena/hockey
arena/performance
arena/rodeo
army_base
art_gallery
art_school
art_studio
artists_loft
assembly_line
athletic_field/outdoor
atrium/public
attic
auditorium
auto_factory
auto_showroom
badlands
bakery/shop
balcony/exterior
balcony/interior
ball_pit
ballroom
bamboo_forest
bank_vault
banquet_hall
bar
barn
barndoor
baseball_field
basement
basketball_court/indoor
bathroom
bazaar/indoor
bazaar/outdoor
beach
beach_house
beauty_salon
bedchamber
bedroom
beer_garden
beer_hall
berth
biology_laboratory
boardwalk
boat_deck
boathouse
bookstore
booth/indoor
botanical_garden
bow_window/indoor
bowling_alley
boxing_ring
bridge
building_facade
bullring
burial_chamber
bus_interior
bus_station/indoor
butchers_shop
butte
cabin/outdoor
cafeteria
campsite
campus
canal/natural
canal/urban
candy_store
canyon
car_interior
carrousel
castle
catacomb
cemetery
chalet
chemistry_lab
childs_room
church/indoor
church/outdoor
classroom
clean_room
cliff
closet
clothing_store
coast
cockpit
coffee_shop
computer_room
conference_center
conference_room
construction_site
corn_field
corral
corridor
cottage
courthouse
courtyard
creek
crevasse
crosswalk
dam
delicatessen
department_store
desert/sand
desert/vegetation
desert_road
diner/outdoor
dining_hall
dining_room
discotheque
doorway/outdoor
dorm_room
downtown
dressing_room
driveway
drugstore
elevator/door
elevator_lobby
elevator_shaft
embassy
engine_room
entrance_hall
escalator/indoor
excavation
fabric_store
farm
fastfood_restaurant
field/cultivated
field/wild
field_road
fire_escape
fire_station
fishpond
flea_market/indoor
florist_shop/indoor
food_court
football_field
forest/broadleaf
forest_path
forest_road
formal_garden
fountain
galley
garage/indoor
garage/outdoor
gas_station
gazebo/exterior
general_store/indoor
general_store/outdoor
gift_shop
glacier
golf_course
greenhouse/indoor
greenhouse/outdoor
grotto
gymnasium/indoor
hangar/indoor
hangar/outdoor
harbor
hardware_store
hayfield
heliport
highway
home_office
home_theater
hospital
hospital_room
hot_spring
hotel/outdoor
hotel_room
house
hunting_lodge/outdoor
ice_cream_parlor
ice_floe
ice_shelf
ice_skating_rink/indoor
ice_skating_rink/outdoor
iceberg
igloo
industrial_area
inn/outdoor
islet
jacuzzi/indoor
jail_cell
japanese_garden
jewelry_shop
junkyard
kasbah
kennel/outdoor
kindergarden_classroom
kitchen
lagoon
lake/natural
landfill
landing_deck
laundromat
lawn
lecture_room
legislative_chamber
library/indoor
library/outdoor
lighthouse
living_room
loading_dock
lobby
lock_chamber
locker_room
mansion
manufactured_home
market/indoor
market/outdoor
marsh
martial_arts_gym
mausoleum
medina
mezzanine
moat/water
mosque/outdoor
motel
mountain
mountain_path
mountain_snowy
movie_theater/indoor
museum/indoor
museum/outdoor
music_studio
natural_history_museum
nursery
nursing_home
oast_house
ocean
office
office_building
office_cubicles
oilrig
operating_room
orchard
orchestra_pit
pagoda
palace
pantry
park
parking_garage/indoor
parking_garage/outdoor
parking_lot
pasture
patio
pavilion
pet_shop
pharmacy
phone_booth
physics_laboratory
picnic_area
pier
pizzeria
playground
playroom
plaza
pond
porch
promenade
pub/indoor
racecourse
raceway
raft
railroad_track
rainforest
reception
recreation_room
repair_shop
residential_neighborhood
restaurant
restaurant_kitchen
restaurant_patio
rice_paddy
river
rock_arch
roof_garden
rope_bridge
ruin
runway
sandbox
sauna
schoolhouse
science_museum
server_room
shed
shoe_shop
shopfront
shopping_mall/indoor
shower
ski_resort
ski_slope
sky
skyscraper
slum
snowfield
soccer_field
stable
stadium/baseball
stadium/football
stadium/soccer
stage/indoor
stage/outdoor
staircase
storage_room
street
subway_station/platform
supermarket
sushi_bar
swamp
swimming_hole
swimming_pool/indoor
swimming_pool/outdoor
synagogue/outdoor
television_room
television_studio
temple/asia
throne_room
ticket_booth
topiary_garden
tower
toyshop
train_interior
train_station/platform
tree_farm
tree_house
trench
tundra
underwater/ocean_deep
utility_room
valley
vegetable_garden
veterinarians_office
viaduct
village
vineyard
volcano
volleyball_court/outdoor
waiting_room
water_park
water_tower
waterfall
watering_hole
wave
wet_bar
wheat_field
wind_farm
windmill
yard
youth_hostel
zen_garden
-343
@@ -1,343 +0,0 @@
飞机场
飞机舱
机场航站楼
壁龛
小巷
圆形剧场
游戏厅
游乐园
公寓楼
水族馆
渡槽
拱廊
拱门
考古发掘现场
档案馆
冰球场
表演场地
竞技场
军事基地
美术馆
艺术学校
艺术工作室
艺术家阁楼
装配线
运动场
中庭
阁楼
礼堂
汽车制造厂
汽车展厅
荒地
面包店
阳台
球池
舞厅
竹林
银行金库
宴会厅
酒吧
谷仓
谷仓门
棒球场
地下室
篮球场
浴室
集市
海滩
海滨别墅
美容院
寝室
卧室
啤酒花园
啤酒大厅
泊位
生物实验室
木板路
船甲板
船库
书店
展位
植物园
保龄球馆
拳击台
建筑外立面
斗牛场
墓室
公交车内部
公交站
肉店
孤峰
小屋
自助餐厅
露营地
校园
运河
糖果店
峡谷
汽车内部
旋转木马
城堡
地下墓穴
墓地
瑞士木屋
化学实验室
儿童房
教堂
教室
无尘室
悬崖
壁橱
服装店
海岸
驾驶舱
咖啡店
计算机房
会议中心
会议室
建筑工地
玉米地
畜栏
走廊
农舍
法院
庭院
小溪
裂缝
人行横道
水坝
熟食店
百货商店
沙漠
沙漠公路
餐馆
餐厅
饭厅
迪斯科舞厅
门口
宿舍
市中心
更衣室
车道
药店
电梯
电梯大厅
电梯井
大使馆
机房
入口大厅
自动扶梯
挖掘现场
布料店
农场
快餐店
农田
田野
田间小路
防火梯
消防站
鱼塘
花店
美食广场
足球场
阔叶林
森林小径
林间小路
正式花园
喷泉
厨房
车库
加油站
凉亭/外部
杂货店
礼品店
冰川
高尔夫球场
洞穴
体育馆
机库
港口
五金店
干草地
直升机场
高速公路
家庭办公室
家庭影院
医院
病房
温泉
酒店
酒店房间
房屋
狩猎小屋/室外
冰淇淋店
浮冰
冰架
溜冰场
冰山
冰屋
工业区
旅馆/室外
小岛
牢房
日本花园
珠宝店
废品场
古堡
狗舍
幼儿园教室
厨房
泻湖
天然湖泊
垃圾场
停机坪
自助洗衣店
草坪
讲堂
议会厅
图书馆
灯塔
客厅
装卸码头
大堂
闸室
更衣室
豪宅
预制房屋
市场
沼泽
武术馆
陵墓
麦地那
夹层
护城河/水
清真寺/室外
汽车旅馆
山路
雪山
电影院
博物馆
博物馆/室外
音乐工作室
自然历史博物馆
托儿所
疗养院
啤酒干燥房
海洋
办公室
办公楼
办公隔间
石油钻井平台
手术室
果园
乐池
宝塔
宫殿
食品储藏室
公园
停车场
牧场
露台
亭子
宠物店
药店
电话亭
物理实验室
野餐区
码头
比萨店
操场
游戏室
广场
池塘
门廊
林荫道
酒吧/室内
赛马场
赛车道
木筏
铁轨
热带雨林
接待处
娱乐室
修理店
住宅区
餐厅
餐厅厨房
餐厅露台
稻田
河流
岩石拱门
屋顶花园
索桥
废墟
跑道
沙地
桑拿房
学校
科学博物馆
服务器机房
棚屋
鞋店
店面
购物中心
淋浴间
滑雪场
滑雪坡
天空
摩天大楼
贫民窟
雪地
足球场
马厩
棒球场
橄榄球场
足球场
舞台
楼梯
储藏室
街道
地铁站/站台
超市
寿司店
沼泽地
游泳池
犹太教堂
电视房
电视演播室
亚洲寺庙
王座室
售票处
园艺造型花园
塔楼
玩具店
火车内部
火车站
树木农场
树屋
战壕
苔原
深海
设备间
山谷
菜园
兽医诊所
高架桥
乡村
葡萄园
火山
排球场
水上乐园
水塔
瀑布
水坑
海浪
酒吧
麦田
风力发电场
风车
庭院
青年旅舍
禅园
-6
View File
@@ -1,6 +0,0 @@
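# Test code notes
## Some of these files are test code and can be ignored
1. The test code uses data from the `dataset` directory
## The rest are code backups and can also be ignored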
-600
View File
@@ -1,600 +0,0 @@
import os
import json
import torch
from datetime import datetime, timedelta
from PIL import Image
import io
import re
import base64
import requests
from decord import VideoReader
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info
# Configuration
QWEN_MODEL_PATH = "/obscura/models/qwen/Qwen2-VL-2B-Instruct"
OLLAMA_URL = "http://127.0.0.1:11434/api/generate"
# Initialize the Qwen model (on CUDA:1)
print("正在初始化 Qwen 模型 (CUDA:1)...")
model = Qwen2VLForConditionalGeneration.from_pretrained(
QWEN_MODEL_PATH,
torch_dtype="auto",
device_map="cuda:1"
)
min_pixels = 128*28*28
max_pixels = 512*28*28
processor = AutoProcessor.from_pretrained(
QWEN_MODEL_PATH,
min_pixels=min_pixels,
max_pixels=max_pixels
)
class MediaAnalysisSystem:
def __init__(self):
self.MAX_NUM_FRAMES = 16
self.device = "cuda:1"
self.qwen_model = model
self.qwen_processor = processor
def encode_video(self, video_data):
def uniform_sample(l, n):
gap = len(l) / n
return [l[int(i * gap + gap / 2)] for i in range(n)]
video_file = io.BytesIO(video_data)
vr = VideoReader(video_file)
sample_fps = max(1, round(vr.get_avg_fps()))  # frame step for about one frame per second; never 0
frame_idx = list(range(0, len(vr), sample_fps))
if len(frame_idx) > self.MAX_NUM_FRAMES:
frame_idx = uniform_sample(frame_idx, self.MAX_NUM_FRAMES)
frames = vr.get_batch(frame_idx).asnumpy()
frames = [Image.fromarray(v.astype('uint8')) for v in frames]
print('num frames:', len(frames))
return frames
def process_with_qwen(self, media_data, object_name, media_type='image'):
"""Process media with the Qwen model."""
if media_type == 'video':
frames = self.encode_video(media_data)
media_content = {"type": "video", "video": frames, "fps": 1.0}
else:
image = Image.open(io.BytesIO(media_data))
media_content = {"type": "image", "image": image}
messages = [
{
"role": "user",
"content": [
media_content,
{"type": "text", "text": self._get_analysis_prompt(media_type)}
],
}
]
text = self.qwen_processor.apply_chat_template(
messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = self.qwen_processor(
text=[text],
images=image_inputs,
videos=video_inputs,
padding=True,
return_tensors="pt",
)
inputs = inputs.to(self.device)
generated_ids = self.qwen_model.generate(**inputs, max_new_tokens=2048)
generated_ids_trimmed = [
out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
answer = self.qwen_processor.batch_decode(
generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)[0]
return {
"model": "qwen",
"original_answer": answer,
"extracted_info": self.extract_info(answer, is_english=False)  # the Qwen prompt is Chinese, so match the Chinese keyword lists
}
def process_with_minicpm(self, media_data, object_name, media_type='image'):
"""Process media with the MiniCPM model (CUDA:0)."""
if media_type == 'video':
frames = self.encode_video(media_data)
encoded_frames = [self.image_to_base64(frame) for frame in frames]
else:
image = Image.open(io.BytesIO(media_data))
encoded_frames = [self.image_to_base64(image)]
payload = {
"model": "minicpm-v",
"prompt": self._get_analysis_prompt(media_type),
"images": encoded_frames,
"cuda_device": 0
}
response = requests.post(OLLAMA_URL, json=payload, stream=True)
answer = self.process_stream_response(response)
return {
"model": "minicpm",
"original_answer": answer,
"extracted_info": self.extract_info(answer, is_english=False)  # the MiniCPM prompt is Chinese, so match the Chinese keyword lists
}
def process_with_llama(self, media_data, object_name, media_type='image'):
"""Process media with the Llama model (CUDA:0); for videos, analyze 3-5 frames."""
try:
if media_type == 'video':
frames = self.encode_video(media_data)
num_frames = min(max(3, len(frames)), 5)
if len(frames) > num_frames:
gap = len(frames) / num_frames
frame_indices = [int(i * gap + gap / 2) for i in range(num_frames)]
selected_frames = [frames[i] for i in frame_indices]
else:
selected_frames = frames[:num_frames]
print(f"Llama 将处理 {len(selected_frames)} 帧图像")
frame_results = []
for i, frame in enumerate(selected_frames, 1):
print(f"正在处理第 {i}/{len(selected_frames)} 帧...")
try:
encoded_frame = self.image_to_base64(frame)
payload = {
"model": "llama3.2-vision",
"prompt": self._get_analysis_prompt('image', model_type='llama'),
"images": [encoded_frame],
"cuda_device": 0
}
response = requests.post(OLLAMA_URL, json=payload, stream=True)
answer = self.process_stream_response(response)
if not answer:
print(f"警告:第 {i} 帧未获得有效响应")
answer = "No valid response from model"
extracted_info = self.extract_info(answer, is_english=True)
frame_results.append({
"frame_index": i,
"original_answer": answer,
"extracted_info": extracted_info
})
except Exception as e:
print(f"处理第 {i} 帧时出错: {str(e)}")
frame_results.append({
"frame_index": i,
"error": str(e),
"extracted_info": {
"environment": None,
"num_people": None,
"actions": [],
"objects": [],
"furniture": [],
"emotions": [],
"feature": []
}
})
# Free cached GPU memory after each frame
if torch.cuda.is_available():
torch.cuda.empty_cache()
# Merge the per-frame results
merged_result = self.merge_frame_results(frame_results)
return {
"frame_results": frame_results,
"merged_result": merged_result,
"num_processed_frames": len(selected_frames)
}
else: # 图像处理
image = Image.open(io.BytesIO(media_data))
encoded_frame = self.image_to_base64(image)
payload = {
"model": "llama3.2-vision",
"prompt": self._get_analysis_prompt(media_type, model_type='llama'),
"images": [encoded_frame],
"cuda_device": 0
}
response = requests.post(OLLAMA_URL, json=payload, stream=True)
answer = self.process_stream_response(response)
if not answer:
answer = "No valid response from model"
return {
"original_answer": answer,
"extracted_info": self.extract_info(answer)
}
except Exception as e:
print(f"Llama 处理失败: {str(e)}")
return {
"error": str(e),
"frame_results": [],
"merged_result": {
"environment": None,
"num_people": None,
"actions": [],
"objects": [],
"furniture": [],
"emotions": [],
"feature": []
},
"num_processed_frames": 0
}
def process_stream_response(self, response):
"""Consume a streamed response and return the complete answer."""
full_response = ""
try:
for line in response.iter_lines():
if line:
try:
json_response = json.loads(line)
if 'response' in json_response:
full_response += json_response['response']
except json.JSONDecodeError:
continue
except Exception as e:
print(f"处理响应流时出错: {str(e)}")
return full_response.strip()
@staticmethod
def image_to_base64(image):
buffered = io.BytesIO()
image.save(buffered, format="PNG")
return base64.b64encode(buffered.getvalue()).decode()
def _get_analysis_prompt(self, media_type, model_type='qwen'):
"""Return the analysis prompt; Llama gets an English version."""
if model_type == 'llama':
return f"""Please analyze this {('surveillance video' if media_type == 'video' else 'surveillance image')} in detail, including the following aspects:
1. Exact count of people in the scene
2. Individual behavior analysis of each person
3. Facial expression recognition and emotional state assessment
4. Detailed description of the overall scene and environment
5. Interactions between people
6. Environmental conditions
7. Items and furniture in the environment
8. Any suspicious or abnormal activities
9. Specific characteristics of people (estimated age range, gender, clothing)
10. {'Movement patterns and directions' if media_type == 'video' else 'Positions and postures'} of people
11. Items or objects being carried
12. Group dynamics and gathering situations
13. Timestamp information (if visible)
Please describe in a clear, organized format and highlight important findings."""
else:
# Other models keep using the Chinese prompt
return f"""请对这{'段监控视频' if media_type == 'video' else '张监控图像'}进行详细分析,包括以下方面:
1. 场景中人数的精确统计
2. 每个人的个人行为分析
3. 面部表情识别和情绪状态评估
4. 整体场景和环境的详细描述
5. 人与人之间的互动情况
6. 详细的环境条件描述
7. 环境中出现的物品和家具
8. 任何可疑或异常活动
9. 人员的具体特征(估计年龄范围、性别、着装)
10. 人员的{'移动模式和方向' if media_type == 'video' else '位置和姿态'}
11. 携带的物品或物体
12. 群体动态和聚集情况
13. {'视频' if media_type == 'video' else '图像'}中的时间戳信息(如果有)
请用清晰、有条理的格式描述,并突出重要发现。"""
def extract_info(self, answer, is_english=True):
"""Extract structured fields by keyword matching; supports English and Chinese answers."""
info = {
"environment": None,
"num_people": None,
"actions": [],
"objects": [],
"furniture": [],
"emotions": [],
"feature": []
}
if is_english:
# 英文环境关键词
environments = ["office", "indoor", "outdoor", "meeting room", "room", "classroom",
"living room", "bedroom", "kitchen", "bathroom", "hallway", "corridor"]
# 英文数字模式
people_patterns = [
r'(\d+)\s*(person|people|individual|man|woman|men|women|student|worker|employee)',
r'(one|two|three|four|five|six|seven|eight|nine|ten)\s*(person|people|individual)',
r'(single|few|several|multiple)\s*(person|people|individual)',
r'(male|female)\s*(person|individual)',
r'(adult|child|teenager|elderly)',
r'(worker|student|customer|visitor|passenger)',
r'(crowd|group|audience)',
r'(man|woman|boy|girl)'
]
# 英文动作词
actions = ["sleeping", "sitting", "eating", "standing", "falling", "dancing", "squatting",
"turning", "jumping", "lying", "talking", "walking", "running", "reading",
"writing", "studying", "using phone", "dining", "moving", "working",
"using computer", "drinking", "organizing", "cleaning"]
# 英文情绪词
emotions = ["happy", "angry", "sad", "surprised", "scared", "disgusted", "calm",
"relaxed", "neutral", "focused", "thoughtful", "excited", "tired", "serious"]
# 英文物品词
objects = ["water bottle", "office supplies", "document", "computer", "fan", "mouse",
"keyboard", "tissue", "book", "pen", "bag", "box", "cup", "mug", "glass",
"folder", "backpack", "phone", "laptop", "notebook", "paper"]
# 英文家具词
furniture = ["chair", "table", "coffee table", "file cabinet", "bed", "sofa", "cabinet",
"shelf", "camera", "cushion", "office chair", "TV", "whiteboard", "monitor",
"storage rack", "desk"]
# 英文特征词
features = ["wearing glasses", "no glasses", "long hair", "short hair", "wearing hat",
"no hat", "wearing mask", "no mask", "male", "female", "overweight", "slim",
"tall", "short", "adult", "child", "elderly", "young", "middle-aged"]
# 英文数字转换
num_word_to_digit = {
'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5,
'six': 6, 'seven': 7, 'eight': 8, 'nine': 9, 'ten': 10
}
else:
# 中文环境关键词
environments = ["办公室", "室内", "室外", "会议室", "房间", "教室",
"客厅", "卧室", "厨房", "浴室", "走廊", "过道"]
# 中文数字模式
people_patterns = [
r'(\d+)\s*(人|个人|位|名|员工|用户|小朋友|成年人|女性|男性)',
r'(一|二|三|四|五|六|七|八|九|十)\s*(人|个人|位|名|员工|用户|小朋友|成年人|女性|男性)',
r'(一个|几个)\s*(人|个人|员工|用户|小朋友|成年人|女性|男性)',
r'\s*(名|位)\s*(人|员工|用户|小朋友|成年人|女性|男性)?',
r'(男|女)(性|生|士)',
r'(成年|未成年|青少年|老年)\s*(人|群体)',
r'(员工|职工|工人|学生|顾客|观众|游客|乘客)',
r'(群众|民众|大众|公众)',
r'(男女|老少|老幼|大人|小孩)'
]
# 中文动作词
actions = ["睡眠", "坐", "吃", "站", "摔倒", "跳舞", "蹲", "转身", "跳跃", "躺",
"说话", "走路", "跑步", "阅读", "写字", "学习", "玩手机", "吃饭", "移动",
"工作", "使用电脑", "喝水", "整理", "打扫"]
# 中文情绪词
emotions = ["高兴", "愤怒", "悲伤", "惊讶", "恐惧", "厌恶", "平静", "放松",
"中性", "专注", "思考", "兴奋", "疲惫", "严肃"]
# 中文物品词
objects = ["水瓶", "办公用品", "文件", "电脑", "风扇", "鼠标", "键盘", "纸巾",
"书", "笔", "袋子", "盒子", "水杯", "杯子", "玻璃杯", "文件夹",
"书包", "手机", "笔记本电脑", "笔记本", "纸张"]
# 中文家具词
furniture = ["椅子", "桌子", "茶几", "文件柜", "床", "沙发", "柜子", "架子",
"摄像头", "靠垫", "办公椅", "电视", "白板", "显示器", "置物架", "办公桌"]
# 中文特征词
features = ["戴眼镜", "不戴眼镜", "长发", "短发", "戴帽子", "不戴帽子",
"戴口罩", "不戴口罩", "男性", "女性", "胖", "瘦", "高", "矮",
"成年人", "小孩", "老年人", "年轻人", "中年人"]
# 中文数字转换
num_word_to_digit = {
'一': 1, '二': 2, '三': 3, '四': 4, '五': 5,
'六': 6, '七': 7, '八': 8, '九': 9, '十': 10
}
# 提取环境信息
for env in environments:
if env.lower() in answer.lower():
info["environment"] = env
break
# 提取人数
for pattern in people_patterns:
match = re.search(pattern, answer.lower())
if match:
if match.group(1).isdigit():
info["num_people"] = int(match.group(1))
elif match.group(1) in num_word_to_digit:
info["num_people"] = num_word_to_digit[match.group(1)]
break
# 提取列表类信息
answer_lower = answer.lower()
for action in actions:
if action.lower() in answer_lower:
info["actions"].append(action)
for object_item in objects:
if object_item.lower() in answer_lower:
info["objects"].append(object_item)
for furniture_item in furniture:
if furniture_item.lower() in answer_lower:
info["furniture"].append(furniture_item)
for emotion in emotions:
if emotion.lower() in answer_lower:
info["emotions"].append(emotion)
for feature in features:
if feature.lower() in answer_lower:
info["feature"].append(feature)
return info
def merge_frame_results(self, frame_results):
"""Merge the analysis results of multiple frames."""
merged = {
"environment": None,
"num_people": None,
"actions": set(),
"objects": set(),
"furniture": set(),
"emotions": set(),
"feature": set()
}
# 环境取最常见的
environments = [r["extracted_info"]["environment"] for r in frame_results if r["extracted_info"]["environment"]]
if environments:
from collections import Counter
merged["environment"] = Counter(environments).most_common(1)[0][0]
# 人数取最大值
people_counts = [r["extracted_info"]["num_people"] for r in frame_results if r["extracted_info"]["num_people"] is not None]
if people_counts:
merged["num_people"] = max(people_counts)
# 合并列表类型的字段
list_fields = ["actions", "objects", "furniture", "emotions", "feature"]
for field in list_fields:
for result in frame_results:
merged[field].update(result["extracted_info"][field])
# 将集合转换回列表
for field in list_fields:
merged[field] = list(merged[field])
return merged
def process_video_folder(system, folder_path, output_path=None):
"""Process every video file in a folder and save the results."""
valid_extensions = {'.mp4', '.avi', '.mov', '.mkv'}
results = {}
if not os.path.exists(folder_path):
raise MediaAnalysisError(f"错误:文件夹 '{folder_path}' 不存在")
if output_path is None:
output_path = os.getcwd()
elif not os.path.exists(output_path):
os.makedirs(output_path)
video_files = [
f for f in os.listdir(folder_path)
if os.path.splitext(f)[1].lower() in valid_extensions
]
if not video_files:
raise MediaAnalysisError(f"错误:在文件夹 '{folder_path}' 中未找到支持的视频文件")
print(f"\n找到 {len(video_files)} 个视频文件,开始处理...\n")
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
folder_name = os.path.basename(os.path.normpath(folder_path))
output_file = os.path.join(output_path, f"analysis_results_{folder_name}_{timestamp}.json")
for i, video_file in enumerate(video_files, 1):
video_path = os.path.join(folder_path, video_file)
print(f"正在处理 ({i}/{len(video_files)}): {video_file}")
try:
with open(video_path, "rb") as f:
video_data = f.read()
results[video_file] = {"video_analysis": {}, "image_analysis": {}}
# 1. 使用 Qwen 处理视频
print(f"使用 Qwen 处理视频: {video_file}")
qwen_result = system.process_with_qwen(video_data, video_file, media_type='video')
results[video_file]["video_analysis"]["qwen"] = {
"original_answer": qwen_result["original_answer"],
"extracted_info": qwen_result["extracted_info"]
}
# 2. 使用 MiniCPM 处理视频
print(f"使用 MiniCPM 处理视频: {video_file}")
minicpm_result = system.process_with_minicpm(video_data, video_file, media_type='video')
results[video_file]["video_analysis"]["minicpm"] = {
"original_answer": minicpm_result["original_answer"],
"extracted_info": minicpm_result["extracted_info"]
}
# 3. 从视频中提取帧,使用 Llama 处理
frames = system.encode_video(video_data)
if frames:
print(f"使用 Llama 处理视频帧: {video_file}")
llama_result = system.process_with_llama(video_data, video_file, media_type='video')
results[video_file]["image_analysis"]["llama"] = {
"frame_results": llama_result["frame_results"],
"merged_result": llama_result["merged_result"],
"num_processed_frames": llama_result["num_processed_frames"]
}
else:
results[video_file]["image_analysis"]["error"] = "无法提取视频帧"
# 添加视频帧数信息
results[video_file]["video_analysis"]["num_frames"] = len(frames) if frames else 0
# 添加时间戳
results[video_file]["timestamp"] = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
# 实时保存当前结果
with open(output_file, 'w', encoding='utf-8') as f:
json.dump(results, f, ensure_ascii=False, indent=2)
print(f"✓ 成功处理并保存: {video_file}")
# 每个视频处理完后清理内存
if torch.cuda.is_available():
torch.cuda.empty_cache()
import gc
gc.collect()
except Exception as e:
print(f"✗ 处理失败 {video_file}: {str(e)}")
results[video_file] = {"error": str(e)}
with open(output_file, 'w', encoding='utf-8') as f:
json.dump(results, f, ensure_ascii=False, indent=2)
print(f"\n所有分析结果已保存到: {output_file}")
return results
class MediaAnalysisError(Exception):
"""Custom exception for media analysis errors."""
pass
def main():
try:
system = MediaAnalysisSystem()
# 添加文件夹路径输入处理
folder_path = input("请输入视频文件夹路径: ").strip()
output_path = input("请输入结果保存路径 (直接回车使用当前目录): ").strip()
# 如果用户没有输入输出路径,则使用None(将使用当前目录)
output_path = output_path if output_path else None
# 处理文件夹中的视频
results = process_video_folder(system, folder_path, output_path)
# 显示处理统计
success_count = sum(1 for r in results.values() if "error" not in r)
print(f"\n处理完成!成功: {success_count}/{len(results)}")
except MediaAnalysisError as e:
print(f"\n错误: {str(e)}")
except Exception as e:
print(f"\n未预期的错误: {str(e)}")
if __name__ == "__main__":
main()
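The `uniform_sample` helper inside `encode_video` above keeps long clips within `MAX_NUM_FRAMES` by taking the midpoint of each of `n` equal-width buckets. A minimal standalone sketch of that index selection follows. `pick_frame_indices` and its parameters are hypothetical names introduced for illustration, and the `max(1, ...)` guard is an added assumption rather than part of the script.

```python
def uniform_sample(items, n):
    """Return n items taken from the midpoints of n equal-width buckets."""
    gap = len(items) / n
    return [items[int(i * gap + gap / 2)] for i in range(n)]


def pick_frame_indices(total_frames, avg_fps, max_frames):
    """Pick roughly one frame per second, capped at max_frames (hypothetical helper)."""
    step = max(1, round(avg_fps))  # assumption: guard against a zero step
    candidates = list(range(0, total_frames, step))
    if len(candidates) > max_frames:
        candidates = uniform_sample(candidates, max_frames)
    return candidates


if __name__ == "__main__":
    # A 40-second clip at 25 fps yields 40 candidates, thinned to 16.
    print(pick_frame_indices(total_frames=1000, avg_fps=25.0, max_frames=16))
```

Sampling bucket midpoints rather than the first `n` candidates spreads the selected frames across the whole clip, which matters when only a handful of frames are sent to the model.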
-328
View File
@@ -1,328 +0,0 @@
import io
import os
import json
import base64
import requests
import re
from PIL import Image
from datetime import datetime, timedelta
from decord import VideoReader, cpu
SILICONFLOW_URL = "https://api.siliconflow.cn/v1/chat/completions"
API_KEY = os.environ.get("SILICONFLOW_API_KEY", "")  # assumption: the key is supplied through this environment variable instead of being hard-coded
class MediaAnalysisSystem:
def __init__(self):
self.MAX_NUM_FRAMES = 5  # upper bound on frames sent per request
self.MIN_NUM_FRAMES = 3  # lower bound used when choosing the frame count
def encode_video(self, video_data):
def uniform_sample(l, n):
gap = len(l) / n
return [l[int(i * gap + gap / 2)] for i in range(n)]
video_file = io.BytesIO(video_data)
vr = VideoReader(video_file, ctx=cpu(0))
sample_fps = max(1, round(vr.get_avg_fps()))  # frame step for about one frame per second; never 0
frame_idx = list(range(0, len(vr), sample_fps))
# Cap the frame count at MAX_NUM_FRAMES; clips with fewer sampled frames keep them all
num_frames = min(max(self.MIN_NUM_FRAMES, len(frame_idx)), self.MAX_NUM_FRAMES)
if len(frame_idx) > num_frames:
frame_idx = uniform_sample(frame_idx, num_frames)
frames = vr.get_batch(frame_idx).asnumpy()
frames = [Image.fromarray(v.astype('uint8')) for v in frames]
# 压缩图片尺寸和质量
compressed_frames = []
for frame in frames:
# 保持宽高比的情况下调整大小
frame.thumbnail((600, 600), Image.Resampling.LANCZOS)
buffered = io.BytesIO()
frame.save(buffered, format="JPEG", quality=85)
compressed_frames.append(Image.open(buffered))
print(f'处理后的帧数: {len(compressed_frames)}')
return compressed_frames
def process_video(self, video_data, object_name):
if not video_data:
raise ValueError(f"Empty video data for {object_name}")
print(f"Processing video: {object_name}, data size: {len(video_data)} bytes")
frames = self.encode_video(video_data)
# 构建单个请求的消息内容
messages = [{
"role": "user",
"content": [
{
"type": "text",
"text": """请将这些图片作为一个时间序列进行详细分析,包括以下方面:
1. 场景中人数的精确统计
2. 每个人的个人行为分析
3. 面部表情识别和情绪状态评估
4. 整体场景和环境的详细描述
5. 人与人之间的互动情况
6. 详细的环境条件描述
7. 环境中出现的物品和家具
8. 任何可疑或异常活动
9. 人员的具体特征(估计年龄范围、性别、着装)
10. 人员的移动模式和方向
11. 携带的物品或物体
12. 群体动态和聚集情况
13. 视频中的时间戳分析(如果有)"""
}
]
}]
# 一次性添加所有图片到消息内容
for frame in frames:
base64_image = self.image_to_base64(frame)
messages[0]["content"].append({
"type": "image_url",
"image_url": {
"url": f"data:image/jpeg;base64,{base64_image}",
"detail": "auto"
}
})
try:
response = self._make_api_request(messages)
answer = response["choices"][0]["message"]["content"]
extracted_info = self.extract_info(answer)
return {
"original_answer": answer,
"extracted_info": extracted_info,
"num_frames": len(frames),
}
except Exception as e:
print(f"API请求失败: {str(e)}")
raise
def _make_api_request(self, messages):
payload = {
"model": "deepseek-ai/deepseek-vl2",
"messages": messages,
"stream": False,
"max_tokens": 1024,
"temperature": 0.7,
"top_p": 0.7,
"top_k": 50,
"frequency_penalty": 0.5,
"n": 1,
"response_format": {"type": "text"}
}
headers = {
"Authorization": f"Bearer {API_KEY}",
"Content-Type": "application/json"
}
response = requests.post(
SILICONFLOW_URL,
json=payload,
headers=headers,
timeout=60 # 增加超时时间到60秒
)
if response.status_code != 200:
raise Exception(f"Siliconflow API 错误: {response.status_code}")
return response.json()
@staticmethod
def image_to_base64(image):
buffered = io.BytesIO()
image.save(buffered, format="JPEG")  # match the image/jpeg data URL built in process_video
return base64.b64encode(buffered.getvalue()).decode()
@staticmethod
def extract_time_from_filename(object_name):
filename = os.path.basename(object_name)
time_str = filename.split('_')[0] + '_' + filename.split('_')[1].split('.')[0]
try:
start_time = datetime.strptime(time_str, "%Y%m%d_%H%M%S")
end_time = start_time + timedelta(seconds=10)
return start_time, end_time
except ValueError:
print(f"无法从文件名 '{filename}' 解析时间。使用默认时间。")
return datetime.now(), datetime.now() + timedelta(seconds=10)
@staticmethod
def extract_info(answer):
info = {
"environment": None,
"num_people": None,
"actions": [],
"objects": [],
"furniture": [],
"emotions": [],
"features": []
}
environments = ["办公室", "室内", "室外", "会议室"]
for env in environments:
if env in answer.lower():
info["environment"] = env
break
people_patterns = [
r'(\d+)\s*(人|个人|位|名|员工|用户|小朋友|成年人|女性|男性)',
r'(一|二|三|四|五|六|七|八|九|十)\s*(人|个人|位|名|员工|用户|小朋友|成年人|女性|男性)',
r'(一个|几个)\s*(人|个人|员工|用户|小朋友|成年人|女性|男性)',
r'\s*(名|位)\s*(人|员工|用户|小朋友|成年人|女性|男性)?',
r'(男|女)(性|生|士)',
r'(成年|未成年|青少年|老年)\s*(人|群体)',
r'(员工|职工|工人|学生|顾客|观众|游客|乘客)',
r'(群众|民众|大众|公众)',
r'(男女|老少|老幼|大人|小孩)'
]
for pattern in people_patterns:
match = re.search(pattern, answer)
if match:
if match.group(1).isdigit():
info["num_people"] = int(match.group(1))
elif match.group(1) in ['一个', '一']:
info["num_people"] = 1
else:
num_word_to_digit = {
'二': 2, '三': 3, '四': 4, '五': 5,
'六': 6, '七': 7, '八': 8, '九': 9, '十': 10
}
info["num_people"] = num_word_to_digit.get(match.group(1))  # None when the match carries no number
break
actions = ["睡眠","", "","", "摔倒", "跳舞", "","蹲下","转身", "", "", "倒下", "躺下", "转身", "","跳跃", "", "", "", "说话","睡觉","起床","看书","写字","学习","玩手机","吃饭","搬东西","看风景","走路","散步","","阅读","写作","使用手机","使用电脑","学习","工作","使用笔记本电脑","吃饭","喝水","整理"]
for action in actions:
if action in answer:
info["actions"].append(action)
emotions = ["高兴", "愤怒", "悲伤", "惊讶", "恐惧", "厌恶", "平静","放松","中性","专注","思考"]
objects = ["水瓶", "办公用品", "文件", "电脑","风扇","鼠标","键盘","纸巾","书","笔","袋子","盒子","水杯","杯子","马克杯","玻璃杯","文件夹","书包","书架","文件柜","手机"]
furniture = ["椅子", "桌子", "咖啡桌", "文件柜", "床", "沙发","柜子","架子","摄像头","靠垫","办公椅","电视","白板","显示器","置物架","文件架"]
features = ["戴眼镜","不戴眼镜","长发","短发","长头发","短头发","戴帽子","不戴帽子","戴口罩","不戴口罩","男性","女性","胖","瘦","高","矮","男","女","成年人"]
for obj in objects:
if obj in answer:
info["objects"].append(obj)
for item in furniture:
if item in answer:
info["furniture"].append(item)
for feature in features:
if feature in answer:
info["features"].append(feature)
for emotion in emotions:
if emotion in answer:
info["emotions"].append(emotion)
return info
# 初始化 MediaAnalysisSystem
media_analysis_system = MediaAnalysisSystem()
class MediaAnalysisError(Exception):
"""自定义媒体分析异常类"""
pass
def process_video_folder(system, folder_path, output_path=None):
"""处理文件夹中的所有视频文件并保存结果"""
# 支持的视频格式
valid_extensions = {'.mp4', '.avi', '.mov', '.mkv'}
results = {}
# 确保文件夹存在
if not os.path.exists(folder_path):
raise MediaAnalysisError(f"错误:文件夹 '{folder_path}' 不存在")
# 设置输出路径
if output_path is None:
output_path = os.getcwd() # 如果未指定,使用当前目录
elif not os.path.exists(output_path):
os.makedirs(output_path) # 如果输出目录不存在,创建它
# 获取所有视频文件
video_files = [
f for f in os.listdir(folder_path)
if os.path.splitext(f)[1].lower() in valid_extensions
]
if not video_files:
raise MediaAnalysisError(f"错误:在文件夹 '{folder_path}' 中未找到支持的视频文件")
print(f"\n找到 {len(video_files)} 个视频文件,开始处理...\n")
# 生成输出文件名
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
folder_name = os.path.basename(os.path.normpath(folder_path))
output_file = os.path.join(output_path, f"analysis_results_{folder_name}_{timestamp}.json")
# 处理每个视频文件并实时保存结果
for i, video_file in enumerate(video_files, 1):
video_path = os.path.join(folder_path, video_file)
print(f"正在处理 ({i}/{len(video_files)}): {video_file}")
try:
with open(video_path, "rb") as f:
video_data = f.read()
result = system.process_video(video_data, video_file)
# 修改结果存储格式
results[video_file] = {
"video_analysis": {
"deepseek-vl2": result
}
}
# 实时保存当前结果到JSON文件
with open(output_file, 'w', encoding='utf-8') as f:
json.dump(results, f, ensure_ascii=False, indent=2)
print(f"✓ 成功处理并保存: {video_file}")
except Exception as e:
print(f"✗ 处理失败 {video_file}: {str(e)}")
results[video_file] = {
"video_analysis": {
"deepseek-vl2": {"error": str(e)}
}
}
# 即使处理失败也保存当前结果
with open(output_file, 'w', encoding='utf-8') as f:
json.dump(results, f, ensure_ascii=False, indent=2)
print(f"\n所有分析结果已保存到: {output_file}")
return results
class MediaAnalysisError(Exception):
"""自定义媒体分析异常类"""
pass
def main():
try:
system = MediaAnalysisSystem()
# 添加文件夹路径输入处理
folder_path = input("请输入视频文件夹路径: ").strip()
output_path = input("请输入结果保存路径 (直接回车使用当前目录): ").strip()
# 如果用户没有输入输出路径,则使用None(将使用当前目录)
output_path = output_path if output_path else None
# 处理文件夹中的视频
results = process_video_folder(system, folder_path, output_path)
# 显示处理统计
success_count = sum(1 for r in results.values() if "error" not in r)
print(f"\n处理完成!成功: {success_count}/{len(results)}")
except MediaAnalysisError as e:
print(f"\n错误: {str(e)}")
except Exception as e:
print(f"\n未预期的错误: {str(e)}")
if __name__ == "__main__":
main()
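`extract_time_from_filename` above assumes recording names that start with `YYYYMMDD_HHMMSS` and treats each clip as 10 seconds long. The sketch below shows the same parsing with a regular expression, so a name without an underscore returns `None` instead of raising `IndexError`. The function name `parse_clip_window`, the `None` fallback (the script falls back to the current time), and the example filenames are assumptions for illustration.

```python
import os
import re
from datetime import datetime, timedelta

CLIP_SECONDS = 10  # clip length assumed by the script above
_STAMP = re.compile(r"^(\d{8}_\d{6})")


def parse_clip_window(object_name):
    """Return (start, end) parsed from a 'YYYYMMDD_HHMMSS...' filename, or None."""
    match = _STAMP.match(os.path.basename(object_name))
    if match is None:
        return None
    try:
        start = datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
    except ValueError:  # digits present but not a valid date/time
        return None
    return start, start + timedelta(seconds=CLIP_SECONDS)


if __name__ == "__main__":
    print(parse_clip_window("recordings/20250112_093000.mp4"))
    print(parse_clip_window("clip.mp4"))
```

Returning `None` leaves the choice of fallback to the caller, which avoids silently stamping a clip with the processing time.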
-328
View File
@@ -1,328 +0,0 @@
import io
import os
import json
import base64
import requests
import re
from PIL import Image
from datetime import datetime, timedelta
from decord import VideoReader, cpu
SILICONFLOW_URL = "https://api.siliconflow.cn/v1/chat/completions"
API_KEY = os.environ.get("SILICONFLOW_API_KEY", "")  # assumption: the key is supplied through this environment variable instead of being hard-coded
class MediaAnalysisSystem:
def __init__(self):
self.MAX_NUM_FRAMES = 5  # upper bound on frames sent per request
self.MIN_NUM_FRAMES = 3  # lower bound used when choosing the frame count
def encode_video(self, video_data):
def uniform_sample(l, n):
gap = len(l) / n
return [l[int(i * gap + gap / 2)] for i in range(n)]
video_file = io.BytesIO(video_data)
vr = VideoReader(video_file, ctx=cpu(0))
sample_fps = max(1, round(vr.get_avg_fps()))  # frame step for about one frame per second; never 0
frame_idx = list(range(0, len(vr), sample_fps))
# Cap the frame count at MAX_NUM_FRAMES; clips with fewer sampled frames keep them all
num_frames = min(max(self.MIN_NUM_FRAMES, len(frame_idx)), self.MAX_NUM_FRAMES)
if len(frame_idx) > num_frames:
frame_idx = uniform_sample(frame_idx, num_frames)
frames = vr.get_batch(frame_idx).asnumpy()
frames = [Image.fromarray(v.astype('uint8')) for v in frames]
# 压缩图片尺寸和质量
compressed_frames = []
for frame in frames:
# 保持宽高比的情况下调整大小
frame.thumbnail((600, 600), Image.Resampling.LANCZOS)
buffered = io.BytesIO()
frame.save(buffered, format="JPEG", quality=85)
compressed_frames.append(Image.open(buffered))
print(f'处理后的帧数: {len(compressed_frames)}')
return compressed_frames
def process_video(self, video_data, object_name):
if not video_data:
raise ValueError(f"Empty video data for {object_name}")
print(f"Processing video: {object_name}, data size: {len(video_data)} bytes")
frames = self.encode_video(video_data)
# 构建单个请求的消息内容
messages = [{
"role": "user",
"content": [
{
"type": "text",
"text": """Please analyze these images as a time series in detail, including the following aspects:
1. Exact count of people in the scene
2. Individual behavior analysis for each person
3. Facial expression recognition and emotional state assessment
4. Overall scene and environment detailed description
5. Interactions between people
6. Detailed environmental conditions description
7. Items and furniture appearing in the environment
8. Any suspicious or abnormal activities
9. Personnel specific characteristics (estimated age range, gender, clothing)
10. Movement patterns and directions of people
11. Carried items or objects
12. Group dynamics and gathering situations
13. Video timestamp analysis (if available)"""
}
]
}]
# 一次性添加所有图片到消息内容
for frame in frames:
base64_image = self.image_to_base64(frame)
messages[0]["content"].append({
"type": "image_url",
"image_url": {
"url": f"data:image/jpeg;base64,{base64_image}",
"detail": "auto"
}
})
try:
response = self._make_api_request(messages)
answer = response["choices"][0]["message"]["content"]
extracted_info = self.extract_info(answer)
return {
"original_answer": answer,
"extracted_info": extracted_info,
"num_frames": len(frames),
}
except Exception as e:
print(f"API请求失败: {str(e)}")
raise
def _make_api_request(self, messages):
payload = {
"model": "deepseek-ai/deepseek-vl2",
"messages": messages,
"stream": False,
"max_tokens": 1024,
"temperature": 0.7,
"top_p": 0.7,
"top_k": 50,
"frequency_penalty": 0.5,
"n": 1,
"response_format": {"type": "text"}
}
headers = {
"Authorization": f"Bearer {API_KEY}",
"Content-Type": "application/json"
}
response = requests.post(
SILICONFLOW_URL,
json=payload,
headers=headers,
timeout=60 # 增加超时时间到60秒
)
if response.status_code != 200:
raise Exception(f"Siliconflow API 错误: {response.status_code}")
return response.json()
@staticmethod
def image_to_base64(image):
buffered = io.BytesIO()
image.save(buffered, format="JPEG")  # match the image/jpeg data URL built in process_video
return base64.b64encode(buffered.getvalue()).decode()
@staticmethod
def extract_time_from_filename(object_name):
filename = os.path.basename(object_name)
time_str = filename.split('_')[0] + '_' + filename.split('_')[1].split('.')[0]
try:
start_time = datetime.strptime(time_str, "%Y%m%d_%H%M%S")
end_time = start_time + timedelta(seconds=10)
return start_time, end_time
except ValueError:
print(f"无法从文件名 '{filename}' 解析时间。使用默认时间。")
return datetime.now(), datetime.now() + timedelta(seconds=10)
@staticmethod
def extract_info(answer):
info = {
"environment": None,
"num_people": None,
"actions": [],
"objects": [],
"furniture": [],
"emotions": [],
"features": []
}
environments = ["office", "indoor", "outdoor", "meeting room"]
for env in environments:
if env.lower() in answer.lower():
info["environment"] = env
break
people_patterns = [
r'(\d+)\s*(person|people|individual|staff|user|child|adult|female|male)',
r'(one|two|three|four|five|six|seven|eight|nine|ten)\s*(person|people|individual|staff|user|child|adult|female|male)',
r'(a|few)\s*(person|people|individual|staff|user|child|adult|female|male)',
r'(several)\s*(person|people|individual|staff|user|child|adult|female|male)?',  # capture the word so group(1) is never None
r'(male|female)',
r'(adult|minor|youth|elderly)\s*(person|group)',
r'(employee|worker|student|customer|audience|visitor|passenger)',
r'(crowd|public|mass|people)',
r'(men|women|old|young|adult|child)'
]
for pattern in people_patterns:
match = re.search(pattern, answer.lower())  # the patterns are lowercase
if match:
if match.group(1).isdigit():
info["num_people"] = int(match.group(1))
elif match.group(1) in ['a', 'one','an']:
info["num_people"] = 1
else:
num_word_to_digit = {
'two': 2, 'three': 3, 'four': 4, 'five': 5,
'six': 6, 'seven': 7, 'eight': 8, 'nine': 9, 'ten': 10
}
info["num_people"] = num_word_to_digit.get(match.group(1))  # None when the match carries no number
break
actions = ["sleeping", "sitting", "eating", "standing", "falling", "dancing", "squatting", "crouching", "turning", "falling down", "lying down", "turning around", "jumping", "lying", "sleeping", "talking", "waking up", "reading", "writing", "studying", "using phone", "dining", "moving things", "sightseeing", "walking", "strolling", "reading", "writing", "using phone", "using computer", "studying", "working", "using laptop", "eating", "drinking", "organizing"]
for action in actions:
if action in answer:
info["actions"].append(action)
emotions = ["happy", "angry", "sad", "surprised", "scared", "disgusted", "calm", "relaxed", "neutral", "focused", "thinking"]
objects = ["water bottle", "office supplies", "documents", "computer", "fan", "mouse", "keyboard", "tissue", "book", "pen", "bag", "box", "water cup", "cup", "mug", "glass", "folder", "backpack", "bookshelf", "filing cabinet", "phone"]
furniture = ["chair", "desk", "coffee table", "filing cabinet", "bed", "sofa", "cabinet", "shelf", "camera", "cushion", "office chair", "TV", "whiteboard", "monitor", "storage rack", "file rack"]
features = ["wearing glasses", "not wearing glasses", "long hair", "short hair", "long hair", "short hair", "wearing hat", "not wearing hat", "wearing mask", "not wearing mask", "male", "female", "fat", "thin", "tall", "short", "man", "woman", "adult"]
for obj in objects:
if obj in answer:
info["objects"].append(obj)
for item in furniture:
if item in answer:
info["furniture"].append(item)
for feature in features:
if feature in answer:
info["features"].append(feature)
for emotion in emotions:
if emotion in answer:
info["emotions"].append(emotion)
return info
# 初始化 MediaAnalysisSystem
media_analysis_system = MediaAnalysisSystem()
class MediaAnalysisError(Exception):
"""自定义媒体分析异常类"""
pass
def process_video_folder(system, folder_path, output_path=None):
"""Process every video file in a folder and save the results."""
# Supported video formats
valid_extensions = {'.mp4', '.avi', '.mov', '.mkv'}
results = {}
# Make sure the folder exists
if not os.path.exists(folder_path):
raise MediaAnalysisError(f"Error: folder '{folder_path}' does not exist")
# Set the output path
if output_path is None:
output_path = os.getcwd() # Default to the current directory if not specified
elif not os.path.exists(output_path):
os.makedirs(output_path) # Create the output directory if it does not exist
# Collect all video files
video_files = [
f for f in os.listdir(folder_path)
if os.path.splitext(f)[1].lower() in valid_extensions
]
if not video_files:
raise MediaAnalysisError(f"Error: no supported video files found in folder '{folder_path}'")
print(f"\nFound {len(video_files)} video files, starting processing...\n")
# Build the output file name
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
folder_name = os.path.basename(os.path.normpath(folder_path))
output_file = os.path.join(output_path, f"analysis_results_{folder_name}_{timestamp}.json")
# Process each video file and save the results as we go
for i, video_file in enumerate(video_files, 1):
video_path = os.path.join(folder_path, video_file)
print(f"Processing ({i}/{len(video_files)}): {video_file}")
try:
with open(video_path, "rb") as f:
video_data = f.read()
result = system.process_video(video_data, video_file)
# Store the result under the model name
results[video_file] = {
"video_analysis": {
"deepseek-vl2": result
}
}
# Write the current results to the JSON file right away
with open(output_file, 'w', encoding='utf-8') as f:
json.dump(results, f, ensure_ascii=False, indent=2)
print(f"✓ Processed and saved: {video_file}")
except Exception as e:
print(f"✗ Failed to process {video_file}: {str(e)}")
results[video_file] = {
"video_analysis": {
"deepseek-vl2": {"error": str(e)}
}
}
# Save the current results even when processing fails
with open(output_file, 'w', encoding='utf-8') as f:
json.dump(results, f, ensure_ascii=False, indent=2)
print(f"\nAll analysis results saved to: {output_file}")
return results
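def main():
try:
system = MediaAnalysisSystem()
# Read the input folder path
folder_path = input("Enter the video folder path: ").strip()
output_path = input("Enter the output path (press Enter to use the current directory): ").strip()
# Use None when no output path was entered (falls back to the current directory)
output_path = output_path if output_path else None
# Process the videos in the folder
results = process_video_folder(system, folder_path, output_path)
# Report processing statistics; the error marker is nested under the model name
success_count = sum(1 for r in results.values() if "error" not in r["video_analysis"]["deepseek-vl2"])
print(f"\nProcessing complete! Succeeded: {success_count}/{len(results)}")
except MediaAnalysisError as e:
print(f"\nError: {str(e)}")
except Exception as e:
print(f"\nUnexpected error: {str(e)}")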
if __name__ == "__main__":
main()
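The keyword loops in `extract_info` above use plain substring checks (`if action in answer`), so a short keyword also fires inside a longer word, and duplicated list entries are appended more than once. A minimal sketch of a stricter matcher, assuming English model output; the helper name `match_keywords` is hypothetical and not part of the original file:

```python
import re

def match_keywords(answer, vocabulary):
    """Return vocabulary entries that occur in `answer` as whole words or phrases.

    Hypothetical helper: matching is case-insensitive, each keyword is
    reported at most once, and vocabulary order is preserved.
    """
    found = []
    for keyword in vocabulary:
        if keyword in found:
            continue  # skip vocabulary entries that were already reported
        # \b anchors stop "eating" from matching inside "heating"
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, answer, flags=re.IGNORECASE):
            found.append(keyword)
    return found

actions = ["sitting", "eating", "using phone", "eating"]
print(match_keywords("One person is Sitting near the heating unit, using phone.", actions))
```

Word boundaries only make sense for space-delimited text, so this sketch would not carry over unchanged to the Chinese keyword lists used elsewhere in this commit.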
-297
View File
@@ -1,297 +0,0 @@
import os
import json
import time
from datetime import datetime
import redis
from deepface import DeepFace
import numpy as np
import gc
import re
class FaceAnalysisSystem:
def __init__(self):
# Redis configuration
self.redis_clients = {
'A01': redis.Redis(
host="222.186.10.253",
port=6379,
password="Obscura@2024",
db=210
),
'B02': redis.Redis(
host="222.186.10.253",
port=6379,
password="Obscura@2024",
db=211
)
}
# Identity database
self.identity_db = redis.Redis(
host="222.186.10.253",
port=6379,
password="Obscura@2024",
db=212
)
def get_face_embedding(self, img_path):
"""Compute the face embedding for an image."""
try:
embedding_obj = DeepFace.represent(
img_path=img_path,
detector_backend="retinaface",
align=True,
model_name="Facenet512"
)
return embedding_obj[0]["embedding"] if embedding_obj else None
except Exception as e:
print(f"Failed to get face embedding: {str(e)}")
return None
def find_identity(self, embedding):
"""Find the matching identity in the identity database."""
try:
# Fetch the keys of all stored identities
all_identities = self.identity_db.keys("*")
best_match = None
best_similarity = -1
for identity_key in all_identities:
# Load all embeddings stored for this identity
stored_data = json.loads(self.identity_db.get(identity_key))
# Stored data is expected to be a list (multiple embeddings)
if isinstance(stored_data, list):
# Compare against each embedding of this identity
for face_data in stored_data:
stored_vector = np.array(face_data["embedding"])
# Compute the cosine similarity
similarity = np.dot(embedding, stored_vector) / (
np.linalg.norm(embedding) * np.linalg.norm(stored_vector)
)
if similarity > best_similarity:
best_similarity = similarity
best_match = identity_key.decode()
# Return the identity if the similarity exceeds the threshold, otherwise "unknown"
if best_similarity > 0.72: # Threshold is tunable
return best_match, best_similarity
return "unknown", best_similarity
except Exception as e:
print(f"Error while looking up identity: {str(e)}")
return "unknown", -1
class ImageMonitor:
def __init__(self, images_path):
self.images_path = images_path
self.system = FaceAnalysisSystem()
self.processed_images = set()
self.error_images = []
self.error_image_cache = set()
def _get_redis_key(self, image_path):
"""Build the Redis key for an image."""
try:
dir_name = os.path.basename(os.path.dirname(image_path))
file_name = os.path.basename(image_path)
# Extract the camera ID, date and hour from the image file name
# Expected file name format: A01_20250105_134104.jpg
# .jpeg is accepted as well, because monitor_directories also picks up .jpeg files
match = re.search(r'(\w+)_(\d{8})_(\d{2})\d{4}\.(jpg|jpeg|png)', file_name, re.IGNORECASE)
if match:
camera_id = match.group(1)
date = match.group(2)
hour = match.group(3)
# Build the key, e.g. face_A01_20250105_1300
redis_key = f"face_{camera_id}_{date}_{hour}00"
return redis_key
print(f"File name does not match the expected format: {file_name}")
return None
except Exception as e:
print(f"Failed to build Redis key: {str(e)}")
return None
def _is_processed(self, image_path):
"""Check whether the image has already been processed."""
return image_path in self.processed_images
def _is_error_cached(self, image_path):
"""Check whether the image is in the error cache."""
return image_path in self.error_image_cache
def _add_to_error_cache(self, image_path):
"""Add the image to the error cache."""
self.error_image_cache.add(image_path)
def _log_error(self, image_path, error_type, error_message):
"""Record error details."""
if self._is_error_cached(image_path):
return
current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
error_info = {
"timestamp": current_time,
"image_path": image_path,
"error_type": error_type,
"error_message": error_message,
"file_size": os.path.getsize(image_path) if os.path.exists(image_path) else 0
}
self.error_images.append(error_info)
self._add_to_error_cache(image_path)
def _save_error_log(self):
"""Save the error log."""
if not self.error_images:
return
try:
current_time = datetime.now().strftime("%Y%m%d_%H%M%S")
log_filename = f"image_errors_{current_time}.json"
with open(log_filename, 'w', encoding='utf-8') as f:
json.dump(self.error_images, f, ensure_ascii=False, indent=2)
print(f"\nRecords of failed images saved to: {log_filename}")
self.error_images = []
except Exception as e:
print(f"Failed to save error log: {str(e)}")
def process_new_image(self, image_path):
"""Process a new image."""
try:
if self._is_error_cached(image_path):
return False
file_name = os.path.basename(image_path)
if self._is_processed(image_path):
print(f"Image already processed, skipping: {file_name}")
return True
redis_key = self._get_redis_key(image_path)
if not redis_key:
self._log_error(image_path, "Redis Key Error", "Could not build Redis key")
return False
if not os.path.exists(image_path):
self._log_error(image_path, "File Not Found", "Image file does not exist")
return False
# Check the file size
file_size = os.path.getsize(image_path)
if file_size == 0:
self._log_error(image_path, "Empty File", "Image file size is 0")
return False
elif file_size < 10 * 1024: # Smaller than 10 KB
self._log_error(image_path, "Small File", f"Unexpected image file size ({file_size/1024:.2f}KB)")
return False
# Get the face embedding
embedding = self.system.get_face_embedding(image_path)
if embedding is None:
self._log_error(image_path, "Face Detection Error", "Could not detect a face or extract features")
return False
# Look up the identity
identity, similarity = self.system.find_identity(embedding)
# Extract the timestamp from the file name
timestamp_match = re.search(r'(\d{4})(\d{2})(\d{2})_(\d{2})(\d{2})(\d{2})', file_name)
if timestamp_match:
year, month, day, hour, minute, second = timestamp_match.groups()
timestamp = f"{year}-{month}-{day} {hour}:{minute}:{second}"
else:
timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
# Build the result record
result = {
"face_analysis": {
"identity": identity,
"similarity": float(similarity)
},
"timestamp": timestamp
}
# Save to Redis
dir_name = os.path.basename(os.path.dirname(image_path))
if dir_name in self.system.redis_clients:
redis_client = self.system.redis_clients[dir_name]
existing_data = redis_client.get(redis_key)
if existing_data:
hour_results = json.loads(existing_data)
hour_results[file_name] = result
else:
hour_results = {file_name: result}
json_str = json.dumps(hour_results, ensure_ascii=False)
redis_client.set(redis_key, json_str)
print(f"Saved to Redis, key: {redis_key}")
self.processed_images.add(image_path)
return True
except Exception as e:
self._log_error(image_path, "Processing Error", str(e))
print(f"Error while processing image {image_path}: {str(e)}")
return False
finally:
gc.collect()
def monitor_directories(self):
"""Watch the image directories for new files."""
try:
current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
print(f"Started monitoring directory: {self.images_path} [{current_time}]")
while True:
try:
for camera_dir in os.listdir(self.images_path):
camera_path = os.path.join(self.images_path, camera_dir)
if not os.path.isdir(camera_path):
continue
for image_file in os.listdir(camera_path):
if not image_file.lower().endswith(('.jpg', '.jpeg', '.png')):
continue
image_path = os.path.join(camera_path, image_file)
if not self._is_processed(image_path) and not self._is_error_cached(image_path):
print(f"Processing image: {image_path}")
if not self.process_new_image(image_path):
self._add_to_error_cache(image_path)
print(f"Image processing failed, added to the error cache: {image_path}")
continue
current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
print(f"[{current_time}] Waiting for new images...")
time.sleep(60) # Check once per minute
except Exception as e:
print(f"Error during monitoring: {str(e)}")
time.sleep(10)
except KeyboardInterrupt:
print("\nTermination signal received, saving the error log...")
self._save_error_log()
print("Program terminated safely.")
except Exception as e:
print(f"\nProgram terminated abnormally: {str(e)}")
self._save_error_log()
raise
def main():
try:
images_path = "/home/zydi/VLM/images" # Path to the images directory
monitor = ImageMonitor(images_path)
monitor.monitor_directories()
except Exception as e:
print(f"\nUnexpected error: {str(e)}")
if __name__ == "__main__":
main()
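`_get_redis_key` and `process_new_image` above parse the same file name with two separate regular expressions. A minimal sketch of doing both in one pass, assuming the `<camera>_<YYYYMMDD>_<HHMMSS>.<ext>` naming shown in the code comment; `parse_frame_name` and `FRAME_NAME` are hypothetical names, not part of the original file:

```python
import re
from datetime import datetime

# Assumed naming scheme: A01_20250105_134104.jpg
FRAME_NAME = re.compile(
    r"^(?P<camera>[A-Za-z0-9]+)_(?P<date>\d{8})_(?P<time>\d{6})\.(?:jpe?g|png)$",
    re.IGNORECASE,
)

def parse_frame_name(file_name):
    """Return (redis_key, timestamp) for a frame file name, or None if it does not match."""
    match = FRAME_NAME.match(file_name)
    if match is None:
        return None
    try:
        captured_at = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    except ValueError:
        return None  # digits are present but do not form a valid date and time
    # Results are bucketed per camera and per hour, as in the original key format
    redis_key = f"face_{match['camera']}_{match['date']}_{captured_at:%H}00"
    return redis_key, captured_at.strftime("%Y-%m-%d %H:%M:%S")

print(parse_frame_name("A01_20250105_134104.jpg"))
```

Unlike the original, which falls back to the current time when no timestamp is found, this sketch returns `None` for a name whose digits are not a real date, so the caller has to decide how to handle that case.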
-291
View File
@@ -1,291 +0,0 @@
import io
import os
import json
import base64
import requests
import re
from PIL import Image
from datetime import datetime, timedelta
from decord import VideoReader, cpu
OLLAMA_URL = "http://127.0.0.1:11434/api/generate"
class MediaAnalysisSystem:
def __init__(self):
self.MAX_NUM_FRAMES = 16
def encode_video(self, video_data):
def uniform_sample(l, n):
gap = len(l) / n
return [l[int(i * gap + gap / 2)] for i in range(n)]
video_file = io.BytesIO(video_data)
vr = VideoReader(video_file, ctx=cpu(0))
# Sample 3 to 8 frames, spaced uniformly across the clip
num_frames = min(max(3, len(vr) // 30), 8) # At least 3 frames, at most 8
frame_idx = uniform_sample(range(len(vr)), num_frames)
frames = vr.get_batch(frame_idx).asnumpy()
frames = [Image.fromarray(v.astype('uint8')) for v in frames]
print('Sampled frames:', len(frames))
return frames
def process_video(self, video_data, object_name):
if not video_data:
raise ValueError(f"Empty video data for {object_name}")
print(f"Processing video: {object_name}, data size: {len(video_data)} bytes")
frames = self.encode_video(video_data)
all_responses = []
# Analyze frame by frame
for i, frame in enumerate(frames):
print(f"Analyzing frame {i+1}/{len(frames)}...")
question = """Please provide a detailed analysis of this surveillance image, including the following aspects:
1. Precise count of people in the scene
2. Individual behavior analysis of each person
3. Facial expression recognition and emotional state assessment
4. Detailed description of overall scene and environment
5. Interactions between people
6. Detailed description of environmental conditions
7. Items and furniture present in the environment
8. Any suspicious or unusual activities
9. Specific characteristics of people (estimated age range, gender, clothing)
10. Movement patterns and directions of people
11. Carried items or objects
12. Group dynamics and gathering situations
13. Analysis of video timestamp (if present)
Please describe in a clear, organized format and highlight important findings."""
payload = {
"model": "llama3.2-vision", # Use the llama3.2-vision model
"prompt": question,
"images": [self.image_to_base64(frame)] # Send only one image per request
}
try:
response = requests.post(OLLAMA_URL, json=payload, stream=True)
if response.status_code == 200:
frame_answer = self.process_stream_response(response)
all_responses.append(frame_answer)
else:
raise Exception(f"Ollama API error: {response.status_code}")
except requests.RequestException as e:
print(f"Error while calling the Ollama API: {str(e)}")
raise
# Merge the analysis results of all frames
combined_answer = "\n\n=== 视频总体分析 ===\n".join(all_responses)
extracted_info = self.extract_info(combined_answer)
return {
"original_answer": combined_answer,
"extracted_info": extracted_info,
"num_frames": len(frames),
}
def process_stream_response(self, response):
full_response = []
for line in response.iter_lines():
if line:
try:
json_response = json.loads(line)
if 'response' in json_response:
full_response.append(json_response['response'])
if json_response.get('done', False):
break
except json.JSONDecodeError:
print(f"Could not parse JSON line: {line}")
return ''.join(full_response)
@staticmethod
def image_to_base64(image):
buffered = io.BytesIO()
image.save(buffered, format="PNG")
return base64.b64encode(buffered.getvalue()).decode()
@staticmethod
def extract_time_from_filename(object_name):
filename = os.path.basename(object_name)
time_str = filename.split('_')[0] + '_' + filename.split('_')[1].split('.')[0]
try:
start_time = datetime.strptime(time_str, "%Y%m%d_%H%M%S")
end_time = start_time + timedelta(seconds=10)
return start_time, end_time
except ValueError:
print(f"Could not parse a time from file name '{filename}'. Using the default time.")
return datetime.now(), datetime.now() + timedelta(seconds=10)
@staticmethod
def extract_info(answer):
info = {
"environment": None,
"num_people": None,
"actions": [],
"objects": [],
"furniture": [],
"emotions": [],
"features": []
}
environments = ["office", "indoor", "outdoor", "meeting room", "room", "classroom", "living room", "bedroom", "kitchen", "bathroom", "hallway", "corridor"]
for env in environments:
if env in answer.lower():
info["environment"] = env
break
people_patterns = [
r'(\d+)\s*(person|people|individual|employee|user|child|adult|female|male)',
r'(one|two|three|four|five|six|seven|eight|nine|ten)\s*(person|people|individual|employee|user|child|adult|female|male)',
r'(a|few)\s*(person|people|employee|user|child|adult|female|male)',
r'several\s*(person|people|employee|user|child|adult|female|male)?',
r'(male|female)',
r'(adult|minor|teenager|elderly)\s*(person|group)',
r'(employee|worker|student|customer|audience|visitor|passenger)',
r'(crowd|public|people|mass)',
r'(men|women|adults|children)'
]
for pattern in people_patterns:
match = re.search(pattern, answer)
if match:
if match.group(1) is not None and match.group(1).isdigit(): # group(1) is None when "several" matches alone
info["num_people"] = int(match.group(1))
elif match.group(1) in ['a', 'one',"an"]:
info["num_people"] = 1
else:
num_word_to_digit = {
'two': 2, 'three': 3, 'four': 4, 'five': 5,
'six': 6, 'seven': 7, 'eight': 8, 'nine': 9, 'ten': 10
}
info["num_people"] = num_word_to_digit.get(match.group(1), 0)
break
actions = ["sleeping", "sitting", "drinking", "eating", "standing", "falling", "dancing", "squatting", "squat", "turning", "fall", "falling down", "lying down", "turning around", "turn", "jumping", "jump", "lying", "sleep", "talking", "sleeping", "getting up", "reading", "writing", "studying", "phone", "eating", "moving things", "sightseeing", "walking", "strolling", "walk", "reading", "writing", "using phone", "computer", "studying", "working", "laptop", "eating", "drinking", "organizing"]
for action in actions:
if action in answer:
info["actions"].append(action)
emotions = ["happy", "angry", "sad", "surprised", "scared", "disgusted", "calm", "relaxed", "neutral", "focused", "thinking"]
objects = ["water bottle", "office supplies", "documents", "computer", "fan", "mouse", "keyboard", "tissue", "book", "pen", "bag", "box", "water cup", "cup", "mug", "glass", "folder", "backpack", "bookshelf", "file cabinet", "phone"]
furniture = ["chair", "table", "coffee table", "file cabinet", "bed", "sofa", "cabinet", "shelf", "camera", "cushion", "office chair", "TV", "whiteboard", "monitor", "storage rack", "file rack"]
features = ["wearing glasses", "not wearing glasses", "long hair", "short hair", "wearing hat", "not wearing hat", "wearing mask", "not wearing mask", "male", "female", "fat", "thin", "tall", "short", "man", "woman", "adult"]
for obj in objects:
if obj in answer:
info["objects"].append(obj)
for item in furniture:
if item in answer:
info["furniture"].append(item)
for feature in features:
if feature in answer:
info["features"].append(feature)
for emotion in emotions:
if emotion in answer:
info["emotions"].append(emotion)
return info
# Initialize MediaAnalysisSystem
media_analysis_system = MediaAnalysisSystem()
class MediaAnalysisError(Exception):
"""Custom exception for media analysis errors."""
pass
def process_video_folder(system, folder_path, output_path=None):
"""Process every video file in a folder and save the results."""
# Supported video formats
valid_extensions = {'.mp4', '.avi', '.mov', '.mkv'}
results = {}
# Make sure the folder exists
if not os.path.exists(folder_path):
raise MediaAnalysisError(f"Error: folder '{folder_path}' does not exist")
# Set the output path
if output_path is None:
output_path = os.getcwd() # Default to the current directory if not specified
elif not os.path.exists(output_path):
os.makedirs(output_path) # Create the output directory if it does not exist
# Collect all video files
video_files = [
f for f in os.listdir(folder_path)
if os.path.splitext(f)[1].lower() in valid_extensions
]
if not video_files:
raise MediaAnalysisError(f"Error: no supported video files found in folder '{folder_path}'")
print(f"\nFound {len(video_files)} video files, starting processing...\n")
# Build the output file name
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
folder_name = os.path.basename(os.path.normpath(folder_path))
output_file = os.path.join(output_path, f"analysis_results_{folder_name}_{timestamp}.json")
# Process each video file and save the results as we go
for i, video_file in enumerate(video_files, 1):
video_path = os.path.join(folder_path, video_file)
print(f"Processing ({i}/{len(video_files)}): {video_file}")
try:
with open(video_path, "rb") as f:
video_data = f.read()
result = system.process_video(video_data, video_file)
# Store the result under the model name
results[video_file] = {
"video_analysis": {
"llama3.2-vision": result
}
}
# Write the current results to the JSON file right away
with open(output_file, 'w', encoding='utf-8') as f:
json.dump(results, f, ensure_ascii=False, indent=2)
print(f"✓ Processed and saved: {video_file}")
except Exception as e:
print(f"✗ Failed to process {video_file}: {str(e)}")
results[video_file] = {
"video_analysis": {
"llama3.2-vision": {"error": str(e)}
}
}
# Save the current results even when processing fails
with open(output_file, 'w', encoding='utf-8') as f:
json.dump(results, f, ensure_ascii=False, indent=2)
print(f"\nAll analysis results saved to: {output_file}")
return results
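def main():
try:
system = MediaAnalysisSystem()
# Read the input folder path
folder_path = input("Enter the video folder path: ").strip()
output_path = input("Enter the output path (press Enter to use the current directory): ").strip()
# Use None when no output path was entered (falls back to the current directory)
output_path = output_path if output_path else None
# Process the videos in the folder
results = process_video_folder(system, folder_path, output_path)
# Report processing statistics; the error marker is nested under the model name
success_count = sum(1 for r in results.values() if "error" not in r["video_analysis"]["llama3.2-vision"])
print(f"\nProcessing complete! Succeeded: {success_count}/{len(results)}")
except MediaAnalysisError as e:
print(f"\nError: {str(e)}")
except Exception as e:
print(f"\nUnexpected error: {str(e)}")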
if __name__ == "__main__":
main()
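`encode_video` above derives a frame budget from the clip length and then spreads that many indices evenly over the clip. The index arithmetic can be tried without a video file. This sketch mirrors the original `uniform_sample` helper; `pick_frame_indices` and its parameter names are illustrative, and the clamp to the clip length is an added assumption for very short clips:

```python
def uniform_sample(items, n):
    """Pick n items at the centres of n equal-width slices of `items`."""
    gap = len(items) / n
    return [items[int(i * gap + gap / 2)] for i in range(n)]

def pick_frame_indices(total_frames, min_frames=3, max_frames=8, frames_per_sample=30):
    """Choose between min_frames and max_frames evenly spaced frame indices."""
    if total_frames <= 0:
        return []
    budget = min(max(min_frames, total_frames // frames_per_sample), max_frames)
    budget = min(budget, total_frames)  # assumption: never request more frames than exist
    return uniform_sample(range(total_frames), budget)

print(pick_frame_indices(300))  # e.g. a 10 s clip at 30 fps
print(pick_frame_indices(60))   # short clip falls back to the 3-frame minimum
```

Sampling slice centres rather than slice starts keeps the first and last picks away from the clip edges, which often hold fade-in or truncated frames.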
-305
View File
@@ -1,305 +0,0 @@
import io
import os
import json
import base64
import requests
import re
from PIL import Image
from datetime import datetime, timedelta
from decord import VideoReader, cpu
OLLAMA_URL = "http://127.0.0.1:11434/api/generate"
class MediaAnalysisSystem:
def __init__(self):
self.MAX_NUM_FRAMES = 16
def encode_video(self, video_data):
def uniform_sample(l, n):
gap = len(l) / n
return [l[int(i * gap + gap / 2)] for i in range(n)]
video_file = io.BytesIO(video_data)
vr = VideoReader(video_file, ctx=cpu(0))
sample_fps = round(vr.get_avg_fps() / 1)
frame_idx = list(range(0, len(vr), sample_fps))
if len(frame_idx) > self.MAX_NUM_FRAMES:
frame_idx = uniform_sample(frame_idx, self.MAX_NUM_FRAMES)
frames = vr.get_batch(frame_idx).asnumpy()
frames = [Image.fromarray(v.astype('uint8')) for v in frames]
print('num frames:', len(frames))
return frames
def process_video(self, video_data, object_name):
if not video_data:
raise ValueError(f"Empty video data for {object_name}")
print(f"Processing video: {object_name}, data size: {len(video_data)} bytes")
frames = self.encode_video(video_data)
question = """你是一位视频描述专家,你擅长对视频进行详细的描述,请对这段监控视频进行详细分析,包括以下方面,并按照下面格式回答:
1. 环境场景
- 整体场景描述(室内/室外、光线条件等)
- 主要物品和家具列表
- 环境特征(如光线、整洁度等)
2. 人员统计
- 总人数:[数字]人
- 性别分布:[男性数量]/[女性数量]
(若无法确定准确人数,请注明"无法确定人数")
3. 人员特征分析
- 个人特征:性别、年龄段、着装、体态等
- 携带物品:详细描述随身物品及用途
- 表情/情绪状态
4. 行为分析
- 个人行为:移动方向、姿态、动作等
- 互动情况:人员之间的交互描述(若多人)
- 活动区域:人员活动的主要位置
5. 群体行为(若多人)
- 聚集形态
- 移动趋势
- 群体互动特点
6. 异常情况
- 可疑行为描述
- 异常活动标记
- 需要注意的安全隐患
请用清晰、有条理的格式描述,并突出重要发现。"""
encoded_frames = [self.image_to_base64(frame) for frame in frames]
payload = {
"model": "minicpm-v",
"prompt": question,
"images": encoded_frames
}
try:
response = requests.post(OLLAMA_URL, json=payload, stream=True)
print(f"Ollama API response status code: {response.status_code}")
print(f"Ollama API response headers: {response.headers}")
if response.status_code == 200:
answer = self.process_stream_response(response)
else:
raise Exception(f"Ollama API error: {response.status_code}")
except requests.RequestException as e:
print(f"Error while calling the Ollama API: {str(e)}")
raise
extracted_info = self.extract_info(answer)
return {
"original_answer": answer,
"extracted_info": extracted_info,
"num_frames": len(frames),
}
def process_stream_response(self, response):
full_response = []
for line in response.iter_lines():
if line:
try:
json_response = json.loads(line)
if 'response' in json_response:
full_response.append(json_response['response'])
if json_response.get('done', False):
break
except json.JSONDecodeError:
print(f"Could not parse JSON line: {line}")
return ''.join(full_response)
@staticmethod
def image_to_base64(image):
buffered = io.BytesIO()
image.save(buffered, format="PNG")
return base64.b64encode(buffered.getvalue()).decode()
@staticmethod
def extract_time_from_filename(object_name):
filename = os.path.basename(object_name)
time_str = filename.split('_')[0] + '_' + filename.split('_')[1].split('.')[0]
try:
start_time = datetime.strptime(time_str, "%Y%m%d_%H%M%S")
end_time = start_time + timedelta(seconds=10)
return start_time, end_time
except ValueError:
print(f"Could not parse a time from file name '{filename}'. Using the default time.")
return datetime.now(), datetime.now() + timedelta(seconds=10)
@staticmethod
def extract_info(answer):
info = {
"environment": None,
"num_people": None,
"actions": [],
"objects": [],
"furniture": [],
"emotions": [],
"features": []
}
environments = ["办公室", "室内", "室外", "会议室", "房间", "教室", "客厅", "卧室", "厨房", "浴室", "走廊", "过道"]
for env in environments:
if env in answer.lower():
info["environment"] = env
break
people_patterns = [
r'(\d+)\s*(人|个人|位|名|员工|用户|小朋友|成年人|女性|男性)',
r'(一|二|三|四|五|六|七|八|九|十)\s*(人|个人|位|名|员工|用户|小朋友|成年人|女性|男性)',
r'(一个|几个)\s*(人|个人|员工|用户|小朋友|成年人|女性|男性)',
r'\s*(名|位)\s*(人|员工|用户|小朋友|成年人|女性|男性)?',
r'(男|女)(性|生|士)',
r'(成年|未成年|青少年|老年)\s*(人|群体)',
r'(员工|职工|工人|学生|顾客|观众|游客|乘客)',
r'(群众|民众|大众|公众)',
r'(男女|老少|老幼|大人|小孩)'
]
for pattern in people_patterns:
match = re.search(pattern, answer)
if match:
if match.group(1).isdigit():
info["num_people"] = int(match.group(1))
elif match.group(1) in ['一个', '一']:
info["num_people"] = 1
else:
num_word_to_digit = {
'二': 2, '三': 3, '四': 4, '五': 5,
'六': 6, '七': 7, '八': 8, '九': 9, '十': 10
}
info["num_people"] = num_word_to_digit.get(match.group(1), 0)
break
actions = ["睡眠","", "","","", "摔倒", "跳舞", "","蹲下","转身", "", "", "倒下", "躺下", "转身", "","跳跃", "", "", "", "说话","睡觉","起床","看书","写字","学习","玩手机","吃饭","搬东西","看风景","走路","散步","","阅读","写作","使用手机","使用电脑","学习","工作","使用笔记本电脑","吃饭","喝水","整理"]
for action in actions:
if action in answer:
info["actions"].append(action)
emotions = ["高兴", "愤怒", "悲伤", "惊讶", "恐惧", "厌恶", "平静","放松","中性","专注","思考"]
objects = ["水瓶", "办公用品", "文件", "电脑","风扇","鼠标","键盘","纸巾","书","笔","袋子","盒子","水杯","杯子","马克杯","玻璃杯","文件夹","书包","书架","文件柜","手机"]
furniture = ["椅子", "桌子", "咖啡桌", "文件柜", "床", "沙发","柜子","架子","摄像头","靠垫","办公椅","电视","白板","显示器","置物架","文件架"]
features = ["戴眼镜","不戴眼镜","长发","短发","长头发","短头发","戴帽子","不戴帽子","戴口罩","不戴口罩","男性","女性","胖","瘦","高","矮","男","女","成年人"]
for obj in objects:
if obj in answer:
info["objects"].append(obj)
for item in furniture:
if item in answer:
info["furniture"].append(item)
for feature in features:
if feature in answer:
info["features"].append(feature)
for emotion in emotions:
if emotion in answer:
info["emotions"].append(emotion)
return info
# Initialize MediaAnalysisSystem
media_analysis_system = MediaAnalysisSystem()
class MediaAnalysisError(Exception):
"""Custom exception for media analysis errors."""
pass
def process_video_folder(system, folder_path, output_path=None):
"""Process every video file in a folder and save the results."""
# Supported video formats
valid_extensions = {'.mp4', '.avi', '.mov', '.mkv'}
results = {}
# Make sure the folder exists
if not os.path.exists(folder_path):
raise MediaAnalysisError(f"Error: folder '{folder_path}' does not exist")
# Set the output path
if output_path is None:
output_path = os.getcwd() # Default to the current directory if not specified
elif not os.path.exists(output_path):
os.makedirs(output_path) # Create the output directory if it does not exist
# Collect all video files
video_files = [
f for f in os.listdir(folder_path)
if os.path.splitext(f)[1].lower() in valid_extensions
]
if not video_files:
raise MediaAnalysisError(f"Error: no supported video files found in folder '{folder_path}'")
print(f"\nFound {len(video_files)} video files, starting processing...\n")
# Build the output file name
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
folder_name = os.path.basename(os.path.normpath(folder_path))
output_file = os.path.join(output_path, f"analysis_results_{folder_name}_{timestamp}.json")
# Process each video file and save the results as we go
for i, video_file in enumerate(video_files, 1):
video_path = os.path.join(folder_path, video_file)
print(f"Processing ({i}/{len(video_files)}): {video_file}")
try:
with open(video_path, "rb") as f:
video_data = f.read()
result = system.process_video(video_data, video_file)
# Store the result under the model name
results[video_file] = {
"video_analysis": {
"minicpm": result
}
}
# Write the current results to the JSON file right away
with open(output_file, 'w', encoding='utf-8') as f:
json.dump(results, f, ensure_ascii=False, indent=2)
print(f"✓ Processed and saved: {video_file}")
except Exception as e:
print(f"✗ Failed to process {video_file}: {str(e)}")
results[video_file] = {
"video_analysis": {
"minicpm": {"error": str(e)}
}
}
# Save the current results even when processing fails
with open(output_file, 'w', encoding='utf-8') as f:
json.dump(results, f, ensure_ascii=False, indent=2)
print(f"\nAll analysis results saved to: {output_file}")
return results
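def main():
try:
system = MediaAnalysisSystem()
# Read the input folder path
folder_path = input("Enter the video folder path: ").strip()
output_path = input("Enter the output path (press Enter to use the current directory): ").strip()
# Use None when no output path was entered (falls back to the current directory)
output_path = output_path if output_path else None
# Process the videos in the folder
results = process_video_folder(system, folder_path, output_path)
# Report processing statistics; the error marker is nested under the model name
success_count = sum(1 for r in results.values() if "error" not in r["video_analysis"]["minicpm"])
print(f"\nProcessing complete! Succeeded: {success_count}/{len(results)}")
except MediaAnalysisError as e:
print(f"\nError: {str(e)}")
except Exception as e:
print(f"\nUnexpected error: {str(e)}")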
if __name__ == "__main__":
main()
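`process_stream_response` above joins the `response` fragments of a newline-delimited JSON stream until a chunk reports `done`. A self-contained sketch of the same loop that runs on any iterable of lines, so it can be exercised without a running Ollama server; `collect_stream_text` and the sample chunks are made up for illustration:

```python
import json

def collect_stream_text(lines):
    """Concatenate `response` fragments from newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        if not line:
            continue  # blank keep-alive lines carry no data
        try:
            chunk = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the whole answer
        if not isinstance(chunk, dict):
            continue  # only JSON objects can carry a "response" field
        parts.append(chunk.get("response", ""))
        if chunk.get("done", False):
            break  # the final chunk marks the end of the answer
    return "".join(parts)

sample = [
    b'{"response": "Two people ", "done": false}',
    b"",
    b"not json",
    b'{"response": "are seated.", "done": true}',
    b'{"response": "ignored", "done": true}',
]
print(collect_stream_text(sample))
```

In the original method the same iterable would come from `response.iter_lines()`; keeping the parsing separate from the HTTP call makes it straightforward to unit-test.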
-554
View File
@@ -1,554 +0,0 @@
import os
import json
import torch
from datetime import datetime
from PIL import Image
import io
import re
from decord import VideoReader
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info
import redis
import time
import gc
# Configuration
QWEN_MODEL_PATH = "/obscura/models/qwen/Qwen2-VL-7B-Instruct"
# Initialize the Qwen model (on cuda:0)
print("Initializing the Qwen model (cuda:0)...")
model = Qwen2VLForConditionalGeneration.from_pretrained(
QWEN_MODEL_PATH,
torch_dtype="auto",
device_map="cuda:0"
)
min_pixels = 128*28*28
max_pixels = 256*28*28
processor = AutoProcessor.from_pretrained(
QWEN_MODEL_PATH,
min_pixels=min_pixels,
max_pixels=max_pixels
)
# Load the action and environment categories from info.json
def load_config():
"""Load the configuration file."""
try:
with open('info.json', 'r', encoding='utf-8') as f:
config = json.load(f)
return config
except Exception as e:
print(f"Failed to load configuration file: {e}")
return {"actions": [], "environments": []}
# Load the configuration
CONFIG = load_config()
class MediaAnalysisSystem:
def __init__(self):
self.MAX_NUM_FRAMES = 10
self.device = "cuda:0"
self.qwen_model = model
self.qwen_processor = processor
# Use the loaded configuration
self.environments = CONFIG["environments"]
self.actions = CONFIG["actions"]
self.emotions = [
"钦佩", "赞赏", "欣赏","关心", "高兴", "", "乐观", "感激", "释然", "骄傲", "愉悦",
"愤怒", "烦恼", "焦虑", "尴尬", "失望", "厌恶", "恐惧", "悲伤", "懊悔", "羞耻","发呆",
"困惑", "好奇", "欲望", "惊讶", "实事求是", "中性", "赞叹","平静","放松","专注","思考"
]
self.objects = [
"办公桌椅","文件柜","打印机","饮水机","装饰植物","书架","储物柜","水瓶", "办公用品", "文件", "电脑","风扇","鼠标","键盘","纸巾","书","笔","袋子","盒子","水杯","杯子","马克杯","玻璃杯","文件夹","书包","书架","手机"
]
self.furniture = [
"椅子", "桌子", "咖啡桌", "文件柜", "床", "沙发","柜子","架子","摄像头","靠垫","办公椅","电视","白板","显示器","置物架","文件架"
]
self.features = [
"戴眼镜","不戴眼镜","长发","短发","长头发","短头发","戴帽子","不戴帽子","戴口罩","不戴口罩","男性","女性","胖","瘦","高","矮","男","女","成年人"
]
def encode_video(self, video_data):
def uniform_sample(l, n):
gap = len(l) / n
return [l[int(i * gap + gap / 2)] for i in range(n)]
video_file = io.BytesIO(video_data)
vr = VideoReader(video_file)
sample_fps = round(vr.get_avg_fps() / 1)
frame_idx = list(range(0, len(vr), sample_fps))
if len(frame_idx) > self.MAX_NUM_FRAMES:
frame_idx = uniform_sample(frame_idx, self.MAX_NUM_FRAMES)
frames = vr.get_batch(frame_idx).asnumpy()
frames = [Image.fromarray(v.astype('uint8')) for v in frames]
print('num frames:', len(frames))
return frames
def process_with_qwen(self, media_data, object_name, media_type='image'):
"""Process media with the Qwen model."""
if media_type == 'video':
frames = self.encode_video(media_data)
media_content = {"type": "video", "video": frames, "fps": 1.0}
else:
image = Image.open(io.BytesIO(media_data))
media_content = {"type": "image", "image": image}
messages = [
{
"role": "user",
"content": [
media_content,
{"type": "text", "text": self._get_analysis_prompt(media_type)}
],
}
]
text = self.qwen_processor.apply_chat_template(
messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = self.qwen_processor(
text=[text],
images=image_inputs,
videos=video_inputs,
padding=True,
return_tensors="pt",
)
inputs = inputs.to(self.device)
generated_ids = self.qwen_model.generate(**inputs, max_new_tokens=2048)
generated_ids_trimmed = [
out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
answer = self.qwen_processor.batch_decode(
generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)[0]
return {
"model": "qwen",
"original_answer": answer,
"extracted_info": self.extract_info(answer)
}
def _get_analysis_prompt(self, media_type):
"""Return the analysis prompt."""
return f"""你是一位视频描述专家,你擅长对视频进行详细的描述,请对这段监控视频进行详细分析,包括以下方面,并按照下面格式回答:
1. 环境场景
- 整体场景描述(室内/室外、光线条件等)
- 主要物品和家具列表
- 环境特征(如光线、整洁度等)
2. 人员统计
- 总人数:[数字]人
- 性别分布:[男性数量]/[女性数量]
(若无法确定准确人数,请注明"无法确定人数")
3. 人员特征分析
- 个人特征:性别、年龄段、着装、体态等
- 携带物品:详细描述随身物品及用途
- 表情/情绪状态
4. 行为分析
- 个人行为:移动方向、姿态、动作等
- 互动情况:人员之间的交互描述(若多人)
- 活动区域:人员活动的主要位置
5. 群体行为(若多人)
- 聚集形态
- 移动趋势
- 群体互动特点
6. 异常情况
- 可疑行为描述
- 异常活动标记
- 需要注意的安全隐患
请用清晰、有条理的格式描述,并突出重要发现。"""
def extract_info(self, answer):
"""Extract structured information from a Chinese-language answer."""
info = {
"environment": None,
"num_people": None,
"actions": [],
"objects": [],
"furniture": [],
"emotions": [],
"features": []
}
# Split the answer into sections
sections = {}
current_section = None
for line in answer.split('\n'):
if line.startswith('###'):
current_section = line.strip('# ').lower()
sections[current_section] = []
elif current_section and line.strip():
sections[current_section].append(line.strip())
# Extract actions from the "行为分析" (behavior analysis) section
if '行为分析' in sections:
behavior_text = ' '.join(sections['行为分析'])
# Use the loaded action list
for action in self.actions:
if action in behavior_text:
if action not in info["actions"]: # Avoid duplicates
info["actions"].append(action)
# Extract objects and furniture from the "环境场景" (environment) section
if '环境场景' in sections:
scene_text = ' '.join(sections['环境场景'])
for obj in self.objects: # objects is a class attribute
if obj in scene_text:
if obj not in info["objects"]:
info["objects"].append(obj)
for item in self.furniture: # furniture is a class attribute
if item in scene_text:
if item not in info["furniture"]:
info["furniture"].append(item)
# Extract features and emotions from the "人员特征分析" (person features) section
if '人员特征分析' in sections:
feature_text = ' '.join(sections['人员特征分析'])
for feature in self.features: # features is a class attribute
if feature in feature_text:
if feature not in info["features"]:
info["features"].append(feature)
for emotion in self.emotions: # emotions is a class attribute
if emotion in feature_text:
if emotion not in info["emotions"]:
info["emotions"].append(emotion)
# Patterns for people counts (Chinese numerals)
people_patterns = [
r'(\d+)\s*(人|个人|位|名|员工|用户|小朋友|成年人|女性|男性)',
r'(一|二|三|四|五|六|七|八|九|十)\s*(人|个人|位|名|员工|用户|小朋友|成年人|女性|男性)',
r'(一个|几个)\s*(人|个人|员工|用户|小朋友|成年人|女性|男性)',
r'\s*(名|位)\s*(人|员工|用户|小朋友|成年人|女性|男性)?',
r'(男|女)(性|生|士)',
r'(成年|未成年|青少年|老年)\s*(人|群体)',
r'(员工|职工|工人|学生|顾客|观众|游客|乘客)',
r'(群众|民众|大众|公众)',
r'(男女|老少|老幼|大人|小孩)'
]
for pattern in people_patterns:
match = re.search(pattern, answer)
if match:
if match.group(1).isdigit():
info["num_people"] = int(match.group(1))
elif match.group(1) in ['一个', '一']:
info["num_people"] = 1
else:
num_word_to_digit = {
'二': 2, '三': 3, '四': 4, '五': 5,
'六': 6, '七': 7, '八': 8, '九': 9, '十': 10
}
info["num_people"] = num_word_to_digit.get(match.group(1), 0)
break
return info
def process_video_folder(system, folder_path):
"""处理文件夹中的所有视频文件并保存到Redis"""
valid_extensions = {'.mp4', '.avi', '.mov', '.mkv'}
if not os.path.exists(folder_path):
raise MediaAnalysisError(f"错误:文件夹 '{folder_path}' 不存在")
video_files = [
f for f in os.listdir(folder_path)
if os.path.splitext(f)[1].lower() in valid_extensions
]
if not video_files:
raise MediaAnalysisError(f"错误:在文件夹 '{folder_path}' 中未找到支持的视频文件")
print(f"\n找到 {len(video_files)} 个视频文件,开始处理...\n")
# 创建VideoMonitor实例用于Redis操作
monitor = VideoMonitor(folder_path, system)
for i, video_file in enumerate(video_files, 1):
video_path = os.path.join(folder_path, video_file)
print(f"正在处理 ({i}/{len(video_files)}): {video_file}")
try:
# 使用VideoMonitor的process_new_video方法处理并保存到Redis
monitor.process_new_video(video_path)
print(f"✓ 成功处理并保存到Redis: {video_file}")
# 清理内存
if torch.cuda.is_available():
torch.cuda.empty_cache()
import gc
gc.collect()
except Exception as e:
print(f"✗ 处理失败 {video_file}: {str(e)}")
print(f"\n所有视频处理完成")
class MediaAnalysisError(Exception):
"""自定义媒体分析异常类"""
pass
# Watches the recordings directory and stores each video's analysis result in Redis
class VideoMonitor:
def __init__(self, recordings_path, system):
self.recordings_path = recordings_path
self.system = system
self.redis_clients = {
'A01': redis.Redis(
host="222.186.10.253",
port=6379,
password="Obscura@2024",
db=210
),
'B02': redis.Redis(
host="222.186.10.253",
port=6379,
password="Obscura@2024",
db=211
)
}
# 新增:初始化时加载已处理的视频记录
self.processed_videos = self._load_processed_videos()
def _load_processed_videos(self):
"""从Redis加载所有已处理的视频文件名"""
processed_videos = set()
try:
for camera_id, redis_client in self.redis_clients.items():
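# Scan every key in this camera's database (one key per hour bucket)
for key in redis_client.keys('*'):
# Each value is a JSON object keyed by video file name
data = redis_client.get(key)
if data:
hour_results = json.loads(data)
# Keep only the file names; the analysis results themselves are not retained
processed_videos.update(hour_results.keys())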
print(f"已从Redis加载 {len(processed_videos)} 个已处理文件记录")
return processed_videos
except Exception as e:
print(f"加载Redis处理记录时出错: {str(e)}")
return set()
def _get_redis_key(self, video_path):
try:
# 从路径获取摄像头ID (目录名)
dir_name = os.path.basename(os.path.dirname(video_path))
file_name = os.path.basename(video_path) # 例如:A01_20250105_134104.avi
# 从视频文件名中提取日期和时间
match = re.search(r'(\w+)_(\d{8})_(\d{2})\d{4}\.avi', file_name)
if match:
camera_id = match.group(1) # A01
date = match.group(2) # 20250105
hour = match.group(3) # 13 (从134104中提取)
# 生成正确的key: A01_20250105_1300
redis_key = f"{camera_id}_{date}_{hour}00"
return redis_key
print(f"文件名格式不匹配: {file_name}")
return None
except Exception as e:
print(f"生成Redis key失败: {str(e)}")
return None
def _is_processed(self, video_path):
"""检查视频是否已处理"""
file_name = os.path.basename(video_path)
return file_name in self.processed_videos
def process_new_video(self, video_path):
try:
# 处理前清理
if torch.cuda.is_available():
torch.cuda.empty_cache()
gc.collect()
file_name = os.path.basename(video_path)
# 检查是否已处理
if self._is_processed(video_path):
print(f"视频已处理过,跳过: {file_name}")
return True  # already processed, so skipping is not a failure
# The camera ID is the name of the video's parent directory
dir_name = os.path.basename(os.path.dirname(video_path))
# 使用_get_redis_key获取正确的key
redis_key = self._get_redis_key(video_path)
if not redis_key:
print(f"无法生成Redis key,跳过处理: {file_name}")
return False  # callers treat a falsy result as a failed video
# 添加视频文件检查
if not os.path.exists(video_path):
print(f"警告:视频文件不存在,跳过处理: {video_path}")
return False
if os.path.getsize(video_path) == 0:
print(f"警告:视频文件大小为0,跳过处理: {video_path}")
return False
# 处理视频
try:
with open(video_path, "rb") as f:
video_data = f.read()
try:
qwen_result = self.system.process_with_qwen(video_data, file_name, media_type='video')
except Exception as e:
print(f"处理视频内容失败,可能是损坏的视频文件: {file_name}")
print(f"错误详情: {str(e)}")
return False
# 从文件名提取时间戳
timestamp_match = re.search(r'(\d{4})(\d{2})(\d{2})_(\d{2})(\d{2})(\d{2})', file_name)
if timestamp_match:
year, month, day, hour, minute, second = timestamp_match.groups()
# 构建正确的时间戳格式 (YYYY-MM-DD HH:MM:SS)
timestamp = f"{year}-{month}-{day} {hour}:{minute}:{second}"
else:
timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
result = {
"video_analysis": {
"qwen-7B": {
"original_answer": qwen_result["original_answer"],
"extracted_info": qwen_result["extracted_info"]
}
},
"timestamp": timestamp # 使用从文件名提取的时间戳
}
# 保存到对应的Redis数据库
if dir_name in self.redis_clients:
redis_client = self.redis_clients[dir_name]
# 获取现有的小时数据(如果存在)
existing_data = redis_client.get(redis_key)
if existing_data:
hour_results = json.loads(existing_data)
hour_results[file_name] = result
else:
hour_results = {file_name: result}
# 保存更新后的数据
json_str = json.dumps(hour_results, ensure_ascii=False)
redis_client.set(redis_key, json_str)
print(f"成功保存到Redis,使用的key: {redis_key}") # 调试信息
# 处理完成后,更新内存中的记录
self.processed_videos.add(file_name)
except Exception as e:
print(f"读取视频文件失败: {str(e)}")
return False
except Exception as e:
print(f"处理视频时发生错误 {video_path}: {str(e)}")
return False
finally:
# 确保内存清理总是执行
if torch.cuda.is_available():
try:
torch.cuda.empty_cache()
gc.collect()
except Exception as e:
print(f"清理GPU内存时发生错误: {str(e)}")
return True
def process_existing_videos(self):
"""处理目录中现有的视频文件"""
videos_found = False
videos_processed = False # 新增标志,用于跟踪是否实际处理了视频
for camera_dir in os.listdir(self.recordings_path):
camera_path = os.path.join(self.recordings_path, camera_dir)
if not os.path.isdir(camera_path):
continue
# 获取所有.avi文件并按时间排序
video_files = []
for video_file in os.listdir(camera_path):
if video_file.endswith('.avi'):
video_path = os.path.join(camera_path, video_file)
video_files.append((video_path, os.path.getmtime(video_path)))
if video_files:
videos_found = True
# 按修改时间排序
video_files.sort(key=lambda x: x[1])
for video_path, _ in video_files:
if not self._is_processed(video_path):
print(f"处理现有视频: {video_path}")
if self.process_new_video(video_path):
videos_processed = True  # count only videos that were actually processed
# 只有当找到视频并且实际处理了视频时才返回True
return videos_found and videos_processed
def monitor_directories(self):
"""监控目录变化"""
current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
print(f"开始监控目录: {self.recordings_path} [{current_time}]")
while True:
try:
# 首先处理现有视频
for camera_dir in os.listdir(self.recordings_path):
camera_path = os.path.join(self.recordings_path, camera_dir)
if not os.path.isdir(camera_path):
continue
for video_file in os.listdir(camera_path):
if not video_file.endswith('.avi'):
continue
video_path = os.path.join(camera_path, video_file)
if not self._is_processed(video_path):
print(f"处理视频: {video_path}")
if not self.process_new_video(video_path):
print(f"视频处理失败,继续处理下一个: {video_path}")
continue
# 添加状态提示
current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
print(f"[{current_time}] 等待新视频中...")
# 休眠一段时间再检查
time.sleep(120)
except Exception as e:
print(f"监控过程出错: {str(e)}")
time.sleep(30) # 出错后等待30秒再继续
def main():
try:
system = MediaAnalysisSystem()
recordings_path = "/home/zydi/VLM/recordings" # 设置recordings目录路径
# 创建并启动监控器
monitor = VideoMonitor(recordings_path, system)
monitor.monitor_directories()
except Exception as e:
print(f"\n未预期的错误: {str(e)}")
if __name__ == "__main__":
main()
-325
View File
@@ -1,325 +0,0 @@
import os
import json
import torch
from datetime import datetime
from PIL import Image
import io
import re
from decord import VideoReader
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info
# 配置
QWEN_MODEL_PATH = "/obscura/models/qwen/Qwen2-VL-7B-Instruct"
# 初始化 Qwen 模型 (使用 cuda:0)
print("正在初始化 Qwen 模型 (cuda:0)...")
model = Qwen2VLForConditionalGeneration.from_pretrained(
QWEN_MODEL_PATH,
torch_dtype="auto",
device_map="cuda:0"
)
min_pixels = 128*28*28
max_pixels = 256*28*28
processor = AutoProcessor.from_pretrained(
QWEN_MODEL_PATH,
min_pixels=min_pixels,
max_pixels=max_pixels
)
# Load the action and environment categories from info.json
def load_config():
"""加载配置文件"""
try:
with open('info.json', 'r', encoding='utf-8') as f:
config = json.load(f)
return config
except Exception as e:
print(f"加载配置文件失败: {e}")
return {"actions": [], "environments": []}
# 加载配置
CONFIG = load_config()
class MediaAnalysisSystem:
def __init__(self):
self.MAX_NUM_FRAMES = 10
self.device = "cuda:0"
self.qwen_model = model
self.qwen_processor = processor
# 使用加载的配置
self.environments = CONFIG["environments"]
self.actions = CONFIG["actions"]
def encode_video(self, video_data):
def uniform_sample(l, n):
gap = len(l) / n
return [l[int(i * gap + gap / 2)] for i in range(n)]
video_file = io.BytesIO(video_data)
vr = VideoReader(video_file)
sample_fps = max(1, round(vr.get_avg_fps()))  # one frame per second; never a zero step
frame_idx = list(range(0, len(vr), sample_fps))
if len(frame_idx) > self.MAX_NUM_FRAMES:
frame_idx = uniform_sample(frame_idx, self.MAX_NUM_FRAMES)
frames = vr.get_batch(frame_idx).asnumpy()
frames = [Image.fromarray(v.astype('uint8')) for v in frames]
print('num frames:', len(frames))
return frames
def process_with_qwen(self, media_data, object_name, media_type='image'):
"""使用 Qwen 模型处理媒体"""
if media_type == 'video':
frames = self.encode_video(media_data)
media_content = {"type": "video", "video": frames, "fps": 1.0}
else:
image = Image.open(io.BytesIO(media_data))
media_content = {"type": "image", "image": image}
messages = [
{
"role": "user",
"content": [
media_content,
{"type": "text", "text": self._get_analysis_prompt(media_type)}
],
}
]
text = self.qwen_processor.apply_chat_template(
messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = self.qwen_processor(
text=[text],
images=image_inputs,
videos=video_inputs,
padding=True,
return_tensors="pt",
)
inputs = inputs.to(self.device)
generated_ids = self.qwen_model.generate(**inputs, max_new_tokens=2048)
generated_ids_trimmed = [
out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
answer = self.qwen_processor.batch_decode(
generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)[0]
return {
"model": "qwen",
"original_answer": answer,
"extracted_info": self.extract_info(answer)
}
def _get_analysis_prompt(self, media_type):
"""获取分析提示词"""
return f"""你是一位视频描述专家,你擅长对视频进行详细的描述,请对这段监控视频进行详细分析,包括以下方面,并按照下面格式回答:
1. 环境场景
- 整体场景描述(室内/室外、光线条件等)
- 主要物品和家具列表
- 环境特征(如光线、整洁度等)
2. 人员统计
- 总人数:[数字]人
- 性别分布:[男性数量]/[女性数量]
(若无法确定准确人数,请注明"无法确定人数")
3. 人员特征分析
- 个人特征:性别、年龄段、着装、体态等
- 携带物品:详细描述随身物品及用途
- 表情/情绪状态
4. 行为分析
- 个人行为:移动方向、姿态、动作等
- 互动情况:人员之间的交互描述(若多人)
- 活动区域:人员活动的主要位置
5. 群体行为(若多人)
- 聚集形态
- 移动趋势
- 群体互动特点
6. 异常情况
- 可疑行为描述
- 异常活动标记
- 需要注意的安全隐患
请用清晰、有条理的格式描述,并突出重要发现。"""
def extract_info(self, answer):
"""提取中文信息"""
info = {
"environment": None,
"num_people": None,
"actions": [],
"objects": [],
"furniture": [],
"emotions": [],
"features": []
}
# 使用加载的环境列表
for env in self.environments:
if env in answer.lower():
info["environment"] = env
break
# 中文数字模式
people_patterns = [
r'(\d+)\s*(人|个人|位|名|员工|用户|小朋友|成年人|女性|男性)',
r'(一|二|三|四|五|六|七|八|九|十)\s*(人|个人|位|名|员工|用户|小朋友|成年人|女性|男性)',
r'(一个|几个)\s*(人|个人|员工|用户|小朋友|成年人|女性|男性)',
r'\s*(名|位)\s*(人|员工|用户|小朋友|成年人|女性|男性)?',
r'(男|女)(性|生|士)',
r'(成年|未成年|青少年|老年)\s*(人|群体)',
r'(员工|职工|工人|学生|顾客|观众|游客|乘客)',
r'(群众|民众|大众|公众)',
r'(男女|老少|老幼|大人|小孩)'
]
for pattern in people_patterns:
match = re.search(pattern, answer)
if match:
if match.group(1).isdigit():
info["num_people"] = int(match.group(1))
elif match.group(1) in ['一个', '一']:
info["num_people"] = 1
else:
num_word_to_digit = {
'二': 2, '三': 3, '四': 4, '五': 5,
'六': 6, '七': 7, '八': 8, '九': 9, '十': 10
}
info["num_people"] = num_word_to_digit.get(match.group(1), 0)
break
# 使用加载的动作列表
for action in self.actions:
if action in answer:
info["actions"].append(action)
emotions = [
"钦佩", "赞赏", "欣赏","关心", "高兴", "", "乐观", "感激", "释然", "骄傲", "愉悦",
"愤怒", "烦恼", "焦虑", "尴尬", "失望", "厌恶", "恐惧", "悲伤", "懊悔", "羞耻","发呆",
"困惑", "好奇", "欲望", "惊讶", "实事求是", "中性", "赞叹","平静","放松","专注","思考",
]
objects = ["水瓶", "办公用品", "文件", "电脑","风扇","鼠标","键盘","纸巾","","","袋子","盒子","水杯","杯子","马克杯","玻璃杯","文件夹","书包","书架","文件柜","手机"]
furniture = ["椅子", "桌子", "咖啡桌", "文件柜", "", "沙发","柜子","架子","摄像头","靠垫","办公椅","电视","白板","显示器","置物架","文件架"]
features = ["戴眼镜","不戴眼镜","长发","短发","长头发","短头发","戴帽子","不戴帽子","戴口罩","不戴口罩","男性","女性","","","","","","","成年人"]
for obj in objects:
if obj in answer:
info["objects"].append(obj)
for item in furniture:
if item in answer:
info["furniture"].append(item)
for feature in features:
if feature in answer:
info["features"].append(feature)
for emotion in emotions:
if emotion in answer:
info["emotions"].append(emotion)
return info
def process_video_folder(system, folder_path, output_path=None):
"""处理文件夹中的所有视频文件并保存结果"""
valid_extensions = {'.mp4', '.avi', '.mov', '.mkv'}
results = {}
if not os.path.exists(folder_path):
raise MediaAnalysisError(f"错误:文件夹 '{folder_path}' 不存在")
if output_path is None:
output_path = os.getcwd()
elif not os.path.exists(output_path):
os.makedirs(output_path)
video_files = [
f for f in os.listdir(folder_path)
if os.path.splitext(f)[1].lower() in valid_extensions
]
if not video_files:
raise MediaAnalysisError(f"错误:在文件夹 '{folder_path}' 中未找到支持的视频文件")
print(f"\n找到 {len(video_files)} 个视频文件,开始处理...\n")
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
folder_name = os.path.basename(os.path.normpath(folder_path))
output_file = os.path.join(output_path, f"analysis_results_{folder_name}_{timestamp}.json")
for i, video_file in enumerate(video_files, 1):
video_path = os.path.join(folder_path, video_file)
print(f"正在处理 ({i}/{len(video_files)}): {video_file}")
try:
with open(video_path, "rb") as f:
video_data = f.read()
results[video_file] = {"video_analysis": {}}
# 只使用 Qwen 处理视频
print(f"使用 Qwen 处理视频: {video_file}")
qwen_result = system.process_with_qwen(video_data, video_file, media_type='video')
results[video_file]["video_analysis"]["qwen-7B"] = {
"original_answer": qwen_result["original_answer"],
"extracted_info": qwen_result["extracted_info"]
}
# 添加时间戳
results[video_file]["timestamp"] = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
# 保存结果
with open(output_file, 'w', encoding='utf-8') as f:
json.dump(results, f, ensure_ascii=False, indent=2)
print(f"✓ 成功处理并保存: {video_file}")
# 每个视频处理完后清理内存
if torch.cuda.is_available():
torch.cuda.empty_cache()
import gc
gc.collect()
except Exception as e:
print(f"✗ 处理失败 {video_file}: {str(e)}")
results[video_file] = {"error": str(e)}
with open(output_file, 'w', encoding='utf-8') as f:
json.dump(results, f, ensure_ascii=False, indent=2)
print(f"\n所有分析结果已保存到: {output_file}")
return results
class MediaAnalysisError(Exception):
"""自定义媒体分析异常类"""
pass
def main():
try:
system = MediaAnalysisSystem()
# 添加文件夹路径输入处理
folder_path = input("请输入视频文件夹路径: ").strip()
output_path = input("请输入结果保存路径 (直接回车使用当前目录): ").strip()
# 如果用户没有输入输出路径,则使用None(将使用当前目录)
output_path = output_path if output_path else None
# 处理文件夹中的视频
results = process_video_folder(system, folder_path, output_path)
# 显示处理统计
success_count = sum(1 for r in results.values() if "error" not in r)
print(f"\n处理完成!成功: {success_count}/{len(results)}")
except MediaAnalysisError as e:
print(f"\n错误: {str(e)}")
except Exception as e:
print(f"\n未预期的错误: {str(e)}")
if __name__ == "__main__":
main()
-45
View File
@@ -1,45 +0,0 @@
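# VLM Model Evaluation Summary
## Models tested: qwen-vl2-7B, qwen-vl2-2B, minicpm, llama3.2-vision, deepseek-vl2
Of these, qwen-vl2-7B, qwen-vl2-2B, and minicpm accept video input directly; the test videos are 10-second clips.
llama3.2-vision and deepseek-vl2 take images as input, so each 10-second clip was cut into 3-8 images.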
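
The summary does not say how the 3-8 images were chosen from each clip. The sketch below shows one plausible approach, assuming evenly spaced frames taken at the centre of each time slice. The function name `evenly_spaced_indices` and its parameters are hypothetical and are not part of the repository.

```python
def evenly_spaced_indices(total_frames: int, num_images: int) -> list[int]:
    """Return frame indices spread evenly over a clip (hypothetical helper).

    The clip is split into num_images equal slices and the frame at the
    centre of each slice is chosen. Integer arithmetic avoids float rounding.
    """
    if total_frames <= 0 or num_images <= 0:
        return []
    # Never request more images than the clip has frames.
    num_images = min(num_images, total_frames)
    return [
        (2 * i + 1) * total_frames // (2 * num_images)
        for i in range(num_images)
    ]


if __name__ == "__main__":
    # Example: a clip with 250 decoded frames, reduced to 5 images.
    print(evenly_spaced_indices(250, 5))
```

The returned indices can then be passed to whichever frame reader is in use; for a clip of 250 frames and 5 images this selects frames 25, 75, 125, 175, and 225.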
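## Test data
1. Indoor panorama from the upper-right corner
2. Sofa, front view
3. Upper-right corner, eating
4. Front-left, eating
5. Left side
## Test results
1. In close-range shots (front-left, straight ahead), qwen-vl2-7B, qwen-vl2-2B, minicpm, llama3.2-vision, and deepseek-vl2 all recognize the number of people fairly well and describe the scene well, including actions such as eating, sleeping, drinking water, and using a phone. Person recognition is about the same across models: facial expressions are hard to recognize, clothing is recognized fairly accurately, and gender and appearance are mediocre.
   1. minicpm guesses what the food is. Only llama3.2-vision and minicpm identified the food as corn, and only occasionally, not on every run.
   2. qwen-2B performs worst: it cannot recognize the number of people in the scene or the people themselves.
   3. llama3.2-vision describes the scene in great detail and is the only model that recognized the TV.
2. In the long-range shot (upper-right corner) where the action is eating, only llama3.2-vision recognized eating and drinking, but its headcount was inaccurate. All other models concluded the person was working.
3. In long-range shots (upper-right corner) with people moving around indoors, the behaviors include:
   1. Carrying cardboard boxes / tidying up: only qwen-7B accurately recognized the behavior, the headcount, and the person's appearance.
   2. Standing and drinking water: qwen-7B and llama3.2-vision recognized the drinking action; the other models mistook the cup for a phone. qwen-7B was most accurate on headcount and appearance, while llama3.2-vision was far off on both.
   3. Sitting and using a phone: deepseek-vl2 sometimes fails to recognize it; llama3.2-vision is worst on headcount and appearance.
   4. Sitting and reading: in the first video, minicpm, qwen-vl2-7B, and llama3.2-vision were confident that one person was sitting and reading. In another video, deepseek thought the person was playing with a cat and qwen-vl2-7B thought the person was playing guitar; llama3.2-vision was furthest off on headcount and appearance.
   5. Squatting and reading: all models judged this to be sitting and reading; only minicpm recognized the squatting.
   6. Walking and standing at the window: qwen-7B and deepseek-vl2 were most accurate on headcount, behavior, and movement path. The other models recognized the walking and the path but had problems with headcount.
   7. Moving the TV: qwen-vl2-7B accurately recognized this behavior; the other models only recognized bending over to check something.

   Overall for long-range shots, qwen-7B is most accurate at behavior recognition but worst at headcount; minicpm is relatively the most accurate on headcount and appearance.
4. In close-range shots from the left side, the behaviors include:
   1. Reading: recognized by all models.
   2. Using a phone: recognized by all models; llama3.2-vision still does poorly on headcount and appearance.
   3. Using a computer: recognized by all models.
   4. Drinking water: qwen-7B, minicpm, deepseek-vl2, and llama3.2-vision all recognized it.
   5. Writing on the whiteboard: recognized only by minicpm.
   6. Walking: recognized by all models, but llama3.2-vision is far off on headcount and appearance.
   7. Using a phone, then picking up a book and starting to read: recognized by all models.
## Summary
1. In close-range shots, the models recognize actions, facial expressions, and appearance fairly accurately; only llama3.2-vision does notably badly on headcount and person recognition.
2. In long-range shots, qwen-7B is relatively accurate at behavior recognition, and minicpm is relatively the most accurate on headcount and appearance.
3. In both close-range and long-range shots, llama3.2-vision is poor at headcount and person recognition: it reports many extra people and describes them erratically, but its behavior recognition is acceptable.
4. qwen-2B performs worst.
5. deepseek-vl2 is average.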