from openai import OpenAI

from Config.Config import LLM_API_KEY, LLM_BASE_URL

# Step 1: call the OCR model to extract the exam questions from the image
client = OpenAI(
    api_key=LLM_API_KEY,
    base_url=LLM_BASE_URL,
)
# Prompt (Chinese): "Please extract the exam questions from the image."
prompt = "请提取图片中的试题"
completion = client.chat.completions.create(
    model="qwen-vl-ocr-latest",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": "https://ylt.oss-cn-hangzhou.aliyuncs.com/HuangHai/Test/SourceWithPhoto.jpg",
                    # Lower and upper pixel-count limits for the input image
                    "min_pixels": 28 * 28 * 4,
                    "max_pixels": 28 * 28 * 8192,
                },
                {"type": "text", "text": prompt},
            ],
        }
    ],
)
# Text recognized by the OCR model
ocr_text = completion.choices[0].message.content

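# Ask the LLM to organize the OCR output question by question.
# The prompt below (Chinese) says: "I will provide an exam paper in markdown
# format; please extract the following for each question: 1. question number,
# 2. question text (automatically detect math formulas and wrap them in $$$),
# 3. options (if any), 4. answer, 5. explanation. Requirements: output the
# questions one by one without tables; detect math expressions and wrap them
# correctly in $$$; make sure special characters in formulas are escaped
# correctly; output nothing besides the question content. The content follows."
prompt = """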
我将提供一份markdown格式的试卷请帮我整理出每道题的以下内容
1. 题目序号
2. 题目内容自动识别并添加$$$包裹数学公式
3. 选项如果有
4. 答案
5. 解析
要求
- 一道题一道题输出不要使用表格
- 自动检测数学表达式并用$$$正确包裹
- 确保公式中的特殊字符正确转义
- 除题目内容外不要输出其它无关信息
内容如下
"""
prompt += ocr_text
completion = client.chat.completions.create(
    model="deepseek-v3",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt},
    ],
)
result = completion.choices[0].message.content
print(result)
# Save the organized result as a markdown file
with open("../output/数学OCR整理后的结果.md", "w", encoding="utf-8") as f:
    f.write(result)
print("Saved successfully!")