Initial commit with Llama API client and docs

Add Python scripts for Llama API chat clients, endpoint testing, and quick tests. Include documentation (README, CONTRIBUTING, 操作指南), license, and .gitignore. Supports multiple endpoints and models for OpenAI-compatible Llama API usage.
2025-09-19 21:44:02 +08:00
parent 4e28c131d2
commit 8a929936ad
18 changed files with 2073 additions and 0 deletions


@@ -0,0 +1,15 @@
{
"permissions": {
"allow": [
"Bash(pip install:*)",
"Bash(python:*)",
"Bash(ping:*)",
"Bash(curl:*)",
"Bash(dir)",
"Bash(git init:*)",
"Bash(git add:*)",
"Bash(git commit:*)"
],
"defaultMode": "acceptEdits"
}
}

.gitignore vendored Normal file

@@ -0,0 +1,102 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Virtual environments
venv/
ENV/
env/
.venv/
.env
# IDE
.vscode/
.idea/
*.swp
*.swo
*~
# Project specific
*.log
*.tmp
temp/
tmp/
logs/
output/
# API keys and secrets (if stored in separate config)
config.ini
secrets.json
.env.local
.env.production
# Test outputs
test_results/
*.test.txt
# Backup files
*.bak
*.backup
*.old
# Windows
Thumbs.db
ehthumbs.db
Desktop.ini
# macOS
.DS_Store
.AppleDouble
.LSOverride
# Linux
.directory
.Trash-*

CONTRIBUTING.md Normal file

@@ -0,0 +1,196 @@
# Contributing to Llama API Client
Thank you for your interest in contributing to Llama API Client! This document provides guidelines for contributing to the project.
## How to Contribute
### Reporting Bugs
Before creating bug reports, please check existing issues to avoid duplicates. When creating a bug report, include:
- A clear and descriptive title
- Steps to reproduce the issue
- Expected behavior
- Actual behavior
- System information (OS, Python version, etc.)
- Error messages or logs
### Suggesting Enhancements
Enhancement suggestions are welcome! Please provide:
- A clear and descriptive title
- Detailed description of the proposed feature
- Use cases and benefits
- Possible implementation approach
### Pull Requests
1. **Fork the repository** and create your branch from `main`
2. **Follow the coding style** used in the project
3. **Write clear commit messages**
4. **Add tests** if applicable
5. **Update documentation** if needed
6. **Test your changes** thoroughly
## Development Setup
```bash
# Clone your fork
git clone https://github.com/yourusername/llama-api-client.git
cd llama-api-client
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Run tests
python quick_test.py
```
## Coding Standards
### Python Style Guide
- Follow PEP 8
- Use meaningful variable names
- Add docstrings to functions and classes
- Keep functions focused and small
- Handle exceptions appropriately
### Example Code Style
```python
import re

def clean_response(text: str) -> str:
    """
    Clean AI response by removing special markers.

    Args:
        text: Raw response text from AI

    Returns:
        Cleaned text without special markers
    """
    cleaned_text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    cleaned_text = cleaned_text.replace("<|end|>", "").replace("<|start|>", "")
    return cleaned_text.strip()
```
### Commit Message Format
Use clear and descriptive commit messages:
- `feat:` New feature
- `fix:` Bug fix
- `docs:` Documentation changes
- `style:` Code style changes
- `refactor:` Code refactoring
- `test:` Test additions or changes
- `chore:` Maintenance tasks
Examples:
```
feat: Add support for new model endpoint
fix: Handle encoding errors in Windows terminals
docs: Update README with troubleshooting section
```
## Testing
### Running Tests
```bash
# Quick connection test
python quick_test.py
# Test all models
python test_all_models.py
# Test specific endpoint
python local_api_test.py
```
### Writing Tests
When adding new features, include appropriate tests:
```python
def test_endpoint_connection():
"""Test if endpoint is reachable"""
assert test_endpoint({"url": "...", "models": ["..."]})
```
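For a concrete starting point, here is a minimal runnable variant that exercises `test_endpoint` from `llama_full_api.py` against the first internal endpoint listed in the README (adjust the URL to your own network):
```python
from llama_full_api import test_endpoint

def test_endpoint_connection():
    """Test if the first internal endpoint is reachable"""
    assert test_endpoint({
        "url": "http://192.168.0.6:21180/v1",
        "models": ["gpt-oss-120b"],
    })

if __name__ == "__main__":
    test_endpoint_connection()
    print("Endpoint reachable")
```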
## Documentation
- Update README.md for user-facing changes
- Update 操作指南.md (the operation guide) for corresponding changes
- Add docstrings to all public functions
- Include usage examples for new features
## Code Review Process
1. All submissions require review before merging
2. Reviews focus on:
- Code quality and style
- Test coverage
- Documentation completeness
- Performance implications
- Security considerations
## Areas for Contribution
### Current Needs
- [ ] Add retry logic for failed connections
- [ ] Implement connection pooling
- [ ] Add streaming response support
- [ ] Create GUI interface
- [ ] Add conversation export/import
- [ ] Implement rate limiting
- [ ] Add proxy support
- [ ] Create Docker container
- [ ] Add more language examples
- [ ] Improve error messages
### Future Features
- Web interface
- Mobile app support
- Voice input/output
- Multi-user support
- Analytics dashboard
- Plugin system
## Community
### Communication Channels
- GitHub Issues: Bug reports and feature requests
- GitHub Discussions: General questions and discussions
- Pull Requests: Code contributions
### Code of Conduct
- Be respectful and inclusive
- Welcome newcomers
- Provide constructive feedback
- Focus on what is best for the community
- Show empathy towards others
## Questions?
If you have questions about contributing, feel free to:
1. Open an issue with the `question` label
2. Check existing documentation
3. Review closed issues for similar questions
## License
By contributing, you agree that your contributions will be licensed under the MIT License.
---
Thank you for contributing to Llama API Client! 🚀

LICENSE Normal file

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2025 Llama API Client Contributors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md Normal file

@@ -0,0 +1,201 @@
# Llama API Client
A Python client for connecting to Llama AI models through OpenAI-compatible API endpoints.
## Features
- 🌐 Support for both internal network and external API endpoints
- 🤖 Multiple model support (GPT-OSS-120B, DeepSeek-R1-671B, Qwen3-Embedding-8B)
- 💬 Interactive chat interface with conversation history
- 🔄 Automatic endpoint testing and failover
- 🧹 Automatic response cleaning (removes thinking tags and special markers)
- 📝 Full conversation context management
## Quick Start
### Installation
```bash
# Clone the repository
git clone https://github.com/yourusername/llama-api-client.git
cd llama-api-client
# Install dependencies
pip install -r requirements.txt
```
### Basic Usage
```python
from openai import OpenAI
# Configure API
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1"
# Create client
client = OpenAI(api_key=API_KEY, base_url=BASE_URL)
# Send request
response = client.chat.completions.create(
model="gpt-oss-120b",
messages=[{"role": "user", "content": "Hello!"}],
temperature=0.7,
max_tokens=200
)
print(response.choices[0].message.content)
```
### Run Interactive Chat
```bash
# Full-featured chat with all endpoints
python llama_full_api.py
# Internal network only
python llama_chat.py
# Quick test
python quick_test.py
```
## Available Endpoints
### Internal Network (Tested & Working ✅)
| Endpoint | URL | Status |
|----------|-----|--------|
| Internal 1 | `http://192.168.0.6:21180/v1` | ✅ Working |
| Internal 2 | `http://192.168.0.6:21181/v1` | ✅ Working |
| Internal 3 | `http://192.168.0.6:21182/v1` | ✅ Working |
| Internal 4 | `http://192.168.0.6:21183/v1` | ❌ Error 500 |
### External Network
| Endpoint | URL | Status |
|----------|-----|--------|
| GPT-OSS | `https://llama.theaken.com/v1/gpt-oss-120b` | 🔄 Pending |
| DeepSeek | `https://llama.theaken.com/v1/deepseek-r1-671b` | 🔄 Pending |
| General | `https://llama.theaken.com/v1` | 🔄 Pending |
## Project Structure
```
llama-api-client/
├── README.md # This file
├── requirements.txt # Python dependencies
├── 操作指南.md          # Operation guide
├── llama_full_api.py # Full-featured chat client
├── llama_chat.py # Internal network chat client
├── local_api_test.py # Endpoint testing tool
├── quick_test.py # Quick connection test
├── test_all_models.py # Model testing script
└── demo_chat.py # Demo chat with fallback
```
## Chat Commands
During chat sessions, you can use these commands:
- `exit` or `quit` - End the conversation
- `clear` - Clear conversation history
- `model` - Switch between available models
## Configuration
### API Key
```python
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
```
### Available Models
- `gpt-oss-120b` - GPT Open Source 120B parameters
- `deepseek-r1-671b` - DeepSeek R1 671B parameters
- `qwen3-embedding-8b` - Qwen3 Embedding 8B parameters
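Note that `qwen3-embedding-8b` is an embedding model rather than a chat model. If the server also exposes the OpenAI-compatible `/embeddings` route (not verified in this repository), usage would look roughly like this:
```python
from openai import OpenAI

API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1"

client = OpenAI(api_key=API_KEY, base_url=BASE_URL)

# Hypothetical: assumes the endpoint implements the OpenAI-compatible
# /embeddings route for qwen3-embedding-8b (not verified in this repo).
emb = client.embeddings.create(
    model="qwen3-embedding-8b",
    input="Hello, world",
)
print(len(emb.data[0].embedding))  # vector dimension
```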
## Troubleshooting
### Issue: 502 Bad Gateway
**Cause**: External API server is offline
**Solution**: Use internal network endpoints
### Issue: Connection Error
**Cause**: Not on internal network or incorrect IP
**Solution**:
1. Verify network connectivity: `ping 192.168.0.6`
2. Check firewall settings
3. Ensure you're on the same network
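Beyond `ping`, you can also check whether the API service itself answers. This is the same `/models` probe that `test_all_models.py` performs; a 200 (or a 405 on routes that reject GET) shows the server is reachable:
```python
import requests

API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1"

resp = requests.get(
    f"{BASE_URL}/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=10,
)
print(resp.status_code)
```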
### Issue: Encoding Error
**Cause**: Windows terminal encoding issues
**Solution**: Use English for conversations or modify terminal encoding
### Issue: Response Contains Special Markers
**Description**: Responses may contain `<think>`, `<|channel|>` tags
**Solution**: The client automatically removes these markers
## Response Cleaning
The client automatically removes these special markers from AI responses:
- `<think>...</think>` - Thinking process
- `<|channel|>...<|message|>` - Channel markers
- `<|end|>`, `<|start|>` - End/start markers
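The stripping logic lives in `clean_response()` in `llama_chat.py` and `llama_full_api.py`; a condensed sketch of the same approach:
```python
import re

def clean_response(text: str) -> str:
    # Drop <think>...</think> blocks (they may span multiple lines)
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    # Keep only the content after the last <|message|> marker
    if "<|channel|>" in text:
        text = text.split("<|message|>")[-1]
    # Strip the end/start markers and surrounding whitespace
    return text.replace("<|end|>", "").replace("<|start|>", "").strip()
```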
## Requirements
- Python 3.7+
- openai>=1.0.0
- requests (optional, for direct API calls)
## Development
### Testing Connection
```bash
python -c "from openai import OpenAI; client = OpenAI(api_key='YOUR_KEY', base_url='YOUR_URL'); print(client.chat.completions.create(model='gpt-oss-120b', messages=[{'role': 'user', 'content': 'test'}], max_tokens=5).choices[0].message.content)"
```
### Adding New Endpoints
Edit `ENDPOINTS` dictionary in `llama_full_api.py`:
```python
ENDPOINTS = {
"internal": [
{
"name": "New Endpoint",
"url": "http://new-endpoint/v1",
"models": ["gpt-oss-120b"]
}
]
}
```
## License
MIT License - See LICENSE file for details
## Contributing
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## Support
For issues or questions:
1. Check the [操作指南.md](操作指南.md) operation guide for a detailed walkthrough
2. Open an issue on GitHub
3. Contact the API administrator for server-related issues
## Acknowledgments
- Built with OpenAI Python SDK
- Compatible with OpenAI API format
- Supports multiple Llama model variants
---
**Last Updated**: 2025-09-19
**Version**: 1.0.0
**Status**: Internal endpoints working, external endpoints pending

demo_chat.py Normal file

@@ -0,0 +1,124 @@
"""
Llama API 對話程式 (示範版本)
當 API 伺服器恢復後,可以使用此程式進行對話
"""
from openai import OpenAI
import time
# API 設定
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "https://llama.theaken.com/v1"
def simulate_chat():
"""模擬對話功能(用於展示)"""
print("\n" + "="*50)
print("Llama AI 對話系統 - 示範模式")
print("="*50)
print("\n[注意] API 伺服器目前離線,以下為模擬對話")
print("當伺服器恢復後,將自動連接真實 API\n")
# 模擬回應
demo_responses = [
"你好!我是 Llama AI 助手,很高興為你服務。",
"這是一個示範回應。當 API 伺服器恢復後,你將收到真實的 AI 回應。",
"我可以回答問題、協助編程、翻譯文字等多種任務。",
"請問有什麼我可以幫助你的嗎?"
]
response_index = 0
print("輸入 'exit' 結束對話\n")
while True:
user_input = input("你: ").strip()
if user_input.lower() in ['exit', 'quit']:
print("\n再見!")
break
if not user_input:
continue
# 模擬思考時間
print("\nAI 思考中", end="")
for _ in range(3):
time.sleep(0.3)
print(".", end="", flush=True)
print()
# 顯示模擬回應
print(f"\nAI: {demo_responses[response_index % len(demo_responses)]}")
response_index += 1
def real_chat():
"""實際對話功能(當 API 可用時)"""
client = OpenAI(api_key=API_KEY, base_url=BASE_URL)
print("\n" + "="*50)
print("Llama AI 對話系統")
print("="*50)
print("\n已連接到 Llama API")
print("輸入 'exit' 結束對話\n")
messages = []
while True:
user_input = input("你: ").strip()
if user_input.lower() in ['exit', 'quit']:
print("\n再見!")
break
if not user_input:
continue
messages.append({"role": "user", "content": user_input})
try:
print("\nAI 思考中...")
response = client.chat.completions.create(
model="gpt-oss-120b",
messages=messages,
temperature=0.7,
max_tokens=1000
)
ai_response = response.choices[0].message.content
print(f"\nAI: {ai_response}")
messages.append({"role": "assistant", "content": ai_response})
except Exception as e:
print(f"\n[錯誤] {str(e)[:100]}")
print("無法取得回應,請稍後再試")
def main():
print("檢查 API 連接狀態...")
# 嘗試連接 API
try:
client = OpenAI(api_key=API_KEY, base_url=BASE_URL)
# 快速測試
response = client.chat.completions.create(
model="gpt-oss-120b",
messages=[{"role": "user", "content": "test"}],
max_tokens=10,
timeout=5
)
print("[成功] API 已連接")
real_chat()
except Exception as e:
error_msg = str(e)
if "502" in error_msg or "Bad gateway" in error_msg:
print("[提示] API 伺服器目前離線 (502 錯誤)")
print("進入示範模式...")
simulate_chat()
else:
print(f"[錯誤] 無法連接: {error_msg[:100]}")
print("\n是否要進入示範模式? (y/n): ", end="")
if input().lower() == 'y':
simulate_chat()
if __name__ == "__main__":
main()

llama_chat.py Normal file

@@ -0,0 +1,196 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Llama 內網 API 對話程式
支援多個端點和模型選擇
"""
from openai import OpenAI
import sys
import re
# API 配置
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
# 可用端點 (前 3 個已測試可用)
ENDPOINTS = [
"http://192.168.0.6:21180/v1",
"http://192.168.0.6:21181/v1",
"http://192.168.0.6:21182/v1",
"http://192.168.0.6:21183/v1"
]
# 模型列表
MODELS = [
"gpt-oss-120b",
"deepseek-r1-671b",
"qwen3-embedding-8b"
]
def clean_response(text):
"""清理 AI 回應中的特殊標記"""
# 移除思考標記
if "<think>" in text:
text = re.sub(r'<think>.*?</think>', '', text, flags=re.DOTALL)
# 移除 channel 標記
if "<|channel|>" in text:
parts = text.split("<|message|>")
if len(parts) > 1:
text = parts[-1]
# 移除結束標記
text = text.replace("<|end|>", "").replace("<|start|>", "")
# 清理多餘空白
text = text.strip()
return text
def test_endpoint(endpoint):
"""測試端點是否可用"""
try:
client = OpenAI(api_key=API_KEY, base_url=endpoint)
response = client.chat.completions.create(
model="gpt-oss-120b",
messages=[{"role": "user", "content": "Hi"}],
max_tokens=10,
timeout=5
)
return True
except:
return False
def chat_session(endpoint, model):
"""對話主程式"""
print("\n" + "="*60)
print("Llama AI 對話系統")
print("="*60)
print(f"端點: {endpoint}")
print(f"模型: {model}")
print("\n指令:")
print(" exit/quit - 結束對話")
print(" clear - 清空對話歷史")
print(" model - 切換模型")
print("-"*60)
client = OpenAI(api_key=API_KEY, base_url=endpoint)
messages = []
while True:
try:
user_input = input("\n你: ").strip()
if not user_input:
continue
if user_input.lower() in ['exit', 'quit']:
print("再見!")
break
if user_input.lower() == 'clear':
messages = []
print("[系統] 對話歷史已清空")
continue
if user_input.lower() == 'model':
print("\n可用模型:")
for i, m in enumerate(MODELS, 1):
print(f" {i}. {m}")
choice = input("選擇 (1-3): ").strip()
if choice in ['1', '2', '3']:
model = MODELS[int(choice)-1]
print(f"[系統] 已切換到 {model}")
continue
messages.append({"role": "user", "content": user_input})
print("\nAI 思考中...", end="", flush=True)
try:
response = client.chat.completions.create(
model=model,
messages=messages,
temperature=0.7,
max_tokens=1000
)
ai_response = response.choices[0].message.content
ai_response = clean_response(ai_response)
print("\r" + " "*20 + "\r", end="") # 清除 "思考中..."
print(f"AI: {ai_response}")
messages.append({"role": "assistant", "content": ai_response})
except UnicodeEncodeError:
print("\r[錯誤] 編碼問題,請使用英文對話")
messages.pop() # 移除最後的用戶訊息
except Exception as e:
print(f"\r[錯誤] {str(e)[:100]}")
messages.pop() # 移除最後的用戶訊息
except KeyboardInterrupt:
print("\n\n[中斷] 使用 exit 命令正常退出")
continue
except EOFError:
print("\n再見!")
break
def main():
print("="*60)
print("Llama 內網 API 對話程式")
print("="*60)
# 測試端點
print("\n正在檢查可用端點...")
available = []
for i, endpoint in enumerate(ENDPOINTS[:3], 1): # 只測試前3個
print(f" 測試 {endpoint}...", end="", flush=True)
if test_endpoint(endpoint):
print(" [OK]")
available.append(endpoint)
else:
print(" [失敗]")
if not available:
print("\n[錯誤] 沒有可用的端點")
sys.exit(1)
# 選擇端點
if len(available) == 1:
selected_endpoint = available[0]
print(f"\n使用端點: {selected_endpoint}")
else:
print(f"\n找到 {len(available)} 個可用端點:")
for i, ep in enumerate(available, 1):
print(f" {i}. {ep}")
print("\n選擇端點 (預設: 1): ", end="")
choice = input().strip()
if choice and choice.isdigit() and 1 <= int(choice) <= len(available):
selected_endpoint = available[int(choice)-1]
else:
selected_endpoint = available[0]
# 選擇模型
print("\n可用模型:")
for i, model in enumerate(MODELS, 1):
print(f" {i}. {model}")
print("\n選擇模型 (預設: 1): ", end="")
choice = input().strip()
if choice in ['1', '2', '3']:
selected_model = MODELS[int(choice)-1]
else:
selected_model = MODELS[0]
# 開始對話
chat_session(selected_endpoint, selected_model)
if __name__ == "__main__":
try:
main()
except KeyboardInterrupt:
print("\n\n程式已退出")
except Exception as e:
print(f"\n[錯誤] {e}")
sys.exit(1)

llama_full_api.py Normal file

@@ -0,0 +1,293 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Llama API 完整對話程式
支援內網和外網端點
"""
from openai import OpenAI
import requests
import sys
import re
from datetime import datetime
# API 金鑰
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
# API 端點配置
ENDPOINTS = {
"內網": [
{
"name": "內網端點 1 (21180)",
"url": "http://192.168.0.6:21180/v1",
"models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
},
{
"name": "內網端點 2 (21181)",
"url": "http://192.168.0.6:21181/v1",
"models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
},
{
"name": "內網端點 3 (21182)",
"url": "http://192.168.0.6:21182/v1",
"models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
}
],
"外網": [
{
"name": "外網 GPT-OSS-120B",
"url": "https://llama.theaken.com/v1/gpt-oss-120b",
"models": ["gpt-oss-120b"]
},
{
"name": "外網 DeepSeek-R1-671B",
"url": "https://llama.theaken.com/v1/deepseek-r1-671b",
"models": ["deepseek-r1-671b"]
},
{
"name": "外網通用端點",
"url": "https://llama.theaken.com/v1",
"models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
}
]
}
def clean_response(text):
"""清理 AI 回應中的特殊標記"""
# 移除思考標記
if "<think>" in text:
text = re.sub(r'<think>.*?</think>', '', text, flags=re.DOTALL)
# 移除 channel 標記
if "<|channel|>" in text:
parts = text.split("<|message|>")
if len(parts) > 1:
text = parts[-1]
# 移除結束標記
text = text.replace("<|end|>", "").replace("<|start|>", "")
# 清理多餘空白
text = text.strip()
return text
def test_endpoint(endpoint_info):
"""測試端點是否可用"""
url = endpoint_info["url"]
model = endpoint_info["models"][0] # 使用第一個模型測試
try:
# 對於特定模型的 URL需要特殊處理
if "/gpt-oss-120b" in url or "/deepseek-r1-671b" in url:
# 這些可能是特定模型的端點
base_url = url.rsplit("/", 1)[0] # 移除模型名稱部分
else:
base_url = url
client = OpenAI(api_key=API_KEY, base_url=base_url)
response = client.chat.completions.create(
model=model,
messages=[{"role": "user", "content": "test"}],
max_tokens=5,
timeout=8
)
return True
except Exception as e:
# 也嘗試使用 requests 直接測試
try:
headers = {
"Authorization": f"Bearer {API_KEY}",
"Content-Type": "application/json"
}
test_url = f"{url}/chat/completions" if not url.endswith("/chat/completions") else url
data = {
"model": model,
"messages": [{"role": "user", "content": "test"}],
"max_tokens": 5
}
response = requests.post(test_url, headers=headers, json=data, timeout=8)
return response.status_code == 200
except:
return False
def test_all_endpoints():
"""測試所有端點"""
print("\n" + "="*60)
print("測試 API 端點連接")
print("="*60)
available_endpoints = []
# 測試內網端點
print("\n[內網端點測試]")
for endpoint in ENDPOINTS["內網"]:
print(f" 測試 {endpoint['name']}...", end="", flush=True)
if test_endpoint(endpoint):
print(" [OK]")
available_endpoints.append(("內網", endpoint))
else:
print(" [FAIL]")
# 測試外網端點
print("\n[外網端點測試]")
for endpoint in ENDPOINTS["外網"]:
print(f" 測試 {endpoint['name']}...", end="", flush=True)
if test_endpoint(endpoint):
print(" [OK]")
available_endpoints.append(("外網", endpoint))
else:
print(" [FAIL]")
return available_endpoints
def chat_session(endpoint_info):
"""對話主程式"""
print("\n" + "="*60)
print("Llama AI 對話系統")
print("="*60)
print(f"端點: {endpoint_info['name']}")
print(f"URL: {endpoint_info['url']}")
print(f"可用模型: {', '.join(endpoint_info['models'])}")
print("\n指令:")
print(" exit/quit - 結束對話")
print(" clear - 清空對話歷史")
print(" model - 切換模型")
print("-"*60)
# 處理 URL
url = endpoint_info["url"]
if "/gpt-oss-120b" in url or "/deepseek-r1-671b" in url:
base_url = url.rsplit("/", 1)[0]
else:
base_url = url
client = OpenAI(api_key=API_KEY, base_url=base_url)
# 選擇初始模型
if len(endpoint_info['models']) == 1:
current_model = endpoint_info['models'][0]
else:
print("\n選擇模型:")
for i, model in enumerate(endpoint_info['models'], 1):
print(f" {i}. {model}")
choice = input("選擇 (預設: 1): ").strip()
if choice.isdigit() and 1 <= int(choice) <= len(endpoint_info['models']):
current_model = endpoint_info['models'][int(choice)-1]
else:
current_model = endpoint_info['models'][0]
print(f"\n使用模型: {current_model}")
messages = []
while True:
try:
user_input = input("\n你: ").strip()
if not user_input:
continue
if user_input.lower() in ['exit', 'quit']:
print("再見!")
break
if user_input.lower() == 'clear':
messages = []
print("[系統] 對話歷史已清空")
continue
if user_input.lower() == 'model':
if len(endpoint_info['models']) == 1:
print(f"[系統] 此端點只支援 {endpoint_info['models'][0]}")
else:
print("\n可用模型:")
for i, m in enumerate(endpoint_info['models'], 1):
print(f" {i}. {m}")
choice = input("選擇: ").strip()
if choice.isdigit() and 1 <= int(choice) <= len(endpoint_info['models']):
current_model = endpoint_info['models'][int(choice)-1]
print(f"[系統] 已切換到 {current_model}")
continue
messages.append({"role": "user", "content": user_input})
print("\nAI 思考中...", end="", flush=True)
try:
response = client.chat.completions.create(
model=current_model,
messages=messages,
temperature=0.7,
max_tokens=1000
)
ai_response = response.choices[0].message.content
ai_response = clean_response(ai_response)
print("\r" + " "*20 + "\r", end="")
print(f"AI: {ai_response}")
messages.append({"role": "assistant", "content": ai_response})
except Exception as e:
print(f"\r[錯誤] {str(e)[:100]}")
messages.pop()
except KeyboardInterrupt:
print("\n\n[中斷] 使用 exit 命令正常退出")
continue
except EOFError:
print("\n再見!")
break
def main():
print("="*60)
print("Llama API 完整對話程式")
print(f"時間: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print("="*60)
# 測試所有端點
available = test_all_endpoints()
if not available:
print("\n[錯誤] 沒有可用的端點")
print("\n可能的原因:")
print("1. 網路連接問題")
print("2. API 服務離線")
print("3. 防火牆阻擋")
sys.exit(1)
# 顯示可用端點
print("\n" + "="*60)
print(f"找到 {len(available)} 個可用端點:")
print("="*60)
for i, (network_type, endpoint) in enumerate(available, 1):
print(f"{i}. [{network_type}] {endpoint['name']}")
print(f" URL: {endpoint['url']}")
print(f" 模型: {', '.join(endpoint['models'])}")
# 選擇端點
print("\n選擇端點 (預設: 1): ", end="")
choice = input().strip()
if choice.isdigit() and 1 <= int(choice) <= len(available):
selected = available[int(choice)-1][1]
else:
selected = available[0][1]
# 開始對話
chat_session(selected)
if __name__ == "__main__":
try:
main()
except KeyboardInterrupt:
print("\n\n程式已退出")
except Exception as e:
print(f"\n[錯誤] {e}")
import traceback
traceback.print_exc()
sys.exit(1)

llama_test.py Normal file

@@ -0,0 +1,99 @@
from openai import OpenAI
import sys
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "https://llama.theaken.com/v1"
AVAILABLE_MODELS = [
"gpt-oss-120b",
"deepseek-r1-671b",
"qwen3-embedding-8b"
]
def chat_with_llama(model_name="gpt-oss-120b"):
    client = OpenAI(
        api_key=API_KEY,
        base_url=BASE_URL
    )
    print(f"\nUsing model: {model_name}")
    print("-" * 50)
    print("Type 'exit' or 'quit' to end the conversation")
    print("-" * 50)
    messages = []
    while True:
        user_input = input("\nYou: ").strip()
        if user_input.lower() in ['exit', 'quit']:
            print("Conversation ended")
            break
        if not user_input:
            continue
        messages.append({"role": "user", "content": user_input})
        try:
            response = client.chat.completions.create(
                model=model_name,
                messages=messages,
                temperature=0.7,
                max_tokens=2000
            )
            assistant_reply = response.choices[0].message.content
            print(f"\nAI: {assistant_reply}")
            messages.append({"role": "assistant", "content": assistant_reply})
        except Exception as e:
            print(f"\nError: {str(e)}")
            print("Please check your network connection and API settings")

def test_connection():
    print("Testing connection to the Llama API...")
    client = OpenAI(
        api_key=API_KEY,
        base_url=BASE_URL
    )
    try:
        response = client.chat.completions.create(
            model="gpt-oss-120b",
            messages=[{"role": "user", "content": "Hello, this is a test message."}],
            max_tokens=50
        )
        print("[OK] Connected successfully!")
        print(f"Test response: {response.choices[0].message.content}")
        return True
    except Exception as e:
        print(f"[ERROR] Connection failed: {str(e)[:200]}")
        return False

def main():
    print("=" * 50)
    print("Llama Model Chat Test Program")
    print("=" * 50)
    print("\nAvailable models:")
    for i, model in enumerate(AVAILABLE_MODELS, 1):
        print(f"  {i}. {model}")
    if test_connection():
        print("\nSelect a model to use (enter 1-3, default: 1):")
        choice = input().strip()
        if choice == "2":
            model = AVAILABLE_MODELS[1]
        elif choice == "3":
            model = AVAILABLE_MODELS[2]
        else:
            model = AVAILABLE_MODELS[0]
        chat_with_llama(model)

if __name__ == "__main__":
    main()

local_api_test.py Normal file

@@ -0,0 +1,243 @@
"""
內網 Llama API 測試程式
使用 OpenAI 相容格式連接到本地 API 端點
"""
from openai import OpenAI
import requests
import json
from datetime import datetime
# API 配置
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
# 內網端點列表
LOCAL_ENDPOINTS = [
"http://192.168.0.6:21180/v1",
"http://192.168.0.6:21181/v1",
"http://192.168.0.6:21182/v1",
"http://192.168.0.6:21183/v1"
]
# 可用模型
MODELS = [
"gpt-oss-120b",
"deepseek-r1-671b",
"qwen3-embedding-8b"
]
def test_endpoint_with_requests(endpoint, model="gpt-oss-120b"):
"""使用 requests 測試端點"""
print(f"\n[使用 requests 測試]")
print(f"端點: {endpoint}")
print(f"模型: {model}")
headers = {
"Authorization": f"Bearer {API_KEY}",
"Content-Type": "application/json"
}
data = {
"model": model,
"messages": [
{"role": "user", "content": "Say 'Hello, I am working!' if you can see this."}
],
"temperature": 0.7,
"max_tokens": 50
}
try:
response = requests.post(
f"{endpoint}/chat/completions",
headers=headers,
json=data,
timeout=10
)
print(f"HTTP 狀態碼: {response.status_code}")
if response.status_code == 200:
result = response.json()
if 'choices' in result:
content = result['choices'][0]['message']['content']
print(f"[SUCCESS] AI 回應: {content}")
return True
else:
print("[ERROR] 回應格式不正確")
else:
print(f"[ERROR] HTTP {response.status_code}")
if response.status_code != 502: # 避免顯示 HTML 錯誤頁
print(f"詳情: {response.text[:200]}")
except requests.exceptions.ConnectTimeout:
print("[TIMEOUT] 連接超時")
except requests.exceptions.ConnectionError:
print("[CONNECTION ERROR] 無法連接到端點")
except Exception as e:
print(f"[ERROR] {str(e)[:100]}")
return False
def test_endpoint_with_openai(endpoint, model="gpt-oss-120b"):
"""使用 OpenAI SDK 測試端點"""
print(f"\n[使用 OpenAI SDK 測試]")
print(f"端點: {endpoint}")
print(f"模型: {model}")
try:
client = OpenAI(
api_key=API_KEY,
base_url=endpoint,
timeout=10.0
)
response = client.chat.completions.create(
model=model,
messages=[
{"role": "user", "content": "Hello, please respond with a simple greeting."}
],
temperature=0.7,
max_tokens=50
)
content = response.choices[0].message.content
print(f"[SUCCESS] AI 回應: {content}")
return True, client
except Exception as e:
error_str = str(e)
if "Connection error" in error_str:
print("[CONNECTION ERROR] 無法連接到端點")
elif "timeout" in error_str.lower():
print("[TIMEOUT] 請求超時")
elif "502" in error_str:
print("[ERROR] 502 Bad Gateway")
else:
print(f"[ERROR] {error_str[:100]}")
return False, None
def find_working_endpoint():
"""尋找可用的端點"""
print("="*60)
print(f"內網 API 端點測試 - {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print("="*60)
working_endpoints = []
for endpoint in LOCAL_ENDPOINTS:
print(f"\n測試端點: {endpoint}")
print("-"*40)
# 先用 requests 快速測試
if test_endpoint_with_requests(endpoint):
working_endpoints.append(endpoint)
print(f"[OK] 端點 {endpoint} 可用!")
else:
# 再用 OpenAI SDK 測試
success, _ = test_endpoint_with_openai(endpoint)
if success:
working_endpoints.append(endpoint)
print(f"[OK] 端點 {endpoint} 可用!")
return working_endpoints
def interactive_chat(endpoint, model="gpt-oss-120b"):
"""互動式對話"""
print(f"\n連接到: {endpoint}")
print(f"使用模型: {model}")
print("="*60)
print("開始對話 (輸入 'exit' 結束)")
print("="*60)
client = OpenAI(
api_key=API_KEY,
base_url=endpoint
)
messages = []
while True:
user_input = input("\n你: ").strip()
if user_input.lower() in ['exit', 'quit']:
print("對話結束")
break
if not user_input:
continue
messages.append({"role": "user", "content": user_input})
try:
print("\nAI 思考中...")
response = client.chat.completions.create(
model=model,
messages=messages,
temperature=0.7,
max_tokens=1000
)
ai_response = response.choices[0].message.content
print(f"\nAI: {ai_response}")
messages.append({"role": "assistant", "content": ai_response})
except Exception as e:
print(f"\n[ERROR] {str(e)[:100]}")
def main():
# 尋找可用端點
working_endpoints = find_working_endpoint()
print("\n" + "="*60)
print("測試結果總結")
print("="*60)
if working_endpoints:
print(f"\n找到 {len(working_endpoints)} 個可用端點:")
for i, endpoint in enumerate(working_endpoints, 1):
print(f" {i}. {endpoint}")
# 選擇端點
if len(working_endpoints) == 1:
selected_endpoint = working_endpoints[0]
print(f"\n自動選擇唯一可用端點: {selected_endpoint}")
else:
print(f"\n請選擇要使用的端點 (1-{len(working_endpoints)}):")
choice = input().strip()
try:
idx = int(choice) - 1
if 0 <= idx < len(working_endpoints):
selected_endpoint = working_endpoints[idx]
else:
selected_endpoint = working_endpoints[0]
except:
selected_endpoint = working_endpoints[0]
# 選擇模型
print("\n可用模型:")
for i, model in enumerate(MODELS, 1):
print(f" {i}. {model}")
print("\n請選擇模型 (1-3, 預設: 1):")
choice = input().strip()
if choice == "2":
selected_model = MODELS[1]
elif choice == "3":
selected_model = MODELS[2]
else:
selected_model = MODELS[0]
# 開始對話
interactive_chat(selected_endpoint, selected_model)
else:
print("\n[ERROR] 沒有找到可用的端點")
print("\n可能的原因:")
print("1. 內網 API 服務未啟動")
print("2. 防火牆阻擋了連接")
print("3. IP 地址或端口設定錯誤")
print("4. 不在同一個網路環境")
if __name__ == "__main__":
main()

quick_test.py Normal file

@@ -0,0 +1,54 @@
"""
快速測試內網 Llama API
"""
from openai import OpenAI
# API 設定
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1" # 使用第一個可用端點
def quick_test():
print("連接到內網 API...")
print(f"端點: {BASE_URL}")
print("-" * 50)
client = OpenAI(
api_key=API_KEY,
base_url=BASE_URL
)
# 測試對話
test_messages = [
"你好,請自我介紹",
"1 + 1 等於多少?",
"今天天氣如何?"
]
for msg in test_messages:
print(f"\n問: {msg}")
try:
response = client.chat.completions.create(
model="gpt-oss-120b",
messages=[
{"role": "user", "content": msg}
],
temperature=0.7,
max_tokens=200
)
answer = response.choices[0].message.content
# 清理可能的思考標記
if "<think>" in answer:
answer = answer.split("</think>")[-1].strip()
if "<|channel|>" in answer:
answer = answer.split("<|message|>")[-1].strip()
print(f"答: {answer}")
except Exception as e:
print(f"錯誤: {str(e)[:100]}")
if __name__ == "__main__":
quick_test()

requirements.txt Normal file

@@ -0,0 +1 @@
openai>=1.0.0

simple_llama_test.py Normal file

@@ -0,0 +1,46 @@
import requests
import json
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "https://llama.theaken.com/v1/chat/completions"
def test_api():
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }
    data = {
        "model": "gpt-oss-120b",
        "messages": [
            {"role": "user", "content": "Hello, can you respond?"}
        ],
        "temperature": 0.7,
        "max_tokens": 100
    }
    print("Testing the API connection...")
    print(f"URL: {BASE_URL}")
    print("Model: gpt-oss-120b")
    print("-" * 50)
    try:
        response = requests.post(BASE_URL, headers=headers, json=data, timeout=30)
        if response.status_code == 200:
            result = response.json()
            print("[Success] API response:")
            print(result['choices'][0]['message']['content'])
        else:
            print(f"[Error] HTTP {response.status_code}")
            print(f"Response body: {response.text[:500]}")
    except requests.exceptions.Timeout:
        print("[Error] Request timed out")
    except requests.exceptions.ConnectionError:
        print("[Error] Could not connect to the server")
    except Exception as e:
        print(f"[Error] {str(e)}")

if __name__ == "__main__":
    test_api()

test_all_models.py Normal file

@@ -0,0 +1,143 @@
import requests
import json
import time
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "https://llama.theaken.com/v1"
MODELS = [
"gpt-oss-120b",
"deepseek-r1-671b",
"qwen3-embedding-8b"
]
def test_model(model_name):
    """Test a single model"""
    print(f"\n[Testing model: {model_name}]")
    print("-" * 40)
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }
    # Test the chat completions endpoint
    chat_url = f"{BASE_URL}/chat/completions"
    data = {
        "model": model_name,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Say 'Hello, I am working!' if you can see this message."}
        ],
        "temperature": 0.5,
        "max_tokens": 50
    }
    try:
        print(f"Connecting to: {chat_url}")
        response = requests.post(chat_url, headers=headers, json=data, timeout=30)
        print(f"HTTP status code: {response.status_code}")
        if response.status_code == 200:
            result = response.json()
            if 'choices' in result and len(result['choices']) > 0:
                content = result['choices'][0]['message']['content']
                print(f"[SUCCESS] AI response: {content}")
                return True
            else:
                print("[ERROR] Unexpected response format")
                print(f"Response body: {json.dumps(result, indent=2)}")
        else:
            print("[ERROR] Error response")
            # Check whether this is an HTML error page
            if response.text.startswith('<!DOCTYPE'):
                print("Received an HTML error page (probably a 502 Bad Gateway)")
            else:
                print(f"Response body: {response.text[:300]}")
    except requests.exceptions.Timeout:
        print("[TIMEOUT] Request timed out (30 seconds)")
    except requests.exceptions.ConnectionError as e:
        print(f"[CONNECTION ERROR]: {str(e)[:100]}")
    except Exception as e:
        print(f"[UNEXPECTED ERROR]: {str(e)[:100]}")
    return False

def test_api_endpoints():
    """Test the availability of different API endpoints"""
    print("\n[Testing API endpoint availability]")
    print("=" * 50)
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }
    # Try various likely endpoints
    endpoints = [
        f"{BASE_URL}/models",
        f"{BASE_URL}/chat/completions",
        BASE_URL
    ]
    for endpoint in endpoints:
        try:
            print(f"\nTesting endpoint: {endpoint}")
            response = requests.get(endpoint, headers=headers, timeout=10)
            print(f"  Status code: {response.status_code}")
            if response.status_code == 200:
                print("  [OK] Endpoint reachable")
                # If the response is JSON, show some details
                try:
                    data = response.json()
                    print("  Response type: JSON")
                    if 'data' in data:
                        print(f"  Contains {len(data['data'])} items")
                except:
                    print(f"  Response type: {response.headers.get('content-type', 'unknown')}")
            elif response.status_code == 405:
                print("  [OK] Endpoint exists (but does not allow GET)")
            elif response.status_code == 502:
                print("  [ERROR] 502 Bad Gateway - server temporarily unavailable")
            else:
                print("  [ERROR] Not reachable")
        except Exception as e:
            print(f"  [ERROR]: {str(e)[:50]}")

def main():
    print("=" * 50)
    print("Llama API Full Test Program")
    print("=" * 50)
    print(f"API base URL: {BASE_URL}")
    print(f"API key: {API_KEY[:10]}...{API_KEY[-5:]}")
    # First test endpoint availability
    test_api_endpoints()
    print("\n" + "=" * 50)
    print("Testing each model")
    print("=" * 50)
    success_count = 0
    for model in MODELS:
        if test_model(model):
            success_count += 1
        time.sleep(1)  # avoid sending requests too fast
    print("\n" + "=" * 50)
    print(f"Test result: {success_count}/{len(MODELS)} models connected successfully")
    if success_count == 0:
        print("\nPossible problems:")
        print("1. The API server is temporarily offline (502 error)")
        print("2. The API key may be wrong")
        print("3. Network connectivity problems")
        print("4. Firewall or proxy settings")
        print("\nTry again later, or contact the API provider to confirm service status.")

if __name__ == "__main__":
    main()

test_with_timeout.py Normal file

@@ -0,0 +1,111 @@
import requests
import json
from datetime import datetime
# API configuration
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "https://llama.theaken.com/v1"

def test_endpoints():
    """Test different API endpoints and models"""
    print("="*60)
    print(f"Llama API test - {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("="*60)
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }
    # Test configuration
    tests = [
        {
            "name": "GPT-OSS-120B",
            "model": "gpt-oss-120b",
            "prompt": "Say hello in one word"
        },
        {
            "name": "DeepSeek-R1-671B",
            "model": "deepseek-r1-671b",
            "prompt": "Say hello in one word"
        },
        {
            "name": "Qwen3-Embedding-8B",
            "model": "qwen3-embedding-8b",
            "prompt": "Say hello in one word"
        }
    ]
    success_count = 0
    for test in tests:
        print(f"\n[Testing {test['name']}]")
        print("-"*40)
        data = {
            "model": test["model"],
            "messages": [
                {"role": "user", "content": test["prompt"]}
            ],
            "temperature": 0.5,
            "max_tokens": 20
        }
        try:
            # Use a shorter timeout
            response = requests.post(
                f"{BASE_URL}/chat/completions",
                headers=headers,
                json=data,
                timeout=15
            )
            print(f"HTTP status: {response.status_code}")
            if response.status_code == 200:
                result = response.json()
                if 'choices' in result:
                    content = result['choices'][0]['message']['content']
                    print(f"[SUCCESS] Got a response: {content}")
                    success_count += 1
                else:
                    print("[ERROR] Unexpected response format")
            elif response.status_code == 502:
                print("[ERROR] 502 Bad Gateway - the server cannot respond")
            elif response.status_code == 401:
                print("[ERROR] 401 Unauthorized - the API key may be wrong")
            elif response.status_code == 404:
                print("[ERROR] 404 Not Found - the model or endpoint does not exist")
            else:
                print(f"[ERROR] Error {response.status_code}")
                if not response.text.startswith('<!DOCTYPE'):
                    print(f"Details: {response.text[:200]}")
        except requests.exceptions.Timeout:
            print("[TIMEOUT] Request timed out (15 seconds)")
        except requests.exceptions.ConnectionError as e:
            print("[CONNECTION ERROR] Could not connect to the server")
        except Exception as e:
            print(f"[UNKNOWN ERROR]: {str(e)[:100]}")
    # Summary
    print("\n" + "="*60)
    print(f"Test result: {success_count}/{len(tests)} succeeded")
    if success_count == 0:
        print("\nDiagnostics:")
        print("- Network connectivity: OK (ping succeeds)")
        print("- API endpoint: https://llama.theaken.com/v1")
        print("- Error type: 502 Bad Gateway")
        print("- Likely cause: the backend API service is temporarily offline")
        print("\nSuggested actions:")
        print("1. Try again later (10-30 minutes is suggested)")
        print("2. Contact the API administrator to confirm service status")
        print("3. Check for maintenance announcements")
    else:
        print("\n[OK] The API service is up and running!")
        print(f"[OK] Number of usable models: {success_count}")

if __name__ == "__main__":
    test_endpoints()

使用說明.txt Normal file

@@ -0,0 +1,33 @@
===========================================
Llama Model Chat Test Program - User Guide
===========================================
Installation:
---------
1. Make sure Python 3.7 or later is installed
2. Install the dependencies:
   pip install -r requirements.txt
Running the program:
---------
python llama_test.py
Features:
---------
1. On startup, the program automatically tests the API connection
2. Select the model to use (1-3)
3. Start chatting with the AI
4. Type 'exit' or 'quit' to end the conversation
Available models:
---------
1. gpt-oss-120b (default)
2. deepseek-r1-671b
3. qwen3-embedding-8b
Notes:
---------
- Make sure your network connection is working
- The API key is built into the program
- If you hit errors, check your network connection or contact the administrator

操作指南.md Normal file

@@ -0,0 +1,181 @@
# Llama API Connection Guide
## 1. API Connection Details
### API Key
```
paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo=
```
### Available Endpoints
#### Internal Endpoints (tested and working)
| Endpoint | URL | Status | Supported Models |
|---------|-----|------|---------|
| Internal 1 | http://192.168.0.6:21180/v1 | ✅ Working | gpt-oss-120b, deepseek-r1-671b, qwen3-embedding-8b |
| Internal 2 | http://192.168.0.6:21181/v1 | ✅ Working | gpt-oss-120b, deepseek-r1-671b, qwen3-embedding-8b |
| Internal 3 | http://192.168.0.6:21182/v1 | ✅ Working | gpt-oss-120b, deepseek-r1-671b, qwen3-embedding-8b |
| Internal 4 | http://192.168.0.6:21183/v1 | ❌ Error | 500 Internal Server Error |
#### External Endpoints (pending testing)
| Endpoint | URL | Status | Supported Models |
|---------|-----|------|---------|
| GPT-OSS dedicated | https://llama.theaken.com/v1/gpt-oss-120b | Pending | gpt-oss-120b |
| DeepSeek dedicated | https://llama.theaken.com/v1/deepseek-r1-671b | Pending | deepseek-r1-671b |
| General | https://llama.theaken.com/v1 | Pending | all models |
## 2. Quick Start
### 1. Install dependencies
```bash
pip install openai
```
### 2. Test the connection (Python)
#### Internal-network example
```python
from openai import OpenAI
# API settings
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1"  # internal endpoint 1
# Create the client
client = OpenAI(
    api_key=API_KEY,
    base_url=BASE_URL
)
# Send a request
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Hello, please introduce yourself"}
    ],
    temperature=0.7,
    max_tokens=200
)
# Show the response
print(response.choices[0].message.content)
```
## 3. Using the Bundled Programs
### Program list
1. **llama_full_api.py** - full chat client (internal and external endpoints)
2. **llama_chat.py** - internal-network-only chat client
3. **local_api_test.py** - endpoint testing tool
4. **quick_test.py** - quick test script
### Running the chat clients
```bash
# Run the full version (tests all endpoints automatically)
python llama_full_api.py
# Run the internal-network version
python llama_chat.py
# Quick test
python quick_test.py
```
## 4. Using the Chat Clients
### Basic workflow
1. On startup, the program automatically tests the available endpoints
2. Select the endpoint to use (enter a number)
3. Select the model to use
4. Start chatting
### In-chat commands
- `exit` or `quit` - end the conversation
- `clear` - clear the conversation history
- `model` - switch models
## 5. Troubleshooting
### Issue 1: 502 Bad Gateway
**Cause**: The external API server is offline
**Fix**: Use an internal endpoint
### Issue 2: Connection Error
**Cause**: Not on the internal network, or wrong IP
**Fix**:
1. Confirm you are on the same network
2. Check firewall settings
3. Test connectivity with `ping 192.168.0.6`
### Issue 3: Encoding errors
**Cause**: Windows terminal encoding problems
**Fix**: Chat in English or change the terminal encoding
### Issue 4: Responses contain special markers
**Details**: e.g. `<think>`, `<|channel|>`
**Handling**: The programs filter these markers automatically
## 6. Response Cleanup
Some model responses may include thinking-process markers, which the programs clean automatically:
- `<think>...</think>` - thinking process
- `<|channel|>...<|message|>` - channel markers
- `<|end|>`, `<|start|>` - end/start markers
## 7. Test Summary
### Verified
✅ Internal endpoints 1-3 all work
✅ Standard OpenAI SDK format supported
✅ Chat works normally
### To confirm
- External endpoints are waiting for the server to come back online
- The DeepSeek and Qwen models need further testing
## 8. Technical Details
### Using the OpenAI SDK
```python
from openai import OpenAI
client = OpenAI(
    api_key="your-key",
    base_url="API-endpoint-URL"
)
```
### Using the requests library
```python
import requests
headers = {
    "Authorization": "Bearer your-key",
    "Content-Type": "application/json"
}
data = {
    "model": "gpt-oss-120b",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,
    "max_tokens": 200
}
response = requests.post(
    "API-endpoint-URL/chat/completions",
    headers=headers,
    json=data
)
```
## 9. Recommended Usage
1. **Development and testing**: use the internal endpoints (fast and stable)
2. **Production**: configure multiple endpoints with automatic failover
3. **Chat applications**: use llama_full_api.py
4. **API integration**: see quick_test.py for a reference implementation
---
Last updated: 2025-09-19
Test environment: Windows / Python 3.13

連線參數.txt Normal file

@@ -0,0 +1,14 @@
These are the connection details for chatting with the Llama models.
External connections:
https://llama.theaken.com/v1/gpt-oss-120b/
https://llama.theaken.com/v1/deepseek-r1-671b/
https://llama.theaken.com/v1/qwen3-embedding-8b/
External model paths:
1. /gpt-oss-120b/
2. /deepseek-r1-671b/
3. /qwen3-embedding-8b/
Key: paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo=
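A minimal connection sketch (Python, assuming the OpenAI-compatible
format used elsewhere in this repo; see README.md):

    from openai import OpenAI
    client = OpenAI(
        api_key="paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo=",
        base_url="https://llama.theaken.com/v1",
    )
    reply = client.chat.completions.create(
        model="gpt-oss-120b",
        messages=[{"role": "user", "content": "Hello"}],
        max_tokens=50,
    )
    print(reply.choices[0].message.content)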