Initial commit: Llama API Client with full documentation
- Added complete Python client for Llama AI models
- Support for internal network endpoints (tested and working)
- Support for external network endpoints (configured)
- Interactive chat interface with multiple models
- Automatic endpoint testing and failover
- Response cleaning for special markers
- Full documentation (README and operation guide)
- Complete test suite and examples
- MIT License and contribution guidelines
14
.claude/settings.local.json
Normal file
@@ -0,0 +1,14 @@
{
  "permissions": {
    "allow": [
      "Bash(pip install:*)",
      "Bash(python:*)",
      "Bash(ping:*)",
      "Bash(curl:*)",
      "Bash(dir)",
      "Bash(git init:*)",
      "Bash(git add:*)"
    ],
    "defaultMode": "acceptEdits"
  }
}
102
.gitignore
vendored
Normal file
@@ -0,0 +1,102 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Virtual environments
venv/
ENV/
env/
.venv/
.env

# IDE
.vscode/
.idea/
*.swp
*.swo
*~
.DS_Store

# Project specific
*.log
*.tmp
temp/
tmp/
logs/
output/

# API keys and secrets (if stored in separate config)
config.ini
secrets.json
.env.local
.env.production

# Test outputs
test_results/
*.test.txt

# Backup files
*.bak
*.backup
*.old

# Windows
Thumbs.db
ehthumbs.db
Desktop.ini

# macOS
.DS_Store
.AppleDouble
.LSOverride

# Linux
.directory
.Trash-*
196
CONTRIBUTING.md
Normal file
@@ -0,0 +1,196 @@
# Contributing to Llama API Client

Thank you for your interest in contributing to Llama API Client! This document provides guidelines for contributing to the project.

## How to Contribute

### Reporting Bugs

Before creating bug reports, please check existing issues to avoid duplicates. When creating a bug report, include:

- A clear and descriptive title
- Steps to reproduce the issue
- Expected behavior
- Actual behavior
- System information (OS, Python version, etc.)
- Error messages or logs

### Suggesting Enhancements

Enhancement suggestions are welcome! Please provide:

- A clear and descriptive title
- A detailed description of the proposed feature
- Use cases and benefits
- A possible implementation approach

### Pull Requests

1. **Fork the repository** and create your branch from `main`
2. **Follow the coding style** used in the project
3. **Write clear commit messages**
4. **Add tests** if applicable
5. **Update documentation** if needed
6. **Test your changes** thoroughly

## Development Setup

```bash
# Clone your fork
git clone https://github.com/yourusername/llama-api-client.git
cd llama-api-client

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run tests
python quick_test.py
```

## Coding Standards

### Python Style Guide

- Follow PEP 8
- Use meaningful variable names
- Add docstrings to functions and classes
- Keep functions focused and small
- Handle exceptions appropriately

### Example Code Style

```python
def clean_response(text: str) -> str:
    """
    Clean AI response by removing special markers.

    Args:
        text: Raw response text from AI

    Returns:
        Cleaned text without special markers
    """
    # Implementation here
    return cleaned_text
```

### Commit Message Format

Use clear and descriptive commit messages:

- `feat:` New feature
- `fix:` Bug fix
- `docs:` Documentation changes
- `style:` Code style changes
- `refactor:` Code refactoring
- `test:` Test additions or changes
- `chore:` Maintenance tasks

Examples:
```
feat: Add support for new model endpoint
fix: Handle encoding errors in Windows terminals
docs: Update README with troubleshooting section
```

## Testing

### Running Tests

```bash
# Quick connection test
python quick_test.py

# Test all models
python test_all_models.py

# Test specific endpoint
python local_api_test.py
```

### Writing Tests

When adding new features, include appropriate tests:

```python
def test_endpoint_connection():
    """Test if endpoint is reachable"""
    assert test_endpoint({"url": "...", "models": ["..."]})
```

## Documentation

- Update README.md for user-facing changes
- Update 操作指南.md, the operation guide
- Add docstrings to all public functions
- Include usage examples for new features

## Code Review Process

1. All submissions require review before merging
2. Reviews focus on:
   - Code quality and style
   - Test coverage
   - Documentation completeness
   - Performance implications
   - Security considerations

## Areas for Contribution

### Current Needs

- [ ] Add retry logic for failed connections (see the sketch below)
- [ ] Implement connection pooling
- [ ] Add streaming response support
- [ ] Create GUI interface
- [ ] Add conversation export/import
- [ ] Implement rate limiting
- [ ] Add proxy support
- [ ] Create Docker container
- [ ] Add more language examples
- [ ] Improve error messages
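
As a starting point for the retry item above, here is a minimal sketch that wraps the project's existing `test_endpoint` helper with exponential backoff; `retries` and `base_delay` are illustrative defaults, not an agreed interface:

```python
import time

def test_endpoint_with_retry(endpoint_info, retries=3, base_delay=1.0):
    """Probe an endpoint several times before giving up.

    Sketch only: wraps the project's existing test_endpoint();
    retries and base_delay are illustrative defaults.
    """
    for attempt in range(retries):
        if test_endpoint(endpoint_info):
            return True
        time.sleep(base_delay * (2 ** attempt))  # back off: 1s, 2s, 4s, ...
    return False
```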

### Future Features

- Web interface
- Mobile app support
- Voice input/output
- Multi-user support
- Analytics dashboard
- Plugin system

## Community

### Communication Channels

- GitHub Issues: Bug reports and feature requests
- GitHub Discussions: General questions and discussions
- Pull Requests: Code contributions

### Code of Conduct

- Be respectful and inclusive
- Welcome newcomers
- Provide constructive feedback
- Focus on what is best for the community
- Show empathy towards others

## Questions?

If you have questions about contributing, feel free to:

1. Open an issue with the `question` label
2. Check existing documentation
3. Review closed issues for similar questions

## License

By contributing, you agree that your contributions will be licensed under the MIT License.

---

Thank you for contributing to Llama API Client! 🚀
21
LICENSE
Normal file
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Llama API Client Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
201
README.md
Normal file
@@ -0,0 +1,201 @@
# Llama API Client

A Python client for connecting to Llama AI models through OpenAI-compatible API endpoints.

## Features

- 🌐 Support for both internal network and external API endpoints
- 🤖 Multiple model support (GPT-OSS-120B, DeepSeek-R1-671B, Qwen3-Embedding-8B)
- 💬 Interactive chat interface with conversation history
- 🔄 Automatic endpoint testing and failover (see the sketch below)
- 🧹 Automatic response cleaning (removes thinking tags and special markers)
- 📝 Full conversation context management
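
A condensed sketch of how the failover works: probe each configured endpoint with a tiny request and keep whichever answers. The endpoint entries mirror the shape used in `llama_full_api.py`; the probe is simplified from its `test_endpoint` helper.

```python
from openai import OpenAI

API_KEY = "YOUR_KEY"  # see the Configuration section below
ENDPOINTS = [
    {"name": "Internal 1", "url": "http://192.168.0.6:21180/v1", "models": ["gpt-oss-120b"]},
    {"name": "Internal 2", "url": "http://192.168.0.6:21181/v1", "models": ["gpt-oss-120b"]},
]

def probe(endpoint):
    """Send a one-token request; any exception means the endpoint is down."""
    client = OpenAI(api_key=API_KEY, base_url=endpoint["url"])
    try:
        client.chat.completions.create(
            model=endpoint["models"][0],
            messages=[{"role": "user", "content": "test"}],
            max_tokens=1,
            timeout=5,
        )
        return True
    except Exception:
        return False

# Failover: chat with the first endpoint that responds
available = [e for e in ENDPOINTS if probe(e)]
```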

## Quick Start

### Installation

```bash
# Clone the repository
git clone https://github.com/yourusername/llama-api-client.git
cd llama-api-client

# Install dependencies
pip install -r requirements.txt
```

### Basic Usage

```python
from openai import OpenAI

# Configure API
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1"

# Create client
client = OpenAI(api_key=API_KEY, base_url=BASE_URL)

# Send request
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=200
)

print(response.choices[0].message.content)
```

### Run Interactive Chat

```bash
# Full-featured chat with all endpoints
python llama_full_api.py

# Internal network only
python llama_chat.py

# Quick test
python quick_test.py
```

## Available Endpoints

### Internal Network (Tested & Working ✅)

| Endpoint | URL | Status |
|----------|-----|--------|
| Internal 1 | `http://192.168.0.6:21180/v1` | ✅ Working |
| Internal 2 | `http://192.168.0.6:21181/v1` | ✅ Working |
| Internal 3 | `http://192.168.0.6:21182/v1` | ✅ Working |
| Internal 4 | `http://192.168.0.6:21183/v1` | ❌ Error 500 |

### External Network

| Endpoint | URL | Status |
|----------|-----|--------|
| GPT-OSS | `https://llama.theaken.com/v1/gpt-oss-120b` | 🔄 Pending |
| DeepSeek | `https://llama.theaken.com/v1/deepseek-r1-671b` | 🔄 Pending |
| General | `https://llama.theaken.com/v1` | 🔄 Pending |

## Project Structure

```
llama-api-client/
├── README.md              # This file
├── requirements.txt       # Python dependencies
├── 操作指南.md            # Operation guide
├── llama_full_api.py      # Full-featured chat client
├── llama_chat.py          # Internal network chat client
├── local_api_test.py      # Endpoint testing tool
├── quick_test.py          # Quick connection test
├── test_all_models.py     # Model testing script
└── demo_chat.py           # Demo chat with fallback
```

## Chat Commands

During chat sessions, you can use these commands:

- `exit` or `quit` - End the conversation
- `clear` - Clear conversation history
- `model` - Switch between available models

## Configuration

### API Key
```python
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
```

### Available Models
- `gpt-oss-120b` - GPT Open Source, 120B parameters
- `deepseek-r1-671b` - DeepSeek R1, 671B parameters
- `qwen3-embedding-8b` - Qwen3 Embedding, 8B parameters

## Troubleshooting

### Issue: 502 Bad Gateway
**Cause**: The external API server is offline
**Solution**: Use the internal network endpoints

### Issue: Connection Error
**Cause**: Not on the internal network, or an incorrect IP
**Solution**:
1. Verify network connectivity: `ping 192.168.0.6`
2. Check firewall settings
3. Ensure you're on the same network

### Issue: Encoding Error
**Cause**: Windows terminal encoding issues
**Solution**: Use English for conversations or switch the terminal to UTF-8
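
One way to apply the second fix from inside a script (a sketch, relying on `TextIOWrapper.reconfigure`, available since Python 3.7):

```python
import sys

# Force UTF-8 output regardless of the Windows code page (Python 3.7+)
sys.stdout.reconfigure(encoding="utf-8")
sys.stderr.reconfigure(encoding="utf-8")
```

In cmd.exe, running `chcp 65001` before the script switches the code page for the session instead.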

### Issue: Response Contains Special Markers
**Description**: Responses may contain `<think>` or `<|channel|>` tags
**Solution**: The client removes these markers automatically

## Response Cleaning

The client automatically removes these special markers from AI responses:
- `<think>...</think>` - Thinking process
- `<|channel|>...<|message|>` - Channel markers
- `<|end|>`, `<|start|>` - End/start markers
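
For reference, this is the marker-stripping logic that `llama_chat.py` and `llama_full_api.py` implement in their `clean_response` helper, condensed:

```python
import re

def clean_response(text: str) -> str:
    """Strip thinking tags and channel markers from a raw model reply."""
    # Drop the <think>...</think> reasoning block
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    # Keep only the final message after any channel markers
    if "<|channel|>" in text:
        text = text.split("<|message|>")[-1]
    # Remove stray start/end markers and surrounding whitespace
    return text.replace("<|end|>", "").replace("<|start|>", "").strip()
```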

## Requirements

- Python 3.7+
- openai>=1.0.0
- requests (optional, for direct API calls)

## Development

### Testing Connection
```bash
python -c "from openai import OpenAI; client = OpenAI(api_key='YOUR_KEY', base_url='YOUR_URL'); print(client.chat.completions.create(model='gpt-oss-120b', messages=[{'role': 'user', 'content': 'test'}], max_tokens=5).choices[0].message.content)"
```

### Adding New Endpoints
Edit the `ENDPOINTS` dictionary in `llama_full_api.py`:
```python
ENDPOINTS = {
    "internal": [
        {
            "name": "New Endpoint",
            "url": "http://new-endpoint/v1",
            "models": ["gpt-oss-120b"]
        }
    ]
}
```

## License

MIT License - see the LICENSE file for details

## Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## Support

For issues or questions:
1. Check [操作指南.md](操作指南.md) for the detailed operation guide
2. Open an issue on GitHub
3. Contact the API administrator for server-related issues

## Acknowledgments

- Built with the OpenAI Python SDK
- Compatible with the OpenAI API format
- Supports multiple Llama model variants

---

**Last Updated**: 2025-09-19
**Version**: 1.0.0
**Status**: Internal endpoints working, external endpoints pending
124
demo_chat.py
Normal file
@@ -0,0 +1,124 @@
"""
Llama API chat program (demo version).
When the API server is back online, this program can be used for real conversations.
"""

from openai import OpenAI
import time

# API settings
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "https://llama.theaken.com/v1"

def simulate_chat():
    """Simulated conversation (for demonstration)."""
    print("\n" + "=" * 50)
    print("Llama AI Chat System - Demo Mode")
    print("=" * 50)
    print("\n[Note] The API server is currently offline; the replies below are simulated.")
    print("Once the server is back, the program will connect to the real API.\n")

    # Canned replies cycled through in demo mode
    demo_responses = [
        "Hello! I am the Llama AI assistant, happy to help.",
        "This is a demo reply. Once the API server is back online you will receive real AI responses.",
        "I can answer questions, help with programming, translate text, and more.",
        "Is there anything I can help you with?"
    ]

    response_index = 0
    print("Type 'exit' to end the conversation\n")

    while True:
        user_input = input("You: ").strip()

        if user_input.lower() in ['exit', 'quit']:
            print("\nGoodbye!")
            break

        if not user_input:
            continue

        # Simulate thinking time
        print("\nAI thinking", end="")
        for _ in range(3):
            time.sleep(0.3)
            print(".", end="", flush=True)
        print()

        # Show the next canned reply
        print(f"\nAI: {demo_responses[response_index % len(demo_responses)]}")
        response_index += 1

def real_chat():
    """Real conversation (when the API is available)."""
    client = OpenAI(api_key=API_KEY, base_url=BASE_URL)

    print("\n" + "=" * 50)
    print("Llama AI Chat System")
    print("=" * 50)
    print("\nConnected to the Llama API")
    print("Type 'exit' to end the conversation\n")

    messages = []

    while True:
        user_input = input("You: ").strip()

        if user_input.lower() in ['exit', 'quit']:
            print("\nGoodbye!")
            break

        if not user_input:
            continue

        messages.append({"role": "user", "content": user_input})

        try:
            print("\nAI thinking...")
            response = client.chat.completions.create(
                model="gpt-oss-120b",
                messages=messages,
                temperature=0.7,
                max_tokens=1000
            )

            ai_response = response.choices[0].message.content
            print(f"\nAI: {ai_response}")
            messages.append({"role": "assistant", "content": ai_response})

        except Exception as e:
            print(f"\n[Error] {str(e)[:100]}")
            print("Could not get a response; please try again later")

def main():
    print("Checking API connection status...")

    # Try to reach the API
    try:
        client = OpenAI(api_key=API_KEY, base_url=BASE_URL)

        # Quick probe request
        client.chat.completions.create(
            model="gpt-oss-120b",
            messages=[{"role": "user", "content": "test"}],
            max_tokens=10,
            timeout=5
        )
        print("[Success] API connected")
        real_chat()

    except Exception as e:
        error_msg = str(e)
        if "502" in error_msg or "Bad gateway" in error_msg:
            print("[Notice] The API server is currently offline (502 error)")
            print("Entering demo mode...")
            simulate_chat()
        else:
            print(f"[Error] Could not connect: {error_msg[:100]}")
            print("\nEnter demo mode? (y/n): ", end="")
            if input().lower() == 'y':
                simulate_chat()

if __name__ == "__main__":
    main()
196
llama_chat.py
Normal file
@@ -0,0 +1,196 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Llama internal-network API chat program.
Supports multiple endpoints and model selection.
"""

from openai import OpenAI
import sys
import re

# API configuration
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="

# Available endpoints (the first three are tested and working)
ENDPOINTS = [
    "http://192.168.0.6:21180/v1",
    "http://192.168.0.6:21181/v1",
    "http://192.168.0.6:21182/v1",
    "http://192.168.0.6:21183/v1"
]

# Model list
MODELS = [
    "gpt-oss-120b",
    "deepseek-r1-671b",
    "qwen3-embedding-8b"
]

def clean_response(text):
    """Strip special markers from the AI response."""
    # Remove thinking tags
    if "<think>" in text:
        text = re.sub(r'<think>.*?</think>', '', text, flags=re.DOTALL)

    # Remove channel markers, keeping only the final message
    if "<|channel|>" in text:
        parts = text.split("<|message|>")
        if len(parts) > 1:
            text = parts[-1]

    # Remove start/end markers
    text = text.replace("<|end|>", "").replace("<|start|>", "")

    # Trim surrounding whitespace
    text = text.strip()

    return text

def test_endpoint(endpoint):
    """Check whether an endpoint responds to a minimal request."""
    try:
        client = OpenAI(api_key=API_KEY, base_url=endpoint)
        client.chat.completions.create(
            model="gpt-oss-120b",
            messages=[{"role": "user", "content": "Hi"}],
            max_tokens=10,
            timeout=5
        )
        return True
    except Exception:
        return False

def chat_session(endpoint, model):
    """Main chat loop."""
    print("\n" + "=" * 60)
    print("Llama AI Chat System")
    print("=" * 60)
    print(f"Endpoint: {endpoint}")
    print(f"Model: {model}")
    print("\nCommands:")
    print("  exit/quit - end the conversation")
    print("  clear     - clear conversation history")
    print("  model     - switch models")
    print("-" * 60)

    client = OpenAI(api_key=API_KEY, base_url=endpoint)
    messages = []

    while True:
        try:
            user_input = input("\nYou: ").strip()

            if not user_input:
                continue

            if user_input.lower() in ['exit', 'quit']:
                print("Goodbye!")
                break

            if user_input.lower() == 'clear':
                messages = []
                print("[System] Conversation history cleared")
                continue

            if user_input.lower() == 'model':
                print("\nAvailable models:")
                for i, m in enumerate(MODELS, 1):
                    print(f"  {i}. {m}")
                choice = input("Select (1-3): ").strip()
                if choice in ['1', '2', '3']:
                    model = MODELS[int(choice) - 1]
                    print(f"[System] Switched to {model}")
                continue

            messages.append({"role": "user", "content": user_input})

            print("\nAI thinking...", end="", flush=True)

            try:
                response = client.chat.completions.create(
                    model=model,
                    messages=messages,
                    temperature=0.7,
                    max_tokens=1000
                )

                ai_response = response.choices[0].message.content
                ai_response = clean_response(ai_response)

                print("\r" + " " * 20 + "\r", end="")  # erase "AI thinking..."
                print(f"AI: {ai_response}")

                messages.append({"role": "assistant", "content": ai_response})

            except UnicodeEncodeError:
                print("\r[Error] Encoding problem; please chat in English")
                messages.pop()  # drop the last user message
            except Exception as e:
                print(f"\r[Error] {str(e)[:100]}")
                messages.pop()  # drop the last user message

        except KeyboardInterrupt:
            print("\n\n[Interrupted] Use the exit command to quit normally")
            continue
        except EOFError:
            print("\nGoodbye!")
            break

def main():
    print("=" * 60)
    print("Llama Internal-Network API Chat Program")
    print("=" * 60)

    # Probe the endpoints
    print("\nChecking available endpoints...")
    available = []
    for endpoint in ENDPOINTS[:3]:  # only test the first three
        print(f"  Testing {endpoint}...", end="", flush=True)
        if test_endpoint(endpoint):
            print(" [OK]")
            available.append(endpoint)
        else:
            print(" [FAIL]")

    if not available:
        print("\n[Error] No endpoints available")
        sys.exit(1)

    # Choose an endpoint
    if len(available) == 1:
        selected_endpoint = available[0]
        print(f"\nUsing endpoint: {selected_endpoint}")
    else:
        print(f"\nFound {len(available)} available endpoints:")
        for i, ep in enumerate(available, 1):
            print(f"  {i}. {ep}")
        print("\nSelect an endpoint (default: 1): ", end="")
        choice = input().strip()
        if choice and choice.isdigit() and 1 <= int(choice) <= len(available):
            selected_endpoint = available[int(choice) - 1]
        else:
            selected_endpoint = available[0]

    # Choose a model
    print("\nAvailable models:")
    for i, model in enumerate(MODELS, 1):
        print(f"  {i}. {model}")
    print("\nSelect a model (default: 1): ", end="")
    choice = input().strip()
    if choice in ['1', '2', '3']:
        selected_model = MODELS[int(choice) - 1]
    else:
        selected_model = MODELS[0]

    # Start the conversation
    chat_session(selected_endpoint, selected_model)

if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print("\n\nProgram exited")
    except Exception as e:
        print(f"\n[Error] {e}")
        sys.exit(1)
293
llama_full_api.py
Normal file
@@ -0,0 +1,293 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Llama API full chat program.
Supports both internal and external endpoints.
"""

from openai import OpenAI
import requests
import sys
import re
from datetime import datetime

# API key
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="

# API endpoint configuration
ENDPOINTS = {
    "internal": [
        {
            "name": "Internal endpoint 1 (21180)",
            "url": "http://192.168.0.6:21180/v1",
            "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
        },
        {
            "name": "Internal endpoint 2 (21181)",
            "url": "http://192.168.0.6:21181/v1",
            "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
        },
        {
            "name": "Internal endpoint 3 (21182)",
            "url": "http://192.168.0.6:21182/v1",
            "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
        }
    ],
    "external": [
        {
            "name": "External GPT-OSS-120B",
            "url": "https://llama.theaken.com/v1/gpt-oss-120b",
            "models": ["gpt-oss-120b"]
        },
        {
            "name": "External DeepSeek-R1-671B",
            "url": "https://llama.theaken.com/v1/deepseek-r1-671b",
            "models": ["deepseek-r1-671b"]
        },
        {
            "name": "External general endpoint",
            "url": "https://llama.theaken.com/v1",
            "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
        }
    ]
}

def clean_response(text):
    """Strip special markers from the AI response."""
    # Remove thinking tags
    if "<think>" in text:
        text = re.sub(r'<think>.*?</think>', '', text, flags=re.DOTALL)

    # Remove channel markers, keeping only the final message
    if "<|channel|>" in text:
        parts = text.split("<|message|>")
        if len(parts) > 1:
            text = parts[-1]

    # Remove start/end markers
    text = text.replace("<|end|>", "").replace("<|start|>", "")

    # Trim surrounding whitespace
    text = text.strip()

    return text

def test_endpoint(endpoint_info):
    """Check whether an endpoint responds to a minimal request."""
    url = endpoint_info["url"]
    model = endpoint_info["models"][0]  # probe with the first model

    try:
        # Model-specific URLs need special handling
        if "/gpt-oss-120b" in url or "/deepseek-r1-671b" in url:
            base_url = url.rsplit("/", 1)[0]  # drop the model-name segment
        else:
            base_url = url

        client = OpenAI(api_key=API_KEY, base_url=base_url)
        client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "test"}],
            max_tokens=5,
            timeout=8
        )
        return True
    except Exception:
        # Fall back to a direct probe with requests
        try:
            headers = {
                "Authorization": f"Bearer {API_KEY}",
                "Content-Type": "application/json"
            }

            test_url = f"{url}/chat/completions" if not url.endswith("/chat/completions") else url
            data = {
                "model": model,
                "messages": [{"role": "user", "content": "test"}],
                "max_tokens": 5
            }

            response = requests.post(test_url, headers=headers, json=data, timeout=8)
            return response.status_code == 200
        except Exception:
            return False

def test_all_endpoints():
    """Probe every configured endpoint."""
    print("\n" + "=" * 60)
    print("Testing API endpoint connectivity")
    print("=" * 60)

    available_endpoints = []

    # Internal endpoints
    print("\n[Internal endpoint tests]")
    for endpoint in ENDPOINTS["internal"]:
        print(f"  Testing {endpoint['name']}...", end="", flush=True)
        if test_endpoint(endpoint):
            print(" [OK]")
            available_endpoints.append(("internal", endpoint))
        else:
            print(" [FAIL]")

    # External endpoints
    print("\n[External endpoint tests]")
    for endpoint in ENDPOINTS["external"]:
        print(f"  Testing {endpoint['name']}...", end="", flush=True)
        if test_endpoint(endpoint):
            print(" [OK]")
            available_endpoints.append(("external", endpoint))
        else:
            print(" [FAIL]")

    return available_endpoints

def chat_session(endpoint_info):
    """Main chat loop."""
    print("\n" + "=" * 60)
    print("Llama AI Chat System")
    print("=" * 60)
    print(f"Endpoint: {endpoint_info['name']}")
    print(f"URL: {endpoint_info['url']}")
    print(f"Available models: {', '.join(endpoint_info['models'])}")
    print("\nCommands:")
    print("  exit/quit - end the conversation")
    print("  clear     - clear conversation history")
    print("  model     - switch models")
    print("-" * 60)

    # Normalize the URL
    url = endpoint_info["url"]
    if "/gpt-oss-120b" in url or "/deepseek-r1-671b" in url:
        base_url = url.rsplit("/", 1)[0]
    else:
        base_url = url

    client = OpenAI(api_key=API_KEY, base_url=base_url)

    # Pick the initial model
    if len(endpoint_info['models']) == 1:
        current_model = endpoint_info['models'][0]
    else:
        print("\nSelect a model:")
        for i, model in enumerate(endpoint_info['models'], 1):
            print(f"  {i}. {model}")
        choice = input("Select (default: 1): ").strip()
        if choice.isdigit() and 1 <= int(choice) <= len(endpoint_info['models']):
            current_model = endpoint_info['models'][int(choice) - 1]
        else:
            current_model = endpoint_info['models'][0]

    print(f"\nUsing model: {current_model}")
    messages = []

    while True:
        try:
            user_input = input("\nYou: ").strip()

            if not user_input:
                continue

            if user_input.lower() in ['exit', 'quit']:
                print("Goodbye!")
                break

            if user_input.lower() == 'clear':
                messages = []
                print("[System] Conversation history cleared")
                continue

            if user_input.lower() == 'model':
                if len(endpoint_info['models']) == 1:
                    print(f"[System] This endpoint only supports {endpoint_info['models'][0]}")
                else:
                    print("\nAvailable models:")
                    for i, m in enumerate(endpoint_info['models'], 1):
                        print(f"  {i}. {m}")
                    choice = input("Select: ").strip()
                    if choice.isdigit() and 1 <= int(choice) <= len(endpoint_info['models']):
                        current_model = endpoint_info['models'][int(choice) - 1]
                        print(f"[System] Switched to {current_model}")
                continue

            messages.append({"role": "user", "content": user_input})

            print("\nAI thinking...", end="", flush=True)

            try:
                response = client.chat.completions.create(
                    model=current_model,
                    messages=messages,
                    temperature=0.7,
                    max_tokens=1000
                )

                ai_response = response.choices[0].message.content
                ai_response = clean_response(ai_response)

                print("\r" + " " * 20 + "\r", end="")
                print(f"AI: {ai_response}")

                messages.append({"role": "assistant", "content": ai_response})

            except Exception as e:
                print(f"\r[Error] {str(e)[:100]}")
                messages.pop()

        except KeyboardInterrupt:
            print("\n\n[Interrupted] Use the exit command to quit normally")
            continue
        except EOFError:
            print("\nGoodbye!")
            break

def main():
    print("=" * 60)
    print("Llama API Full Chat Program")
    print(f"Time: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("=" * 60)

    # Probe all endpoints
    available = test_all_endpoints()

    if not available:
        print("\n[Error] No endpoints available")
        print("\nPossible causes:")
        print("1. Network connectivity problems")
        print("2. The API service is offline")
        print("3. A firewall is blocking the connection")
        sys.exit(1)

    # Show the available endpoints
    print("\n" + "=" * 60)
    print(f"Found {len(available)} available endpoints:")
    print("=" * 60)

    for i, (network_type, endpoint) in enumerate(available, 1):
        print(f"{i}. [{network_type}] {endpoint['name']}")
        print(f"   URL: {endpoint['url']}")
        print(f"   Models: {', '.join(endpoint['models'])}")

    # Choose an endpoint
    print("\nSelect an endpoint (default: 1): ", end="")
    choice = input().strip()

    if choice.isdigit() and 1 <= int(choice) <= len(available):
        selected = available[int(choice) - 1][1]
    else:
        selected = available[0][1]

    # Start the conversation
    chat_session(selected)

if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print("\n\nProgram exited")
    except Exception as e:
        print(f"\n[Error] {e}")
        import traceback
        traceback.print_exc()
        sys.exit(1)
99
llama_test.py
Normal file
@@ -0,0 +1,99 @@
from openai import OpenAI
import sys

API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "https://llama.theaken.com/v1"

AVAILABLE_MODELS = [
    "gpt-oss-120b",
    "deepseek-r1-671b",
    "qwen3-embedding-8b"
]

def chat_with_llama(model_name="gpt-oss-120b"):
    client = OpenAI(
        api_key=API_KEY,
        base_url=BASE_URL
    )

    print(f"\nUsing model: {model_name}")
    print("-" * 50)
    print("Type 'exit' or 'quit' to end the conversation")
    print("-" * 50)

    messages = []

    while True:
        user_input = input("\nYou: ").strip()

        if user_input.lower() in ['exit', 'quit']:
            print("Conversation ended")
            break

        if not user_input:
            continue

        messages.append({"role": "user", "content": user_input})

        try:
            response = client.chat.completions.create(
                model=model_name,
                messages=messages,
                temperature=0.7,
                max_tokens=2000
            )

            assistant_reply = response.choices[0].message.content
            print(f"\nAI: {assistant_reply}")

            messages.append({"role": "assistant", "content": assistant_reply})

        except Exception as e:
            print(f"\nError: {str(e)}")
            print("Please check the network connection and API settings")

def test_connection():
    print("Testing connection to the Llama API...")

    client = OpenAI(
        api_key=API_KEY,
        base_url=BASE_URL
    )

    try:
        response = client.chat.completions.create(
            model="gpt-oss-120b",
            messages=[{"role": "user", "content": "Hello, this is a test message."}],
            max_tokens=50
        )
        print("[OK] Connection successful!")
        print(f"Test response: {response.choices[0].message.content}")
        return True
    except Exception as e:
        print(f"[ERROR] Connection failed: {str(e)[:200]}")
        return False

def main():
    print("=" * 50)
    print("Llama Model Chat Test Program")
    print("=" * 50)

    print("\nAvailable models:")
    for i, model in enumerate(AVAILABLE_MODELS, 1):
        print(f"  {i}. {model}")

    if test_connection():
        print("\nSelect a model to use (enter 1-3, default: 1):")
        choice = input().strip()

        if choice == "2":
            model = AVAILABLE_MODELS[1]
        elif choice == "3":
            model = AVAILABLE_MODELS[2]
        else:
            model = AVAILABLE_MODELS[0]

        chat_with_llama(model)

if __name__ == "__main__":
    main()
243
local_api_test.py
Normal file
@@ -0,0 +1,243 @@
"""
Internal-network Llama API test program.
Connects to local API endpoints using the OpenAI-compatible format.
"""

from openai import OpenAI
import requests
import json
from datetime import datetime

# API configuration
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="

# Internal endpoint list
LOCAL_ENDPOINTS = [
    "http://192.168.0.6:21180/v1",
    "http://192.168.0.6:21181/v1",
    "http://192.168.0.6:21182/v1",
    "http://192.168.0.6:21183/v1"
]

# Available models
MODELS = [
    "gpt-oss-120b",
    "deepseek-r1-671b",
    "qwen3-embedding-8b"
]

def test_endpoint_with_requests(endpoint, model="gpt-oss-120b"):
    """Probe an endpoint using requests."""
    print("\n[Testing with requests]")
    print(f"Endpoint: {endpoint}")
    print(f"Model: {model}")

    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }

    data = {
        "model": model,
        "messages": [
            {"role": "user", "content": "Say 'Hello, I am working!' if you can see this."}
        ],
        "temperature": 0.7,
        "max_tokens": 50
    }

    try:
        response = requests.post(
            f"{endpoint}/chat/completions",
            headers=headers,
            json=data,
            timeout=10
        )

        print(f"HTTP status code: {response.status_code}")

        if response.status_code == 200:
            result = response.json()
            if 'choices' in result:
                content = result['choices'][0]['message']['content']
                print(f"[SUCCESS] AI response: {content}")
                return True
            else:
                print("[ERROR] Unexpected response format")
        else:
            print(f"[ERROR] HTTP {response.status_code}")
            if response.status_code != 502:  # avoid dumping the HTML error page
                print(f"Details: {response.text[:200]}")

    except requests.exceptions.ConnectTimeout:
        print("[TIMEOUT] Connection timed out")
    except requests.exceptions.ConnectionError:
        print("[CONNECTION ERROR] Could not reach the endpoint")
    except Exception as e:
        print(f"[ERROR] {str(e)[:100]}")

    return False

def test_endpoint_with_openai(endpoint, model="gpt-oss-120b"):
    """Probe an endpoint using the OpenAI SDK."""
    print("\n[Testing with the OpenAI SDK]")
    print(f"Endpoint: {endpoint}")
    print(f"Model: {model}")

    try:
        client = OpenAI(
            api_key=API_KEY,
            base_url=endpoint,
            timeout=10.0
        )

        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "user", "content": "Hello, please respond with a simple greeting."}
            ],
            temperature=0.7,
            max_tokens=50
        )

        content = response.choices[0].message.content
        print(f"[SUCCESS] AI response: {content}")
        return True, client

    except Exception as e:
        error_str = str(e)
        if "Connection error" in error_str:
            print("[CONNECTION ERROR] Could not reach the endpoint")
        elif "timeout" in error_str.lower():
            print("[TIMEOUT] Request timed out")
        elif "502" in error_str:
            print("[ERROR] 502 Bad Gateway")
        else:
            print(f"[ERROR] {error_str[:100]}")

        return False, None

def find_working_endpoint():
    """Find the endpoints that respond."""
    print("=" * 60)
    print(f"Internal API endpoint test - {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("=" * 60)

    working_endpoints = []

    for endpoint in LOCAL_ENDPOINTS:
        print(f"\nTesting endpoint: {endpoint}")
        print("-" * 40)

        # Quick probe with requests first
        if test_endpoint_with_requests(endpoint):
            working_endpoints.append(endpoint)
            print(f"[OK] Endpoint {endpoint} is available!")
        else:
            # Then retry with the OpenAI SDK
            success, _ = test_endpoint_with_openai(endpoint)
            if success:
                working_endpoints.append(endpoint)
                print(f"[OK] Endpoint {endpoint} is available!")

    return working_endpoints

def interactive_chat(endpoint, model="gpt-oss-120b"):
    """Interactive conversation."""
    print(f"\nConnected to: {endpoint}")
    print(f"Using model: {model}")
    print("=" * 60)
    print("Start chatting (type 'exit' to end)")
    print("=" * 60)

    client = OpenAI(
        api_key=API_KEY,
        base_url=endpoint
    )

    messages = []

    while True:
        user_input = input("\nYou: ").strip()

        if user_input.lower() in ['exit', 'quit']:
            print("Conversation ended")
            break

        if not user_input:
            continue

        messages.append({"role": "user", "content": user_input})

        try:
            print("\nAI thinking...")
            response = client.chat.completions.create(
                model=model,
                messages=messages,
                temperature=0.7,
                max_tokens=1000
            )

            ai_response = response.choices[0].message.content
            print(f"\nAI: {ai_response}")
            messages.append({"role": "assistant", "content": ai_response})

        except Exception as e:
            print(f"\n[ERROR] {str(e)[:100]}")

def main():
    # Find working endpoints
    working_endpoints = find_working_endpoint()

    print("\n" + "=" * 60)
    print("Test result summary")
    print("=" * 60)

    if working_endpoints:
        print(f"\nFound {len(working_endpoints)} available endpoints:")
        for i, endpoint in enumerate(working_endpoints, 1):
            print(f"  {i}. {endpoint}")

        # Choose an endpoint
        if len(working_endpoints) == 1:
            selected_endpoint = working_endpoints[0]
            print(f"\nAutomatically selected the only available endpoint: {selected_endpoint}")
        else:
            print(f"\nSelect an endpoint to use (1-{len(working_endpoints)}):")
            choice = input().strip()
            try:
                idx = int(choice) - 1
                if 0 <= idx < len(working_endpoints):
                    selected_endpoint = working_endpoints[idx]
                else:
                    selected_endpoint = working_endpoints[0]
            except ValueError:
                selected_endpoint = working_endpoints[0]

        # Choose a model
        print("\nAvailable models:")
        for i, model in enumerate(MODELS, 1):
            print(f"  {i}. {model}")

        print("\nSelect a model (1-3, default: 1):")
        choice = input().strip()
        if choice == "2":
            selected_model = MODELS[1]
        elif choice == "3":
            selected_model = MODELS[2]
        else:
            selected_model = MODELS[0]

        # Start chatting
        interactive_chat(selected_endpoint, selected_model)

    else:
        print("\n[ERROR] No available endpoints found")
        print("\nPossible causes:")
        print("1. The internal API service is not running")
        print("2. A firewall is blocking the connection")
        print("3. The IP address or port is misconfigured")
        print("4. You are not on the same network")

if __name__ == "__main__":
    main()
54
quick_test.py
Normal file
@@ -0,0 +1,54 @@
"""
Quick test of the internal-network Llama API.
"""

from openai import OpenAI

# API settings
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1"  # the first available endpoint

def quick_test():
    print("Connecting to the internal API...")
    print(f"Endpoint: {BASE_URL}")
    print("-" * 50)

    client = OpenAI(
        api_key=API_KEY,
        base_url=BASE_URL
    )

    # Test prompts
    test_messages = [
        "Hello, please introduce yourself",
        "What is 1 + 1?",
        "How is the weather today?"
    ]

    for msg in test_messages:
        print(f"\nQ: {msg}")

        try:
            response = client.chat.completions.create(
                model="gpt-oss-120b",
                messages=[
                    {"role": "user", "content": msg}
                ],
                temperature=0.7,
                max_tokens=200
            )

            answer = response.choices[0].message.content
            # Strip any thinking markers
            if "<think>" in answer:
                answer = answer.split("</think>")[-1].strip()
            if "<|channel|>" in answer:
                answer = answer.split("<|message|>")[-1].strip()

            print(f"A: {answer}")

        except Exception as e:
            print(f"Error: {str(e)[:100]}")

if __name__ == "__main__":
    quick_test()
1
requirements.txt
Normal file
@@ -0,0 +1 @@
openai>=1.0.0
46
simple_llama_test.py
Normal file
@@ -0,0 +1,46 @@
import requests
import json

API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "https://llama.theaken.com/v1/chat/completions"

def test_api():
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }

    data = {
        "model": "gpt-oss-120b",
        "messages": [
            {"role": "user", "content": "Hello, can you respond?"}
        ],
        "temperature": 0.7,
        "max_tokens": 100
    }

    print("Testing the API connection...")
    print(f"URL: {BASE_URL}")
    print("Model: gpt-oss-120b")
    print("-" * 50)

    try:
        response = requests.post(BASE_URL, headers=headers, json=data, timeout=30)

        if response.status_code == 200:
            result = response.json()
            print("[Success] API response:")
            print(result['choices'][0]['message']['content'])
        else:
            print(f"[Error] HTTP {response.status_code}")
            print(f"Response body: {response.text[:500]}")

    except requests.exceptions.Timeout:
        print("[Error] Request timed out")
    except requests.exceptions.ConnectionError:
        print("[Error] Could not connect to the server")
    except Exception as e:
        print(f"[Error] {str(e)}")

if __name__ == "__main__":
    test_api()
143
test_all_models.py
Normal file
@@ -0,0 +1,143 @@
import requests
import json
import time

API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "https://llama.theaken.com/v1"

MODELS = [
    "gpt-oss-120b",
    "deepseek-r1-671b",
    "qwen3-embedding-8b"
]

def test_model(model_name):
    """Test a single model."""
    print(f"\n[Testing model: {model_name}]")
    print("-" * 40)

    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }

    # Test the chat completions endpoint
    chat_url = f"{BASE_URL}/chat/completions"
    data = {
        "model": model_name,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Say 'Hello, I am working!' if you can see this message."}
        ],
        "temperature": 0.5,
        "max_tokens": 50
    }

    try:
        print(f"Connecting to: {chat_url}")
        response = requests.post(chat_url, headers=headers, json=data, timeout=30)

        print(f"HTTP status code: {response.status_code}")

        if response.status_code == 200:
            result = response.json()
            if 'choices' in result and len(result['choices']) > 0:
                content = result['choices'][0]['message']['content']
                print(f"[SUCCESS] AI response: {content}")
                return True
            else:
                print("[ERROR] Unexpected response format")
                print(f"Response body: {json.dumps(result, indent=2)}")
        else:
            print("[ERROR] Error response")
            # Check whether this is an HTML error page
            if response.text.startswith('<!DOCTYPE'):
                print("Received an HTML error page (probably a 502 Bad Gateway)")
            else:
                print(f"Response body: {response.text[:300]}")

    except requests.exceptions.Timeout:
        print("[TIMEOUT] Request timed out (30 seconds)")
    except requests.exceptions.ConnectionError as e:
        print(f"[CONNECTION ERROR]: {str(e)[:100]}")
    except Exception as e:
        print(f"[UNEXPECTED ERROR]: {str(e)[:100]}")

    return False

def test_api_endpoints():
    """Test the different API endpoints."""
    print("\n[Testing API endpoint availability]")
    print("=" * 50)

    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }

    # Candidate endpoints to try
    endpoints = [
        f"{BASE_URL}/models",
        f"{BASE_URL}/chat/completions",
        BASE_URL
    ]

    for endpoint in endpoints:
        try:
            print(f"\nTesting endpoint: {endpoint}")
            response = requests.get(endpoint, headers=headers, timeout=10)
            print(f"  Status code: {response.status_code}")

            if response.status_code == 200:
                print("  [OK] Endpoint reachable")
                # For JSON responses, show a summary
                try:
                    data = response.json()
                    print("  Response type: JSON")
                    if 'data' in data:
                        print(f"  Contains {len(data['data'])} items")
                except ValueError:
                    print(f"  Response type: {response.headers.get('content-type', 'unknown')}")
            elif response.status_code == 405:
                print("  [OK] Endpoint exists (but does not allow GET)")
            elif response.status_code == 502:
                print("  [ERROR] 502 Bad Gateway - server temporarily unavailable")
            else:
                print("  [ERROR] Not reachable")

        except Exception as e:
            print(f"  [ERROR]: {str(e)[:50]}")

def main():
    print("=" * 50)
    print("Llama API Full Test Program")
    print("=" * 50)
    print(f"API base URL: {BASE_URL}")
    print(f"API key: {API_KEY[:10]}...{API_KEY[-5:]}")

    # First test endpoint availability
    test_api_endpoints()

    print("\n" + "=" * 50)
    print("Testing each model")
    print("=" * 50)

    success_count = 0
    for model in MODELS:
        if test_model(model):
            success_count += 1
        time.sleep(1)  # avoid sending requests too quickly

    print("\n" + "=" * 50)
    print(f"Test result: {success_count}/{len(MODELS)} models connected successfully")

    if success_count == 0:
        print("\nPossible problems:")
        print("1. The API server is temporarily offline (502 error)")
        print("2. The API key may be incorrect")
        print("3. Network connectivity problems")
        print("4. Firewall or proxy settings")
        print("\nTry again later, or ask the API provider to confirm the service status.")

if __name__ == "__main__":
    main()
111
test_with_timeout.py
Normal file
@@ -0,0 +1,111 @@
import requests
import json
from datetime import datetime

# API configuration
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "https://llama.theaken.com/v1"

def test_endpoints():
    """Test the different API endpoints and models."""

    print("=" * 60)
    print(f"Llama API test - {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("=" * 60)

    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }

    # Test configurations
    tests = [
        {
            "name": "GPT-OSS-120B",
            "model": "gpt-oss-120b",
            "prompt": "Say hello in one word"
        },
        {
            "name": "DeepSeek-R1-671B",
            "model": "deepseek-r1-671b",
            "prompt": "Say hello in one word"
        },
        {
            "name": "Qwen3-Embedding-8B",
            "model": "qwen3-embedding-8b",
            "prompt": "Say hello in one word"
        }
    ]

    success_count = 0

    for test in tests:
        print(f"\n[Testing {test['name']}]")
        print("-" * 40)

        data = {
            "model": test["model"],
            "messages": [
                {"role": "user", "content": test["prompt"]}
            ],
            "temperature": 0.5,
            "max_tokens": 20
        }

        try:
            # Use a short timeout
            response = requests.post(
                f"{BASE_URL}/chat/completions",
                headers=headers,
                json=data,
                timeout=15
            )

            print(f"HTTP status: {response.status_code}")

            if response.status_code == 200:
                result = response.json()
                if 'choices' in result:
                    content = result['choices'][0]['message']['content']
                    print(f"[SUCCESS] Response: {content}")
                    success_count += 1
                else:
                    print("[ERROR] Unexpected response format")
            elif response.status_code == 502:
                print("[ERROR] 502 Bad Gateway - the server is not responding")
            elif response.status_code == 401:
                print("[ERROR] 401 Unauthorized - the API key may be wrong")
            elif response.status_code == 404:
                print("[ERROR] 404 Not Found - the model or endpoint does not exist")
            else:
                print(f"[ERROR] Error {response.status_code}")
                if not response.text.startswith('<!DOCTYPE'):
                    print(f"Details: {response.text[:200]}")

        except requests.exceptions.Timeout:
            print("[TIMEOUT] Request timed out (15 seconds)")
        except requests.exceptions.ConnectionError:
            print("[CONNECTION ERROR] Could not connect to the server")
        except Exception as e:
            print(f"[UNKNOWN ERROR]: {str(e)[:100]}")

    # Summary
    print("\n" + "=" * 60)
    print(f"Test result: {success_count}/{len(tests)} successful")

    if success_count == 0:
        print("\nDiagnostics:")
        print("- Network connection: OK (ping succeeds)")
        print("- API endpoint: https://llama.theaken.com/v1")
        print("- Error type: 502 Bad Gateway")
        print("- Likely cause: the backend API service is temporarily offline")
        print("\nSuggested actions:")
        print("1. Try again later (10-30 minutes)")
        print("2. Ask the API administrator to confirm the service status")
        print("3. Check for maintenance announcements")
    else:
        print("\n[OK] The API service is up!")
        print(f"[OK] Usable models: {success_count}")

if __name__ == "__main__":
    test_endpoints()
33
使用說明.txt
Normal file
@@ -0,0 +1,33 @@
===========================================
Llama Model Chat Test Program - Usage Notes
===========================================

Installation:
---------
1. Make sure Python 3.7 or later is installed

2. Install the dependencies:
   pip install -r requirements.txt

Running the program:
---------
python llama_test.py

Features:
---------
1. On startup the program automatically tests the API connection
2. Select the model to use (1-3)
3. Start chatting with the AI
4. Type 'exit' or 'quit' to end the conversation

Available models:
---------
1. gpt-oss-120b (default)
2. deepseek-r1-671b
3. qwen3-embedding-8b

Notes:
---------
- Make sure the network connection is working
- The API key is built into the program
- If you hit an error, check the network connection or contact the administrator
181
操作指南.md
Normal file
@@ -0,0 +1,181 @@
# Llama API Connection Operation Guide

## 1. API Connection Details

### API Key
```
paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo=
```

### Available Endpoints

#### Internal endpoints (tested successfully)
| Endpoint | URL | Status | Supported models |
|---------|-----|------|---------|
| Internal 1 | http://192.168.0.6:21180/v1 | ✅ Available | gpt-oss-120b, deepseek-r1-671b, qwen3-embedding-8b |
| Internal 2 | http://192.168.0.6:21181/v1 | ✅ Available | gpt-oss-120b, deepseek-r1-671b, qwen3-embedding-8b |
| Internal 3 | http://192.168.0.6:21182/v1 | ✅ Available | gpt-oss-120b, deepseek-r1-671b, qwen3-embedding-8b |
| Internal 4 | http://192.168.0.6:21183/v1 | ❌ Error | 500 Internal Server Error |

#### External endpoints (to be tested)
| Endpoint | URL | Status | Supported models |
|---------|-----|------|---------|
| GPT-OSS dedicated | https://llama.theaken.com/v1/gpt-oss-120b | Pending | gpt-oss-120b |
| DeepSeek dedicated | https://llama.theaken.com/v1/deepseek-r1-671b | Pending | deepseek-r1-671b |
| General endpoint | https://llama.theaken.com/v1 | Pending | all models |

## 2. Quick Start

### 1. Install dependencies
```bash
pip install openai
```

### 2. Test the connection (Python)

#### Internal connection example
```python
from openai import OpenAI

# API settings
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1"  # internal endpoint 1

# Create the client
client = OpenAI(
    api_key=API_KEY,
    base_url=BASE_URL
)

# Send a request
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Hello, please introduce yourself"}
    ],
    temperature=0.7,
    max_tokens=200
)

# Show the response
print(response.choices[0].message.content)
```

## 3. Using the Bundled Programs

### Program list
1. **llama_full_api.py** - full chat program (internal and external endpoints)
2. **llama_chat.py** - internal-network chat program
3. **local_api_test.py** - endpoint testing tool
4. **quick_test.py** - quick test script

### Running the chat programs
```bash
# Run the full version (tests every endpoint automatically)
python llama_full_api.py

# Run the internal-network version
python llama_chat.py

# Quick test
python quick_test.py
```

## 4. Chat Program Usage

### Basic flow
1. On startup the program automatically tests the available endpoints
2. Select the endpoint to use (enter a number)
3. Select the model to use
4. Start chatting

### In-chat commands
- `exit` or `quit` - end the conversation
- `clear` - clear conversation history
- `model` - switch models

## 5. Common Problems

### Problem 1: 502 Bad Gateway
**Cause**: the external API server is offline
**Fix**: use the internal endpoints

### Problem 2: Connection Error
**Cause**: not on the internal network, or a wrong IP
**Fix**:
1. Make sure you are on the same network
2. Check the firewall settings
3. Run `ping 192.168.0.6` to test connectivity

### Problem 3: Encoding errors
**Cause**: Windows terminal encoding problems
**Fix**: chat in English or switch the terminal encoding

### Problem 4: Responses contain special markers
**Description**: e.g. `<think>`, `<|channel|>`
**Handling**: the programs filter these markers automatically

## 6. Cleaning the API Response Format

Some model responses may contain thinking-process markers; the programs clean them automatically:
- `<think>...</think>` - thinking process
- `<|channel|>...<|message|>` - channel markers
- `<|end|>`, `<|start|>` - end/start markers

## 7. Test Result Summary

### Passed
✅ Internal endpoints 1-3 all work
✅ Standard OpenAI SDK format supported
✅ Conversations work normally

### To be confirmed
- External endpoints are waiting for the server to come back
- The DeepSeek and Qwen models need further testing

## 8. Technical Details

### Using the OpenAI SDK
```python
from openai import OpenAI

client = OpenAI(
    api_key="your key",
    base_url="API endpoint URL"
)
```

### Using the requests library
```python
import requests

headers = {
    "Authorization": "Bearer your-key",
    "Content-Type": "application/json"
}

data = {
    "model": "gpt-oss-120b",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,
    "max_tokens": 200
}

response = requests.post(
    "API-endpoint-URL/chat/completions",
    headers=headers,
    json=data
)
```

## 9. Recommended Usage

1. **Development and testing**: use the internal endpoints (fast and stable)
2. **Production**: configure multiple endpoints with automatic failover
3. **Chat applications**: use llama_full_api.py
4. **API integration**: see the implementation in quick_test.py

---

Last updated: 2025-09-19
Test environment: Windows / Python 3.13
14
連線參數.txt
Normal file
@@ -0,0 +1,14 @@
The llama models can be reached for AI conversations.
The connection details are as follows:

External connections (base URL https://llama.theaken.com/v1):
https://llama.theaken.com/v1/gpt-oss-120b/
https://llama.theaken.com/v1/deepseek-r1-671b/
https://llama.theaken.com/v1/qwen3-embedding-8b/
External model paths:
1. /gpt-oss-120b/
2. /deepseek-r1-671b/
3. /qwen3-embedding-8b/

Key: paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo=