# Llama API Client

A Python client for connecting to Llama AI models through OpenAI-compatible API endpoints.

## Features
- 🌐 Support for both internal network and external API endpoints
- 🤖 Multiple model support (GPT-OSS-120B, DeepSeek-R1-671B, Qwen3-Embedding-8B)
- 💬 Interactive chat interface with conversation history
- 🔄 Automatic endpoint testing and failover
- 🧹 Automatic response cleaning (removes thinking tags and special markers)
- 📝 Full conversation context management
## Quick Start

### Installation

```bash
# Clone the repository
git clone https://github.com/yourusername/llama-api-client.git
cd llama-api-client

# Install dependencies
pip install -r requirements.txt
```

### Basic Usage
```python
from openai import OpenAI

# Configure API
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1"

# Create client
client = OpenAI(api_key=API_KEY, base_url=BASE_URL)

# Send request
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=200,
)

print(response.choices[0].message.content)
```
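If the endpoint supports streaming (an assumption; this has not been verified against these servers), the same client can print tokens as they arrive:

```python
# Stream the reply token by token (assumes the endpoint supports stream=True)
stream = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=200,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```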
### Run Interactive Chat

```bash
# Full-featured chat with all endpoints
python llama_full_api.py

# Internal network only
python llama_chat.py

# Quick test
python quick_test.py
```
## Available Endpoints

### Internal Network (Tested & Working ✅)

| Endpoint | URL | Status |
|---|---|---|
| Internal 1 | http://192.168.0.6:21180/v1 | ✅ Working |
| Internal 2 | http://192.168.0.6:21181/v1 | ✅ Working |
| Internal 3 | http://192.168.0.6:21182/v1 | ✅ Working |
| Internal 4 | http://192.168.0.6:21183/v1 | ❌ Error 500 |
### External Network

| Endpoint | URL | Status |
|---|---|---|
| GPT-OSS | https://llama.theaken.com/v1/gpt-oss-120b | 🔄 Pending |
| DeepSeek | https://llama.theaken.com/v1/deepseek-r1-671b | 🔄 Pending |
| General | https://llama.theaken.com/v1 | 🔄 Pending |
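The endpoint testing and failover mentioned under Features can be sketched as follows; the endpoint list and one-token probe here are illustrative, not the exact logic in `llama_full_api.py`:

```python
from openai import OpenAI

# Illustrative subset of the internal endpoints listed above
CANDIDATE_URLS = [
    "http://192.168.0.6:21180/v1",
    "http://192.168.0.6:21181/v1",
    "http://192.168.0.6:21182/v1",
]

def first_working_client(api_key: str) -> OpenAI:
    """Return a client for the first endpoint that answers a tiny probe request."""
    for url in CANDIDATE_URLS:
        client = OpenAI(api_key=api_key, base_url=url, timeout=5)
        try:
            client.chat.completions.create(
                model="gpt-oss-120b",
                messages=[{"role": "user", "content": "ping"}],
                max_tokens=1,
            )
            return client  # endpoint answered, use it
        except Exception:
            continue  # endpoint down or erroring, try the next one
    raise RuntimeError("No endpoint reachable")
```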
## Project Structure

```
llama-api-client/
├── README.md            # This file
├── requirements.txt     # Python dependencies
├── 操作指南.md           # Chinese operation guide
├── llama_full_api.py    # Full-featured chat client
├── llama_chat.py        # Internal network chat client
├── local_api_test.py    # Endpoint testing tool
├── quick_test.py        # Quick connection test
├── test_all_models.py   # Model testing script
└── demo_chat.py         # Demo chat with fallback
```
## Chat Commands

During chat sessions, you can use these commands:

- `exit` or `quit` - End the conversation
- `clear` - Clear conversation history
- `model` - Switch between available models
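A minimal sketch of a loop handling these commands (simplified relative to `llama_full_api.py`; `client` is the one created in Basic Usage):

```python
history = []  # full conversation context, resent with every request
model = "gpt-oss-120b"

while True:
    user_input = input("> ").strip()
    if user_input in ("exit", "quit"):
        break
    if user_input == "clear":
        history.clear()  # drop the conversation history
        continue
    if user_input == "model":
        model = input("model name: ").strip()  # e.g. deepseek-r1-671b
        continue
    history.append({"role": "user", "content": user_input})
    reply = client.chat.completions.create(model=model, messages=history)
    content = reply.choices[0].message.content
    history.append({"role": "assistant", "content": content})
    print(content)
```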
## Configuration

### API Key

```python
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
```
### Available Models

- `gpt-oss-120b` - GPT Open Source, 120B parameters
- `deepseek-r1-671b` - DeepSeek R1, 671B parameters
- `qwen3-embedding-8b` - Qwen3 Embedding, 8B parameters
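The first two are chat models; `qwen3-embedding-8b` is an embedding model and would be called through the embeddings route instead. A sketch, assuming the server implements the OpenAI-compatible `/v1/embeddings` endpoint:

```python
# Assumes the server exposes the OpenAI-compatible embeddings route
emb = client.embeddings.create(
    model="qwen3-embedding-8b",
    input="Hello, world!",
)
print(len(emb.data[0].embedding))  # dimensionality of the returned vector
```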
## Troubleshooting

### Issue: 502 Bad Gateway

**Cause:** External API server is offline.

**Solution:** Use internal network endpoints.
### Issue: Connection Error

**Cause:** Not on the internal network, or the IP is incorrect.

**Solution:**

- Verify network connectivity: `ping 192.168.0.6`
- Check firewall settings
- Ensure you're on the same network
### Issue: Encoding Error

**Cause:** Windows terminal encoding issues.

**Solution:** Use English for conversations, or change the terminal encoding.
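On Python 3.7+, one possible workaround is to force UTF-8 output before printing responses:

```python
import sys

# Force UTF-8 output on Windows terminals (Python 3.7+)
sys.stdout.reconfigure(encoding="utf-8")
```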
### Issue: Response Contains Special Markers

**Description:** Responses may contain `<think>` or `<|channel|>` tags.

**Solution:** The client automatically removes these markers.
## Response Cleaning

The client automatically removes these special markers from AI responses:

- `<think>...</think>` - Thinking process
- `<|channel|>...<|message|>` - Channel markers
- `<|end|>`, `<|start|>` - End/start markers
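A minimal sketch of such cleaning with regular expressions (the exact patterns used by the client may differ):

```python
import re

def clean_response(text: str) -> str:
    """Strip thinking blocks and channel/control markers from a raw reply."""
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    text = re.sub(r"<\|channel\|>.*?<\|message\|>", "", text, flags=re.DOTALL)
    return text.replace("<|end|>", "").replace("<|start|>", "").strip()
```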
## Requirements
- Python 3.7+
- openai>=1.0.0
- requests (optional, for direct API calls)
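With `requests`, the same chat endpoint can be called directly, bypassing the SDK. This sketch assumes the standard OpenAI wire format and reuses `API_KEY` from the Configuration section:

```python
import requests

resp = requests.post(
    "http://192.168.0.6:21180/v1/chat/completions",
    headers={"Authorization": "Bearer " + API_KEY},
    json={
        "model": "gpt-oss-120b",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 200,
    },
    timeout=30,
)
print(resp.json()["choices"][0]["message"]["content"])
```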
## Development

### Testing Connection

```bash
python -c "from openai import OpenAI; client = OpenAI(api_key='YOUR_KEY', base_url='YOUR_URL'); print(client.chat.completions.create(model='gpt-oss-120b', messages=[{'role': 'user', 'content': 'test'}], max_tokens=5).choices[0].message.content)"
```
### Adding New Endpoints

Edit the `ENDPOINTS` dictionary in `llama_full_api.py`:

```python
ENDPOINTS = {
    "internal": [
        {
            "name": "New Endpoint",
            "url": "http://new-endpoint/v1",
            "models": ["gpt-oss-120b"]
        }
    ]
}
```
## License

MIT License - see the LICENSE file for details.
## Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## Support
For issues or questions:
- Check the 操作指南.md for detailed Chinese documentation
- Open an issue on GitHub
- Contact the API administrator for server-related issues
## Acknowledgments
- Built with OpenAI Python SDK
- Compatible with OpenAI API format
- Supports multiple Llama model variants
**Last Updated:** 2025-09-19

**Version:** 1.0.0

**Status:** Internal endpoints working, external endpoints pending