# Llama API Client

A Python client for connecting to Llama AI models through OpenAI-compatible API endpoints.

## Features

- 🌐 Support for both internal network and external API endpoints
- 🤖 Multiple model support (GPT-OSS-120B, DeepSeek-R1-671B, Qwen3-Embedding-8B)
- 💬 Interactive chat interface with conversation history
- 🔄 Automatic endpoint testing and failover
- 🧹 Automatic response cleaning (removes thinking tags and special markers)
- 📝 Full conversation context management

## Quick Start

### Installation

```bash
# Clone the repository
git clone https://github.com/yourusername/llama-api-client.git
cd llama-api-client

# Install dependencies
pip install -r requirements.txt
```

### Basic Usage

```python
from openai import OpenAI

# Configure API
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1"

# Create client
client = OpenAI(api_key=API_KEY, base_url=BASE_URL)

# Send request
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=200
)

print(response.choices[0].message.content)
```

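
The request above is single-turn. One common way to carry conversation history across turns is to keep a running `messages` list and send it with every request. The following is a minimal sketch of that idea, not the code used in `llama_chat.py`; the `Conversation` class and its `max_turns` cap are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Message = Dict[str, str]


@dataclass
class Conversation:
    """Hypothetical helper that stores chat history for multi-turn requests."""

    max_turns: int = 10  # assumed cap on remembered exchanges; must be >= 1
    messages: List[Message] = field(default_factory=list)

    def add_user(self, text: str) -> None:
        """Record a user message before sending the request."""
        self.messages.append({"role": "user", "content": text})

    def add_assistant(self, text: str) -> None:
        """Record the model's reply, then drop the oldest exchanges."""
        self.messages.append({"role": "assistant", "content": text})
        # Trim only after a full exchange so the history always starts
        # with a user message.
        limit = self.max_turns * 2
        if len(self.messages) > limit:
            del self.messages[:-limit]

    def clear(self) -> None:
        """Forget the whole conversation."""
        self.messages.clear()
```

With a helper like this, each turn would call `add_user(...)`, pass `conversation.messages` as the `messages` argument of `client.chat.completions.create(...)`, and then store the reply with `add_assistant(...)`.
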
### Run Interactive Chat

```bash
# Full-featured chat with all endpoints
python llama_full_api.py

# Internal network only
python llama_chat.py

# Quick test
python quick_test.py
```

## Available Endpoints

### Internal Network (Tested)

| Endpoint | URL | Status |
|----------|-----|--------|
| Internal 1 | `http://192.168.0.6:21180/v1` | ✅ Working |
| Internal 2 | `http://192.168.0.6:21181/v1` | ✅ Working |
| Internal 3 | `http://192.168.0.6:21182/v1` | ✅ Working |
| Internal 4 | `http://192.168.0.6:21183/v1` | ❌ Error 500 |

### External Network

| Endpoint | URL | Status |
|----------|-----|--------|
| GPT-OSS | `https://llama.theaken.com/v1/gpt-oss-120b` | 🔄 Pending |
| DeepSeek | `https://llama.theaken.com/v1/deepseek-r1-671b` | 🔄 Pending |
| General | `https://llama.theaken.com/v1` | 🔄 Pending |

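
Because endpoint status differs across the tables above, a client can probe the URLs in order and use the first one that answers. The sketch below illustrates that failover idea under stated assumptions; it is not the logic in `llama_full_api.py`. The function names `probe_endpoint` and `first_working` are hypothetical, and the probe simply sends a one-token chat request.

```python
from typing import Callable, Iterable, Optional


def probe_endpoint(base_url: str, api_key: str, model: str,
                   timeout: float = 5.0) -> bool:
    """Return True if a tiny chat completion succeeds against base_url."""
    # Imported lazily so the selection logic below works without the SDK.
    from openai import OpenAI

    try:
        client = OpenAI(api_key=api_key, base_url=base_url,
                        timeout=timeout, max_retries=0)
        client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "ping"}],
            max_tokens=1,
        )
        return True
    except Exception:
        # Any failure (connection error, 500, 502, timeout) counts as "down".
        return False


def first_working(urls: Iterable[str],
                  probe: Callable[[str], bool]) -> Optional[str]:
    """Return the first URL for which probe(url) is True, else None."""
    for url in urls:
        if probe(url):
            return url
    return None
```

A caller might run `first_working(urls, lambda u: probe_endpoint(u, API_KEY, "gpt-oss-120b"))` and fall back to an error message when the result is `None`.
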
## Project Structure

```
llama-api-client/
├── README.md              # This file
├── requirements.txt       # Python dependencies
├── 操作指南.md             # Chinese operation guide
├── llama_full_api.py      # Full-featured chat client
├── llama_chat.py          # Internal network chat client
├── local_api_test.py      # Endpoint testing tool
├── quick_test.py          # Quick connection test
├── test_all_models.py     # Model testing script
└── demo_chat.py           # Demo chat with fallback
```

## Chat Commands

During chat sessions, you can use these commands:

- `exit` or `quit` - End the conversation
- `clear` - Clear conversation history
- `model` - Switch between available models

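
The commands listed above can be handled by a small dispatcher that runs before each message is sent to the model. This is a hedged sketch rather than the client's actual loop: `handle_command` is a hypothetical name, and cycling to the next model on `model` is an assumption about how switching might work.

```python
from typing import Dict, List, Tuple


def handle_command(text: str, history: List[Dict[str, str]],
                   models: List[str], current: str) -> Tuple[str, str]:
    """Interpret one line of user input.

    Returns (action, model), where action is one of
    "exit", "cleared", "switched", or "chat".
    """
    command = text.strip().lower()
    if command in ("exit", "quit"):
        return "exit", current
    if command == "clear":
        history.clear()  # drop the stored conversation in place
        return "cleared", current
    if command == "model":
        # Assumption: switching means moving to the next model in the list.
        next_model = models[(models.index(current) + 1) % len(models)]
        return "switched", next_model
    # Anything else is an ordinary chat message.
    return "chat", current
```
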
## Configuration

### API Key

```python
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
```

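
As an alternative to a key written into the source file, the key can be read from the environment at startup. The sketch below assumes two environment variable names, `LLAMA_API_KEY` and `LLAMA_BASE_URL`; both are hypothetical and are not defined anywhere in this project.

```python
import os
from typing import Dict, Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read the API key and base URL from environment variables.

    LLAMA_API_KEY and LLAMA_BASE_URL are assumed names for illustration.
    """
    api_key = env.get("LLAMA_API_KEY")
    if not api_key:
        raise RuntimeError("Set LLAMA_API_KEY before starting the client")
    # Fall back to the first internal endpoint listed in this README.
    base_url = env.get("LLAMA_BASE_URL", "http://192.168.0.6:21180/v1")
    return {"api_key": api_key, "base_url": base_url}
```

The result can be passed straight to the SDK, for example `OpenAI(**load_settings())`.
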
### Available Models

- `gpt-oss-120b` - GPT Open Source 120B parameters
- `deepseek-r1-671b` - DeepSeek R1 671B parameters
- `qwen3-embedding-8b` - Qwen3 Embedding 8B parameters

## Troubleshooting

### Issue: 502 Bad Gateway

**Cause**: The external API server is offline
**Solution**: Use the internal network endpoints

### Issue: Connection Error

**Cause**: The machine is not on the internal network, or the IP address is incorrect
**Solution**:
1. Verify network connectivity: `ping 192.168.0.6`
2. Check firewall settings
3. Ensure you're on the same network as the server

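
`ping` only shows that the host is reachable; it does not show that the API port is open. A TCP connection attempt checks the port itself. The helper below is a small sketch using only the standard library; `can_connect` is a hypothetical name.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For example, `can_connect("192.168.0.6", 21180)` tests the first internal endpoint listed in this README.
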
### Issue: Encoding Error

**Cause**: The Windows terminal is not using a Unicode encoding
**Solution**: Switch the terminal to UTF-8 (for example, run `chcp 65001` in the Windows console) or keep conversations in English

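
The encoding can also be changed from inside Python. The sketch below switches the output streams to UTF-8 with `reconfigure`, which text streams provide on Python 3.7 and later; `force_utf8` is a hypothetical helper name and is not part of this project.

```python
import sys
from typing import Iterable, Optional, TextIO


def force_utf8(streams: Optional[Iterable[TextIO]] = None) -> None:
    """Switch the given text streams (default: stdout and stderr) to UTF-8."""
    if streams is None:
        streams = (sys.stdout, sys.stderr)
    for stream in streams:
        # Replaced or captured streams may not support reconfigure; skip them.
        reconfigure = getattr(stream, "reconfigure", None)
        if reconfigure is not None:
            # errors="replace" avoids a crash on characters that still
            # cannot be encoded.
            reconfigure(encoding="utf-8", errors="replace")
```

Calling `force_utf8()` once at the top of a chat script, before anything is printed, is usually enough.
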
### Issue: Response Contains Special Markers

**Description**: Responses may contain tags such as `<think>` or `<|channel|>`
**Solution**: The client removes these markers automatically

## Response Cleaning

The client automatically removes these special markers from AI responses:

- `<think>...</think>` - Thinking process
- `<|channel|>...<|message|>` - Channel markers
- `<|end|>`, `<|start|>` - End/start markers

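
The marker patterns listed above map naturally onto regular expressions. The following sketch shows one way such cleaning could be written; it is an illustration based only on the patterns named in this README, not the client's own implementation, and `clean_response` is a hypothetical name.

```python
import re

# <think>...</think> blocks, which may span several lines
_THINK = re.compile(r"<think>.*?</think>", re.DOTALL)
# everything from <|channel|> up to and including the next <|message|>
_CHANNEL = re.compile(r"<\|channel\|>.*?<\|message\|>", re.DOTALL)
# standalone <|end|> and <|start|> markers
_MARKERS = re.compile(r"<\|(?:end|start)\|>")


def clean_response(text: str) -> str:
    """Strip thinking blocks and special markers from a model response."""
    text = _THINK.sub("", text)
    text = _CHANNEL.sub("", text)
    text = _MARKERS.sub("", text)
    return text.strip()
```

The patterns are non-greedy so that two separate `<think>` blocks in one response are removed individually rather than swallowing the text between them.
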
## Requirements

- Python 3.7+
- `openai>=1.0.0`
- `requests` (optional, for direct API calls)

## Development

### Testing Connection

```bash
python -c "from openai import OpenAI; client = OpenAI(api_key='YOUR_KEY', base_url='YOUR_URL'); print(client.chat.completions.create(model='gpt-oss-120b', messages=[{'role': 'user', 'content': 'test'}], max_tokens=5).choices[0].message.content)"
```

### Adding New Endpoints

Edit the `ENDPOINTS` dictionary in `llama_full_api.py`:

```python
ENDPOINTS = {
    "internal": [
        {
            "name": "New Endpoint",
            "url": "http://new-endpoint/v1",
            "models": ["gpt-oss-120b"]
        }
    ]
}
```

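
A dictionary shaped like this makes it easy to look up which endpoints serve a given model. The sketch below assumes the structure shown above; the `"external"` group, its placeholder URL, and the `urls_for_model` helper are hypothetical and exist only to make the lookup non-trivial.

```python
from typing import Dict, List

ENDPOINTS: Dict[str, List[dict]] = {
    "internal": [
        {
            "name": "New Endpoint",
            "url": "http://new-endpoint/v1",
            "models": ["gpt-oss-120b"],
        },
    ],
    # Hypothetical second group with a placeholder URL, for illustration only.
    "external": [
        {
            "name": "Example External",
            "url": "https://example.invalid/v1",
            "models": ["gpt-oss-120b", "deepseek-r1-671b"],
        },
    ],
}


def urls_for_model(endpoints: Dict[str, List[dict]], model: str) -> List[str]:
    """Return the URLs of every endpoint that lists the given model."""
    return [
        entry["url"]
        for group in endpoints.values()
        for entry in group
        if model in entry["models"]
    ]
```
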
## License

MIT License - See LICENSE file for details

## Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## Support

For issues or questions:

1. Check the [操作指南.md](操作指南.md) for detailed Chinese documentation
2. Open an issue on GitHub
3. Contact the API administrator for server-related issues

## Acknowledgments

- Built with OpenAI Python SDK
- Compatible with OpenAI API format
- Supports multiple Llama model variants

---

**Last Updated**: 2025-09-19
**Version**: 1.0.0
**Status**: Internal endpoints 1-3 working (Internal 4 returns Error 500); external endpoints pending