# Llama API Client

A Python client for connecting to Llama AI models through OpenAI-compatible API endpoints.

## Features

- 🌐 Support for both internal network and external API endpoints
- 🤖 Multiple model support (GPT-OSS-120B, DeepSeek-R1-671B, Qwen3-Embedding-8B)
- 💬 Interactive chat interface with conversation history
- 🔄 Automatic endpoint testing and failover
- 🧹 Automatic response cleaning (removes thinking tags and special markers)
- 📝 Full conversation context management

## Quick Start

### Installation

```bash
# Clone the repository
git clone https://github.com/yourusername/llama-api-client.git
cd llama-api-client

# Install dependencies
pip install -r requirements.txt
```

### Basic Usage

```python
from openai import OpenAI

# Configure API
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1"

# Create client
client = OpenAI(api_key=API_KEY, base_url=BASE_URL)

# Send request
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=200
)

print(response.choices[0].message.content)
```

### Run Interactive Chat

```bash
# Full-featured chat with all endpoints
python llama_full_api.py

# Internal network only
python llama_chat.py

# Quick test
python quick_test.py
```

## Available Endpoints

### Internal Network (Tested)

| Endpoint | URL | Status |
|----------|-----|--------|
| Internal 1 | `http://192.168.0.6:21180/v1` | ✅ Working |
| Internal 2 | `http://192.168.0.6:21181/v1` | ✅ Working |
| Internal 3 | `http://192.168.0.6:21182/v1` | ✅ Working |
| Internal 4 | `http://192.168.0.6:21183/v1` | ❌ Error 500 |

### External Network

| Endpoint | URL | Status |
|----------|-----|--------|
| GPT-OSS | `https://llama.theaken.com/v1/gpt-oss-120b` | 🔄 Pending |
| DeepSeek | `https://llama.theaken.com/v1/deepseek-r1-671b` | 🔄 Pending |
| General | `https://llama.theaken.com/v1` | 🔄 Pending |

## Project Structure

```
llama-api-client/
├── README.md            # This file
├── requirements.txt     # Python dependencies
├── 操作指南.md          # Chinese operation guide
├── llama_full_api.py    # Full-featured chat client
├── llama_chat.py        # Internal network chat client
├── local_api_test.py    # Endpoint testing tool
├── quick_test.py        # Quick connection test
├── test_all_models.py   # Model testing script
└── demo_chat.py         # Demo chat with fallback
```

## Chat Commands

During chat sessions, you can use these commands:

- `exit` or `quit` - End the conversation
- `clear` - Clear conversation history
- `model` - Switch between available models

## Configuration

### API Key

```python
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
```

### Available Models

- `gpt-oss-120b` - GPT Open Source, 120B parameters
- `deepseek-r1-671b` - DeepSeek R1, 671B parameters
- `qwen3-embedding-8b` - Qwen3 Embedding, 8B parameters

## Troubleshooting

### Issue: 502 Bad Gateway

**Cause**: External API server is offline
**Solution**: Use internal network endpoints

### Issue: Connection Error

**Cause**: Not on the internal network, or incorrect IP
**Solution**:

1. Verify network connectivity: `ping 192.168.0.6`
2. Check firewall settings
3. Ensure you're on the same network

(To check which endpoints respond, see the probe sketch at the end of the Response Cleaning section below.)

### Issue: Encoding Error

**Cause**: Windows terminal encoding issues
**Solution**: Use English for conversations or adjust the terminal encoding

### Issue: Response Contains Special Markers

**Description**: Responses may contain `<think>` and `<|channel|>` tags
**Solution**: The client removes these markers automatically

## Response Cleaning

The client automatically removes these special markers from AI responses:

- `<think>...</think>` - Thinking process
- `<|channel|>...<|message|>` - Channel markers
- `<|end|>`, `<|start|>` - End/start markers
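For reference, stripping markers like these takes only a few regular expressions. This is a minimal sketch; the function name and exact patterns are illustrative assumptions, not the client's actual implementation:

```python
import re

def clean_response(text: str) -> str:
    """Strip thinking blocks and special markers (illustrative patterns)."""
    # Remove <think>...</think> reasoning blocks (non-greedy, spans newlines)
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    # Remove channel headers such as <|channel|>...<|message|>
    text = re.sub(r"<\|channel\|>.*?<\|message\|>", "", text, flags=re.DOTALL)
    # Remove bare start/end markers
    text = re.sub(r"<\|(?:start|end)\|>", "", text)
    return text.strip()

print(clean_response("<think>reasoning</think>Hello!<|end|>"))  # -> Hello!
```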
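Similarly, the endpoint testing and failover mentioned under Features can be approximated by probing each base URL before use. This sketch assumes the standard OpenAI-compatible `GET /models` route and is independent of the repository's own `local_api_test.py`:

```python
import requests

API_KEY = "YOUR_KEY"  # some servers require the key even for /models

# Internal endpoints from the table above
ENDPOINTS = [
    "http://192.168.0.6:21180/v1",
    "http://192.168.0.6:21181/v1",
    "http://192.168.0.6:21182/v1",
    "http://192.168.0.6:21183/v1",
]

def probe(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers the /models listing."""
    try:
        r = requests.get(
            f"{base_url}/models",
            headers={"Authorization": f"Bearer {API_KEY}"},
            timeout=timeout,
        )
        return r.status_code == 200
    except requests.RequestException:
        return False

# Fail over to the first endpoint that responds
working = next((url for url in ENDPOINTS if probe(url)), None)
print("Using:", working or "no endpoint reachable")
```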
## Requirements

- Python 3.7+
- openai>=1.0.0
- requests (optional, for direct API calls)

## Development

### Testing Connection

```bash
python -c "from openai import OpenAI; client = OpenAI(api_key='YOUR_KEY', base_url='YOUR_URL'); print(client.chat.completions.create(model='gpt-oss-120b', messages=[{'role': 'user', 'content': 'test'}], max_tokens=5).choices[0].message.content)"
```

### Adding New Endpoints

Edit the `ENDPOINTS` dictionary in `llama_full_api.py`:

```python
ENDPOINTS = {
    "internal": [
        {
            "name": "New Endpoint",
            "url": "http://new-endpoint/v1",
            "models": ["gpt-oss-120b"]
        }
    ]
}
```

## License

MIT License - see the LICENSE file for details.

## Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## Support

For issues or questions:

1. Check [操作指南.md](操作指南.md) for detailed Chinese documentation
2. Open an issue on GitHub
3. Contact the API administrator for server-related issues

## Acknowledgments

- Built with the OpenAI Python SDK
- Compatible with the OpenAI API format
- Supports multiple Llama model variants

---

**Last Updated**: 2025-09-19
**Version**: 1.0.0
**Status**: Internal endpoints working, external endpoints pending