
Llama API Client

A Python client for connecting to Llama AI models through OpenAI-compatible API endpoints.

Features

  • 🌐 Support for both internal network and external API endpoints
  • 🤖 Multiple model support (GPT-OSS-120B, DeepSeek-R1-671B, Qwen3-Embedding-8B)
  • 💬 Interactive chat interface with conversation history
  • 🔄 Automatic endpoint testing and failover
  • 🧹 Automatic response cleaning (removes thinking tags and special markers)
  • 📝 Full conversation context management

Quick Start

Installation

# Clone the repository
git clone https://github.com/yourusername/llama-api-client.git
cd llama-api-client

# Install dependencies
pip install -r requirements.txt

Basic Usage

from openai import OpenAI

# Configure API
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1"

# Create client
client = OpenAI(api_key=API_KEY, base_url=BASE_URL)

# Send request
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=200
)

print(response.choices[0].message.content)

Run Interactive Chat

# Full-featured chat with all endpoints
python llama_full_api.py

# Internal network only
python llama_chat.py

# Quick test
python quick_test.py

Available Endpoints

Internal Network (Tested & Working)

Endpoint     URL                           Status
Internal 1   http://192.168.0.6:21180/v1   Working
Internal 2   http://192.168.0.6:21181/v1   Working
Internal 3   http://192.168.0.6:21182/v1   Working
Internal 4   http://192.168.0.6:21183/v1   Error 500

External Network

Endpoint   URL                                             Status
GPT-OSS    https://llama.theaken.com/v1/gpt-oss-120b       🔄 Pending
DeepSeek   https://llama.theaken.com/v1/deepseek-r1-671b   🔄 Pending
General    https://llama.theaken.com/v1                    🔄 Pending

Project Structure

llama-api-client/
├── README.md                 # This file
├── requirements.txt          # Python dependencies
├── 操作指南.md               # Chinese operation guide
├── llama_full_api.py        # Full-featured chat client
├── llama_chat.py            # Internal network chat client
├── local_api_test.py        # Endpoint testing tool
├── quick_test.py            # Quick connection test
├── test_all_models.py       # Model testing script
└── demo_chat.py             # Demo chat with fallback

Chat Commands

During chat sessions, you can use these commands:

  • exit or quit - End the conversation
  • clear - Clear conversation history
  • model - Switch between available models
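Below is a minimal sketch of how such a command loop might be wired, reusing the client from Basic Usage; the actual loop in llama_full_api.py may differ in detail.

# Hypothetical sketch of the chat command loop; assumes `client`
# is the OpenAI client created in Basic Usage.
model = "gpt-oss-120b"
history = []
while True:
    user_input = input("> ").strip()
    if user_input.lower() in ("exit", "quit"):
        break                                   # end the conversation
    if user_input.lower() == "clear":
        history = []                            # drop conversation history
        continue
    if user_input.lower() == "model":
        model = input("Model name: ").strip()   # switch models
        continue
    history.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(model=model, messages=history)
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    print(reply)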

Configuration

API Key

API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="

Available Models

  • gpt-oss-120b - open-source GPT model, 120B parameters
  • deepseek-r1-671b - DeepSeek R1, 671B parameters
  • qwen3-embedding-8b - Qwen3 embedding model, 8B parameters
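Note that qwen3-embedding-8b is an embedding model, so it is called through the embeddings endpoint rather than chat completions. A short example using the same client, assuming the server implements the standard OpenAI embeddings route:

# Request an embedding vector (assumes `client` from Basic Usage).
response = client.embeddings.create(
    model="qwen3-embedding-8b",
    input="Hello, world!"
)
vector = response.data[0].embedding  # list of floats
print(len(vector))                   # embedding dimensionality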

Troubleshooting

Issue: 502 Bad Gateway

Cause: The external API server is offline or not yet available
Solution: Use the internal network endpoints

Issue: Connection Error

Cause: Not on the internal network, or the IP address is incorrect
Solution:

  1. Verify network connectivity: ping 192.168.0.6
  2. Check firewall settings
  3. Ensure you're on the same network
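If ping succeeds but requests still fail, you can check the API itself by listing models. This is a sketch that assumes the endpoint implements the standard OpenAI /v1/models route:

from openai import OpenAI

# Probe the API itself, not just the host (hypothetical quick check).
client = OpenAI(api_key="YOUR_KEY", base_url="http://192.168.0.6:21180/v1")
try:
    models = client.models.list()
    print([m.id for m in models.data])
except Exception as e:
    print("Endpoint unreachable:", e)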

Issue: Encoding Error

Cause: Windows terminal encoding issues
Solution: Use English for conversations or modify terminal encoding
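One possible fix is to force UTF-8 output before printing responses (Python 3.7+):

import sys

# Force UTF-8 output on Windows terminals (Python 3.7+).
sys.stdout.reconfigure(encoding="utf-8")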

Issue: Response Contains Special Markers

Description: Responses may contain <think> or <|channel|> tags
Solution: The client automatically removes these markers

Response Cleaning

The client automatically removes these special markers from AI responses:

  • <think>...</think> - Thinking process
  • <|channel|>...<|message|> - Channel markers
  • <|end|>, <|start|> - End/start markers
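A minimal sketch of this cleaning step using regular expressions; the actual implementation in the client scripts may differ:

import re

def clean_response(text: str) -> str:
    # Strip <think>...</think> blocks, including multi-line content.
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    # Strip <|channel|>...<|message|> channel markers.
    text = re.sub(r"<\|channel\|>.*?<\|message\|>", "", text, flags=re.DOTALL)
    # Strip bare start/end markers.
    text = text.replace("<|end|>", "").replace("<|start|>", "")
    return text.strip()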

Requirements

  • Python 3.7+
  • openai>=1.0.0
  • requests (optional, for direct API calls)

Development

Testing Connection

python -c "from openai import OpenAI; client = OpenAI(api_key='YOUR_KEY', base_url='YOUR_URL'); print(client.chat.completions.create(model='gpt-oss-120b', messages=[{'role': 'user', 'content': 'test'}], max_tokens=5).choices[0].message.content)"

Adding New Endpoints

Edit ENDPOINTS dictionary in llama_full_api.py:

ENDPOINTS = {
    "internal": [
        {
            "name": "New Endpoint",
            "url": "http://new-endpoint/v1",
            "models": ["gpt-oss-120b"]
        }
    ]
}
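A hedged sketch of how failover over this dictionary might work (the real logic lives in llama_full_api.py and may differ): probe each endpoint in order and keep the first one that answers.

from openai import OpenAI

def first_working_endpoint(endpoints, api_key):
    # Probe endpoints in order; return a client for the first one
    # that answers a minimal request. Hypothetical helper.
    for group in endpoints.values():
        for ep in group:
            client = OpenAI(api_key=api_key, base_url=ep["url"])
            try:
                client.chat.completions.create(
                    model=ep["models"][0],
                    messages=[{"role": "user", "content": "ping"}],
                    max_tokens=1,
                )
                return client
            except Exception:
                continue  # endpoint down; try the next one
    return None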

License

MIT License - See LICENSE file for details

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Support

For issues or questions:

  1. Check the 操作指南.md for detailed Chinese documentation
  2. Open an issue on GitHub
  3. Contact the API administrator for server-related issues

Acknowledgments

  • Built with OpenAI Python SDK
  • Compatible with OpenAI API format
  • Supports multiple Llama model variants

Last Updated: 2025-09-19
Version: 1.0.0
Status: Internal endpoints working, external endpoints pending