# Llama API Client

A Python client for connecting to Llama AI models through OpenAI-compatible API endpoints.

## Features
- 🌐 Support for both internal network and external API endpoints
- 🤖 Multiple model support (GPT-OSS-120B, DeepSeek-R1-671B, Qwen3-Embedding-8B)
- 💬 Interactive chat interface with conversation history
- 🔄 Automatic endpoint testing and failover
- 🧹 Automatic response cleaning (removes thinking tags and special markers)
- 📝 Full conversation context management
## Quick Start

### Installation

```bash
# Clone the repository
git clone https://github.com/yourusername/llama-api-client.git
cd llama-api-client

# Install dependencies
pip install -r requirements.txt
```

### Basic Usage
```python
from openai import OpenAI

# Configure API
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1"

# Create client
client = OpenAI(api_key=API_KEY, base_url=BASE_URL)

# Send request
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=200,
)

print(response.choices[0].message.content)
```
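If the endpoint supports streaming (an assumption; this has not been verified against these servers), the same client can print tokens as they arrive:

```python
# Stream the reply token by token (assumes the endpoint supports stream=True)
stream = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=200,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```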
### Run Interactive Chat

```bash
# Full-featured chat with all endpoints
python llama_full_api.py

# Internal network only
python llama_chat.py

# Quick test
python quick_test.py
```
## Available Endpoints

### Internal Network (Tested & Working ✅)

| Endpoint | URL | Status |
|---|---|---|
| Internal 1 | http://192.168.0.6:21180/v1 | ✅ Working |
| Internal 2 | http://192.168.0.6:21181/v1 | ✅ Working |
| Internal 3 | http://192.168.0.6:21182/v1 | ✅ Working |
| Internal 4 | http://192.168.0.6:21183/v1 | ❌ Error 500 |
### External Network

| Endpoint | URL | Status |
|---|---|---|
| GPT-OSS | https://llama.theaken.com/v1/gpt-oss-120b | 🔄 Pending |
| DeepSeek | https://llama.theaken.com/v1/deepseek-r1-671b | 🔄 Pending |
| General | https://llama.theaken.com/v1 | 🔄 Pending |
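The endpoint testing and failover mentioned under Features can be sketched as follows; the endpoint list and one-token probe here are illustrative, not the exact logic in `llama_full_api.py`:

```python
from openai import OpenAI

# Illustrative subset of the internal endpoints listed above
CANDIDATE_URLS = [
    "http://192.168.0.6:21180/v1",
    "http://192.168.0.6:21181/v1",
    "http://192.168.0.6:21182/v1",
]

def first_working_client(api_key: str) -> OpenAI:
    """Return a client for the first endpoint that answers a tiny probe request."""
    for url in CANDIDATE_URLS:
        client = OpenAI(api_key=api_key, base_url=url, timeout=5)
        try:
            client.chat.completions.create(
                model="gpt-oss-120b",
                messages=[{"role": "user", "content": "ping"}],
                max_tokens=1,
            )
            return client  # endpoint answered, use it
        except Exception:
            continue  # endpoint down or erroring, try the next one
    raise RuntimeError("No endpoint reachable")
```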
## Project Structure

```
llama-api-client/
├── README.md            # This file
├── requirements.txt     # Python dependencies
├── 操作指南.md           # Chinese operation guide
├── llama_full_api.py    # Full-featured chat client
├── llama_chat.py        # Internal network chat client
├── local_api_test.py    # Endpoint testing tool
├── quick_test.py        # Quick connection test
├── test_all_models.py   # Model testing script
└── demo_chat.py         # Demo chat with fallback
```
## Chat Commands

During chat sessions, you can use these commands:

- `exit` or `quit` - End the conversation
- `clear` - Clear conversation history
- `model` - Switch between available models
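A minimal sketch of a loop handling these commands (simplified relative to `llama_full_api.py`; `client` is the one created in Basic Usage):

```python
history = []  # full conversation context, resent with every request
model = "gpt-oss-120b"

while True:
    user_input = input("> ").strip()
    if user_input in ("exit", "quit"):
        break
    if user_input == "clear":
        history.clear()  # drop the conversation history
        continue
    if user_input == "model":
        model = input("model name: ").strip()  # e.g. deepseek-r1-671b
        continue
    history.append({"role": "user", "content": user_input})
    reply = client.chat.completions.create(model=model, messages=history)
    content = reply.choices[0].message.content
    history.append({"role": "assistant", "content": content})
    print(content)
```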
## Configuration

### API Key

```python
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
```
### Available Models

- `gpt-oss-120b` - GPT Open Source, 120B parameters
- `deepseek-r1-671b` - DeepSeek R1, 671B parameters
- `qwen3-embedding-8b` - Qwen3 Embedding, 8B parameters
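The first two are chat models; `qwen3-embedding-8b` is an embedding model and would be called through the embeddings route instead. A sketch, assuming the server implements the OpenAI-compatible `/v1/embeddings` endpoint:

```python
# Assumes the server exposes the OpenAI-compatible embeddings route
emb = client.embeddings.create(
    model="qwen3-embedding-8b",
    input="Hello, world!",
)
print(len(emb.data[0].embedding))  # dimensionality of the returned vector
```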
## Troubleshooting

### Issue: 502 Bad Gateway

**Cause:** External API server is offline.

**Solution:** Use internal network endpoints.
### Issue: Connection Error

**Cause:** Not on the internal network, or the IP is incorrect.

**Solution:**

- Verify network connectivity: `ping 192.168.0.6`
- Check firewall settings
- Ensure you're on the same network
### Issue: Encoding Error

**Cause:** Windows terminal encoding issues.

**Solution:** Use English for conversations, or change the terminal encoding.
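On Python 3.7+, one possible workaround is to force UTF-8 output before printing responses:

```python
import sys

# Force UTF-8 output on Windows terminals (Python 3.7+)
sys.stdout.reconfigure(encoding="utf-8")
```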
### Issue: Response Contains Special Markers

**Description:** Responses may contain `<think>` or `<|channel|>` tags.

**Solution:** The client automatically removes these markers.
## Response Cleaning

The client automatically removes these special markers from AI responses:

- `<think>...</think>` - Thinking process
- `<|channel|>...<|message|>` - Channel markers
- `<|end|>`, `<|start|>` - End/start markers
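A minimal sketch of such cleaning with regular expressions (the exact patterns used by the client may differ):

```python
import re

def clean_response(text: str) -> str:
    """Strip thinking blocks and channel/control markers from a raw reply."""
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    text = re.sub(r"<\|channel\|>.*?<\|message\|>", "", text, flags=re.DOTALL)
    return text.replace("<|end|>", "").replace("<|start|>", "").strip()
```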
## Requirements
- Python 3.7+
- openai>=1.0.0
- requests (optional, for direct API calls)
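With `requests`, the same chat endpoint can be called directly, bypassing the SDK. This sketch assumes the standard OpenAI wire format and reuses `API_KEY` from the Configuration section:

```python
import requests

resp = requests.post(
    "http://192.168.0.6:21180/v1/chat/completions",
    headers={"Authorization": "Bearer " + API_KEY},
    json={
        "model": "gpt-oss-120b",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 200,
    },
    timeout=30,
)
print(resp.json()["choices"][0]["message"]["content"])
```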
## Development

### Testing Connection

```bash
python -c "from openai import OpenAI; client = OpenAI(api_key='YOUR_KEY', base_url='YOUR_URL'); print(client.chat.completions.create(model='gpt-oss-120b', messages=[{'role': 'user', 'content': 'test'}], max_tokens=5).choices[0].message.content)"
```
### Adding New Endpoints

Edit the `ENDPOINTS` dictionary in `llama_full_api.py`:

```python
ENDPOINTS = {
    "internal": [
        {
            "name": "New Endpoint",
            "url": "http://new-endpoint/v1",
            "models": ["gpt-oss-120b"]
        }
    ]
}
```
## License

MIT License - see the LICENSE file for details.
## Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## Support
For issues or questions:
- Check the 操作指南.md for detailed Chinese documentation
- Open an issue on GitHub
- Contact the API administrator for server-related issues
## Acknowledgments
- Built with OpenAI Python SDK
- Compatible with OpenAI API format
- Supports multiple Llama model variants
**Last Updated:** 2025-09-19

**Version:** 1.0.0

**Status:** Internal endpoints working, external endpoints pending