Compare commits


6 Commits

Author SHA1 Message Date
bad760a214 Consolidate all files into the pj_llama directory
- Migrate the complete project into the pj_llama directory
- Include all of the latest files and the Git history
- Remove old files that are no longer needed
- Unify the development and runtime environments
2025-09-19 23:32:09 +08:00
ece261ed8a Add internal/external network selection
New features:
llama_universal.py - the main program now supports choosing an internal or external network environment
🌐 Internal endpoints: http://192.168.0.6:21180-21182/v1
🌐 External endpoints: https://llama.theaken.com/v1/*
🔄 The 'switch' command changes environments mid-conversation
📱 Smart environment detection and error handling

Updates:
- Add a network environment selection interface
- Support three internal endpoints (21180, 21181, 21182)
- Support three external endpoints (general, GPT-only, DeepSeek-only)
- Add the switch command for changing environments during a chat
- Complete error handling and retry mechanism
- Update README.md to describe the new features and usage

Users can now pick whichever connection best fits their network environment!
2025-09-19 23:27:19 +08:00
390a8cc7f7 Remove non-existent backup endpoints
Changes:
- Remove the backup endpoint configuration from llama_chat.py
- Remove the backup endpoint configuration from llama_full_api.py
- Simplify the endpoint testing logic
- Update all documentation to drop the backup endpoint notes
- Focus on the three endpoints that actually exist:
  * https://llama.theaken.com/v1
  * https://llama.theaken.com/v1/gpt-oss-120b
  * https://llama.theaken.com/v1/deepseek-r1-671b

The program structure is cleaner with the phantom backup options removed.
2025-09-19 22:09:15 +08:00
3c0fba5fc8 Remove internal-network connections entirely; focus on the external API
Changes:
- Delete all internal-network program files
- Remove internal IP references (192.168.x.x)
- Slim the file structure down to the core programs
- Rename the main program to llama_chat.py
- Update all documentation to remove internal-network content
- Focus on external API connections and multi-endpoint support

Files kept:
- llama_chat.py (main program)
- llama_full_api.py (full version)
- quick_test.py (quick test)
- test_all_models.py (model tests)
- README.md, 操作指南.md (documentation)
2025-09-19 22:07:01 +08:00
e71495ece4 Refactor into an external-network version
Main changes:
- Remove all internal IPs (192.168.x.x)
- Switch to the external endpoint (https://llama.theaken.com)
- Add llama_external_api.py, dedicated to external connections
- Update all documentation for the external version
- Add automatic failover between backup endpoints
- Improve error handling and timeout settings
2025-09-19 22:04:10 +08:00
34fcf39fda Update README.md to a Chinese version
- Rewrite it as simple, approachable Chinese documentation
- Add a detailed usage tutorial
- Add a FAQ section
- Provide sample code
- Add emoji to liven up the content
2025-09-19 21:59:47 +08:00
14 changed files with 750 additions and 1026 deletions


@@ -8,11 +8,7 @@
"Bash(dir)",
"Bash(git init:*)",
"Bash(git add:*)",
"Bash(git commit:*)",
"Bash(git remote add:*)",
"Bash(git branch:*)",
"Bash(git push:*)",
"Bash(git pull:*)"
"Bash(git commit:*)"
],
"defaultMode": "acceptEdits"
}

README.md

@@ -1,201 +1,219 @@
# Llama API Client
# Llama AI Chat Program
A Python client for connecting to Llama AI models through OpenAI-compatible API endpoints.
A simple, easy-to-use Python program for connecting to and chatting with Llama AI models.
## Features
## 🌟 Key Features
- 🌐 Support for both internal network and external API endpoints
- 🤖 Multiple model support (GPT-OSS-120B, DeepSeek-R1-671B, Qwen3-Embedding-8B)
- 💬 Interactive chat interface with conversation history
- 🔄 Automatic endpoint testing and failover
- 🧹 Automatic response cleaning (removes thinking tags and special markers)
- 📝 Full conversation context management
- **Multi-endpoint support** - automatically detects available API endpoints
- 💬 **Interactive chat interface** - talk with the AI as in a chat app
- 🤖 **Multiple models** - GPT-OSS, DeepSeek, Qwen, and more
- 🔄 **Automatic endpoint selection** - tests endpoints and picks a working one
- 🧹 **Smart response cleaning** - strips the AI's thinking markers automatically
- 📝 **Conversation history management** - keeps the dialogue context coherent
## Quick Start
## 🚀 Quick Start
### Installation
### 1. Requirements
Make sure Python 3.7 or newer is installed.
```bash
# Clone the repository
git clone https://github.com/yourusername/llama-api-client.git
cd llama-api-client
# Install dependencies
pip install -r requirements.txt
# Install the required package
pip install openai
```
### Basic Usage
### 2. Download the Program
```bash
# Clone the project
git clone https://gitea.theaken.com/aken1023/pj_llama.git
cd pj_llama
# Or download the ZIP archive and extract it
```
### 3. Run the Chat Program
```bash
# 🌟 Run the main program (internal/external network selection)
python llama_universal.py
# Or run the external-network-only version
python llama_chat.py
# Or run the full version (multi-endpoint support)
python llama_full_api.py
```
## 📖 Usage
### Basic Conversation
After launching the program, you will see:
```
============================================================
Select Network Environment
============================================================
Available environments:
1. Internal network
   Description: company/school internal network, uses internal IP addresses
   Endpoints: 3
2. External network
   Description: public internet, uses the external domain
   Endpoints: 3
Select an environment (1-2), default 1:
```
Pick an endpoint and the chat begins:
```
You: Hello
AI: Hello! How can I help you?
You: What is 1+1?
AI: 1+1 equals 2.
```
### Chat Commands
The following commands are available during a conversation:
| Command | Function |
|-----|------|
| `exit` / `quit` | End the conversation |
| `clear` | Clear the conversation history and start over |
| `model` | Switch the AI model |
| `switch` | Switch network environment (internal/external) |
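A condensed sketch of how such a command loop can be wired up (modeled on the dispatch in `llama_universal.py` further below; the `respond` callback is a hypothetical stand-in for the actual API call, and the `model`-switching branch is omitted for brevity):

```python
def run_chat(respond):
    """Minimal command-dispatch loop; respond() maps the message history to a reply string."""
    messages = []
    while True:
        user_input = input("You: ").strip()
        if not user_input:
            continue
        if user_input.lower() in ("exit", "quit"):  # end the conversation
            return "exit"
        if user_input.lower() == "clear":           # start a fresh dialogue
            messages = []
            continue
        if user_input.lower() == "switch":          # tell the caller to re-run environment selection
            return "switch"
        messages.append({"role": "user", "content": user_input})
        reply = respond(messages)
        print(f"AI: {reply}")
        messages.append({"role": "assistant", "content": reply})
```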
## 🔧 Program Files
| File | Purpose |
|---------|---------|
| `llama_universal.py` | **🌟 Main program** - choose internal or external environment |
| `llama_chat.py` | External-network chat program |
| `llama_full_api.py` | Full-featured version with multi-endpoint switching |
| `quick_test.py` | Quick connectivity test |
| `test_all_models.py` | Tool that tests every model |
## 🌐 Available API Endpoints
### Internal endpoints (company/school network)
| Endpoint | Address | Models |
|-----|------|---------|
| Internal endpoint 1 | `http://192.168.0.6:21180/v1` | All models |
| Internal endpoint 2 | `http://192.168.0.6:21181/v1` | All models |
| Internal endpoint 3 | `http://192.168.0.6:21182/v1` | All models |
### External endpoints (public internet)
| Endpoint | Address | Models |
|-----|------|---------|
| General endpoint | `https://llama.theaken.com/v1` | All models |
| GPT-OSS dedicated | `https://llama.theaken.com/v1/gpt-oss-120b` | GPT-OSS-120B |
| DeepSeek dedicated | `https://llama.theaken.com/v1/deepseek-r1-671b` | DeepSeek-R1-671B |
## 🤖 Supported AI Models
1. **GPT-OSS-120B** - open-source GPT model, 120B parameters
2. **DeepSeek-R1-671B** - DeepSeek reasoning model, 671B parameters
3. **Qwen3-Embedding-8B** - Qwen embedding model, 8B parameters
## ❓ FAQ
### Issue: the program reports "unable to connect"
**Fixes:**
1. Try switching network environments (use the `switch` command)
2. Internal network: confirm you are on the company/school network
3. External network: confirm you can reach https://llama.theaken.com
4. Run `python quick_test.py` to test connectivity
### Issue: AI responses contain strange markers
**Explanation:**
Responses sometimes include markers such as `<think>` or `<|channel|>`. These are the AI's thinking process, and the program strips them automatically.
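Under the hood this is a couple of regex and string operations; a minimal sketch that mirrors the `clean_response` helper shipped with these programs:

```python
import re

def clean_response(text):
    """Strip thinking/channel markers from a model reply (mirrors the repo's clean_response)."""
    if not text:
        return text
    # Drop <think>...</think> blocks entirely
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    # For channel-marked output, keep only the final message segment
    if "<|channel|>" in text:
        parts = text.split("<|message|>")
        if len(parts) > 1:
            text = parts[-1]
    # Remove leftover start/end markers and surrounding whitespace
    return text.replace("<|end|>", "").replace("<|start|>", "").strip()
```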
### Issue: the conversation loses coherence
**Fix:**
Use the `clear` command to reset the conversation history and start a new dialogue.
## 📝 Minimal Example Code
To call the API from your own program, start from the following code:
```python
from openai import OpenAI

# Configure API
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1"

# Create client
client = OpenAI(api_key=API_KEY, base_url=BASE_URL)

# Send request
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=200
# Configure the connection (external network)
client = OpenAI(
    api_key="paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo=",
    base_url="https://llama.theaken.com/v1"
)
# Or configure the connection (internal network)
client = OpenAI(
    api_key="paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo=",
    base_url="http://192.168.0.6:21180/v1"
)

# Send a message
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Hello"}
    ]
)

# Print the response
print(response.choices[0].message.content)
```
### Run Interactive Chat
## 🛠️ Advanced Settings
```bash
# Full-featured chat with all endpoints
python llama_full_api.py
### Changing the API Key
# Internal network only
python llama_chat.py
To use a different API key, edit this line in the program:
# Quick test
python quick_test.py
```
## Available Endpoints
### Internal Network (Tested & Working ✅)
| Endpoint | URL | Status |
|----------|-----|--------|
| Internal 1 | `http://192.168.0.6:21180/v1` | ✅ Working |
| Internal 2 | `http://192.168.0.6:21181/v1` | ✅ Working |
| Internal 3 | `http://192.168.0.6:21182/v1` | ✅ Working |
| Internal 4 | `http://192.168.0.6:21183/v1` | ❌ Error 500 |
### External Network
| Endpoint | URL | Status |
|----------|-----|--------|
| GPT-OSS | `https://llama.theaken.com/v1/gpt-oss-120b` | 🔄 Pending |
| DeepSeek | `https://llama.theaken.com/v1/deepseek-r1-671b` | 🔄 Pending |
| General | `https://llama.theaken.com/v1` | 🔄 Pending |
## Project Structure
```
llama-api-client/
├── README.md # This file
├── requirements.txt # Python dependencies
├── 操作指南.md # Chinese operation guide
├── llama_full_api.py # Full-featured chat client
├── llama_chat.py # Internal network chat client
├── local_api_test.py # Endpoint testing tool
├── quick_test.py # Quick connection test
├── test_all_models.py # Model testing script
└── demo_chat.py # Demo chat with fallback
```
## Chat Commands
During chat sessions, you can use these commands:
- `exit` or `quit` - End the conversation
- `clear` - Clear conversation history
- `model` - Switch between available models
## Configuration
### API Key
```python
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
API_KEY = "你的新金鑰"
```
### Available Models
- `gpt-oss-120b` - GPT Open Source 120B parameters
- `deepseek-r1-671b` - DeepSeek R1 671B parameters
- `qwen3-embedding-8b` - Qwen3 Embedding 8B parameters
### Adding an Endpoint
## Troubleshooting
`llama_full_api.py` 中的 `ENDPOINTS` 加入新端點:
### Issue: 502 Bad Gateway
**Cause**: External API server is offline
**Solution**: Use internal network endpoints
### Issue: Connection Error
**Cause**: Not on internal network or incorrect IP
**Solution**:
1. Verify network connectivity: `ping 192.168.0.6`
2. Check firewall settings
3. Ensure you're on the same network
### Issue: Encoding Error
**Cause**: Windows terminal encoding issues
**Solution**: Use English for conversations or modify terminal encoding
### Issue: Response Contains Special Markers
**Description**: Responses may contain `<think>`, `<|channel|>` tags
**Solution**: The client automatically removes these markers
## Response Cleaning
The client automatically removes these special markers from AI responses:
- `<think>...</think>` - Thinking process
- `<|channel|>...<|message|>` - Channel markers
- `<|end|>`, `<|start|>` - End/start markers
## Requirements
- Python 3.7+
- openai>=1.0.0
- requests (optional, for direct API calls)
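A matching `requirements.txt` (one is listed in the project structure above) might simply be:

```
openai>=1.0.0
requests  # optional, for direct API calls
```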
## Development
### Testing Connection
```python
python -c "from openai import OpenAI; client = OpenAI(api_key='YOUR_KEY', base_url='YOUR_URL'); print(client.chat.completions.create(model='gpt-oss-120b', messages=[{'role': 'user', 'content': 'test'}], max_tokens=5).choices[0].message.content)"
```
### Adding New Endpoints
Edit `ENDPOINTS` dictionary in `llama_full_api.py`:
```python
ENDPOINTS = {
    "internal": [
ENDPOINTS = [
    {
        "name": "New Endpoint",
        "url": "http://new-endpoint/v1",
        "name": "New endpoint",
        "url": "https://new-address/v1",
        "models": ["gpt-oss-120b"]
    }
]
}
]
```
## License
## 📄 License
MIT License - See LICENSE file for details
This project is released under the MIT License; you are free to use, modify, and distribute it.
## Contributing
## 🤝 Reporting Issues
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
If you run into problems or have suggestions, feel free to open an issue on Gitea:
https://gitea.theaken.com/aken1023/pj_llama/issues
## Support
## 📊 Test Status
For issues or questions:
1. Check the [操作指南.md](操作指南.md) for detailed Chinese documentation
2. Open an issue on GitHub
3. Contact the API administrator for server-related issues
## Acknowledgments
- Built with OpenAI Python SDK
- Compatible with OpenAI API format
- Supports multiple Llama model variants
Last tested: 2025-09-19
- 📡 API endpoints: being tested
- 🔄 Automatic selection: the program picks a working endpoint automatically
---
**Last Updated**: 2025-09-19
**Version**: 1.0.0
**Status**: Internal endpoints working, external endpoints pending
**Version**: 1.0.0
**Author**: Aken
**Project URL**: https://gitea.theaken.com/aken1023/pj_llama


@@ -1,124 +0,0 @@
"""
Llama API 對話程式 (示範版本)
當 API 伺服器恢復後,可以使用此程式進行對話
"""
from openai import OpenAI
import time
# API 設定
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "https://llama.theaken.com/v1"
def simulate_chat():
"""模擬對話功能(用於展示)"""
print("\n" + "="*50)
print("Llama AI 對話系統 - 示範模式")
print("="*50)
print("\n[注意] API 伺服器目前離線,以下為模擬對話")
print("當伺服器恢復後,將自動連接真實 API\n")
# 模擬回應
demo_responses = [
"你好!我是 Llama AI 助手,很高興為你服務。",
"這是一個示範回應。當 API 伺服器恢復後,你將收到真實的 AI 回應。",
"我可以回答問題、協助編程、翻譯文字等多種任務。",
"請問有什麼我可以幫助你的嗎?"
]
response_index = 0
print("輸入 'exit' 結束對話\n")
while True:
user_input = input("你: ").strip()
if user_input.lower() in ['exit', 'quit']:
print("\n再見!")
break
if not user_input:
continue
# 模擬思考時間
print("\nAI 思考中", end="")
for _ in range(3):
time.sleep(0.3)
print(".", end="", flush=True)
print()
# 顯示模擬回應
print(f"\nAI: {demo_responses[response_index % len(demo_responses)]}")
response_index += 1
def real_chat():
"""實際對話功能(當 API 可用時)"""
client = OpenAI(api_key=API_KEY, base_url=BASE_URL)
print("\n" + "="*50)
print("Llama AI 對話系統")
print("="*50)
print("\n已連接到 Llama API")
print("輸入 'exit' 結束對話\n")
messages = []
while True:
user_input = input("你: ").strip()
if user_input.lower() in ['exit', 'quit']:
print("\n再見!")
break
if not user_input:
continue
messages.append({"role": "user", "content": user_input})
try:
print("\nAI 思考中...")
response = client.chat.completions.create(
model="gpt-oss-120b",
messages=messages,
temperature=0.7,
max_tokens=1000
)
ai_response = response.choices[0].message.content
print(f"\nAI: {ai_response}")
messages.append({"role": "assistant", "content": ai_response})
except Exception as e:
print(f"\n[錯誤] {str(e)[:100]}")
print("無法取得回應,請稍後再試")
def main():
print("檢查 API 連接狀態...")
# 嘗試連接 API
try:
client = OpenAI(api_key=API_KEY, base_url=BASE_URL)
# 快速測試
response = client.chat.completions.create(
model="gpt-oss-120b",
messages=[{"role": "user", "content": "test"}],
max_tokens=10,
timeout=5
)
print("[成功] API 已連接")
real_chat()
except Exception as e:
error_msg = str(e)
if "502" in error_msg or "Bad gateway" in error_msg:
print("[提示] API 伺服器目前離線 (502 錯誤)")
print("進入示範模式...")
simulate_chat()
else:
print(f"[錯誤] 無法連接: {error_msg[:100]}")
print("\n是否要進入示範模式? (y/n): ", end="")
if input().lower() == 'y':
simulate_chat()
if __name__ == "__main__":
main()


@@ -1,34 +1,45 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Llama internal-network API chat program
Supports multiple endpoints and model selection
Llama API external-connection program
Chats with the AI over the external endpoints
"""
from openai import OpenAI
import requests
import sys
import re
from datetime import datetime

# API configuration
# API key
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="

# Available endpoints (the first 3 are tested and working)
# External API endpoint configuration
ENDPOINTS = [
    "http://192.168.0.6:21180/v1",
    "http://192.168.0.6:21181/v1",
    "http://192.168.0.6:21182/v1",
    "http://192.168.0.6:21183/v1"
    {
        "name": "Llama general endpoint",
        "url": "https://llama.theaken.com/v1",
        "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
    },
    {
        "name": "GPT-OSS dedicated endpoint",
        "url": "https://llama.theaken.com/v1/gpt-oss-120b",
        "models": ["gpt-oss-120b"]
    },
    {
        "name": "DeepSeek dedicated endpoint",
        "url": "https://llama.theaken.com/v1/deepseek-r1-671b",
        "models": ["deepseek-r1-671b"]
    }
]

# Model list
MODELS = [
    "gpt-oss-120b",
    "deepseek-r1-671b",
    "qwen3-embedding-8b"
]

# External API endpoint configuration (only endpoints that actually exist)
def clean_response(text):
    """Strip special markers from the AI response"""
    if not text:
        return text
    # Remove thinking markers
    if "<think>" in text:
        text = re.sub(r'<think>.*?</think>', '', text, flags=re.DOTALL)
@@ -47,34 +58,94 @@ def clean_response(text):
    return text
def test_endpoint(endpoint):
def test_endpoint(endpoint_info, timeout=10):
    """Check whether an endpoint is usable"""
    url = endpoint_info["url"]
    model = endpoint_info["models"][0] if endpoint_info["models"] else "gpt-oss-120b"
    print(f" Testing {endpoint_info['name']}...", end="", flush=True)
    try:
        client = OpenAI(api_key=API_KEY, base_url=endpoint)
        response = client.chat.completions.create(
            model="gpt-oss-120b",
            messages=[{"role": "user", "content": "Hi"}],
            max_tokens=10,
            timeout=5
        # Handle the model-specific endpoint URLs
        if url.endswith("/gpt-oss-120b") or url.endswith("/deepseek-r1-671b"):
            base_url = url.rsplit("/", 1)[0]
        else:
            base_url = url
        client = OpenAI(
            api_key=API_KEY,
            base_url=base_url,
            timeout=timeout
        )
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "test"}],
            max_tokens=5
        )
        print(" ✓ available")
        return True
    except:
    except Exception as e:
        error_msg = str(e)
        if "502" in error_msg:
            print(" ✗ server temporarily unavailable (502)")
        elif "timeout" in error_msg.lower():
            print(" ✗ connection timed out")
        elif "connection" in error_msg.lower():
            print(" ✗ unable to connect")
        else:
            print(f" ✗ error")
        return False

def chat_session(endpoint, model):
def find_working_endpoint():
    """Find a usable endpoint"""
    print("\nTesting available endpoints...")
    print("-" * 50)
    for endpoint in ENDPOINTS:
        if test_endpoint(endpoint):
            return endpoint
    return None

def chat_session(endpoint_info):
    """Main chat loop"""
    print("\n" + "="*60)
    print("Llama AI Chat System")
    print("="*60)
    print(f"Endpoint: {endpoint}")
    print(f"Model: {model}")
    print(f"Using endpoint: {endpoint_info['name']}")
    print(f"URL: {endpoint_info['url']}")
    print(f"Available models: {', '.join(endpoint_info['models'])}")
    print("\nCommands:")
    print(" exit/quit - end the conversation")
    print(" clear - clear the conversation history")
    print(" model - switch models")
    print("-"*60)
    client = OpenAI(api_key=API_KEY, base_url=endpoint)
    # Normalize the URL
    url = endpoint_info["url"]
    if url.endswith("/gpt-oss-120b") or url.endswith("/deepseek-r1-671b"):
        base_url = url.rsplit("/", 1)[0]
    else:
        base_url = url
    client = OpenAI(api_key=API_KEY, base_url=base_url)
    # Choose a model
    if len(endpoint_info['models']) == 1:
        current_model = endpoint_info['models'][0]
    else:
        print("\nChoose a model:")
        for i, model in enumerate(endpoint_info['models'], 1):
            print(f" {i}. {model}")
        choice = input("Select (default: 1): ").strip()
        if choice.isdigit() and 1 <= int(choice) <= len(endpoint_info['models']):
            current_model = endpoint_info['models'][int(choice)-1]
        else:
            current_model = endpoint_info['models'][0]
    print(f"\nUsing model: {current_model}")
    messages = []
    while True:
@@ -94,13 +165,16 @@ def chat_session(endpoint, model):
            continue
        if user_input.lower() == 'model':
            if len(endpoint_info['models']) == 1:
                print(f"[System] This endpoint only supports {endpoint_info['models'][0]}")
            else:
                print("\nAvailable models:")
                for i, m in enumerate(MODELS, 1):
                for i, m in enumerate(endpoint_info['models'], 1):
                    print(f" {i}. {m}")
                choice = input("Select (1-3): ").strip()
                if choice in ['1', '2', '3']:
                    model = MODELS[int(choice)-1]
                    print(f"[System] Switched to {model}")
                choice = input("Select: ").strip()
                if choice.isdigit() and 1 <= int(choice) <= len(endpoint_info['models']):
                    current_model = endpoint_info['models'][int(choice)-1]
                    print(f"[System] Switched to {current_model}")
            continue
        messages.append({"role": "user", "content": user_input})
@@ -109,7 +183,7 @@ def chat_session(endpoint, model):
        try:
            response = client.chat.completions.create(
                model=model,
                model=current_model,
                messages=messages,
                temperature=0.7,
                max_tokens=1000
@@ -118,17 +192,14 @@ def chat_session(endpoint, model):
            ai_response = response.choices[0].message.content
            ai_response = clean_response(ai_response)
            print("\r" + " "*20 + "\r", end="")  # clear the "thinking..." indicator
            print("\r" + " "*20 + "\r", end="")
            print(f"AI: {ai_response}")
            messages.append({"role": "assistant", "content": ai_response})
        except UnicodeEncodeError:
            print("\r[Error] Encoding problem; please chat in English")
            messages.pop()  # drop the last user message
        except Exception as e:
            print(f"\r[Error] {str(e)[:100]}")
            messages.pop()  # drop the last user message
            messages.pop()
    except KeyboardInterrupt:
        print("\n\n[Interrupted] Use the exit command to quit normally")
@@ -139,52 +210,33 @@ def chat_session(endpoint, model):
def main():
    print("="*60)
    print("Llama internal-network API chat program")
    print("Llama AI external-network chat program")
    print(f"Time: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("="*60)
    # Test the endpoints
    print("\nChecking available endpoints...")
    available = []
    for i, endpoint in enumerate(ENDPOINTS[:3], 1):  # only test the first 3
        print(f" Testing {endpoint}...", end="", flush=True)
        if test_endpoint(endpoint):
            print(" [OK]")
            available.append(endpoint)
        else:
            print(" [FAIL]")
    # Find a working endpoint
    working_endpoint = find_working_endpoint()
    if not available:
        print("\n[Error] No endpoints available")
    if not working_endpoint:
        print("\n" + "="*60)
        print("Error: could not connect to any API endpoint")
        print("="*60)
        print("\nPossible causes:")
        print("1. The API server is temporarily offline")
        print("2. Network connectivity problems")
        print("3. Firewall or proxy settings")
        print("\nSuggestions:")
        print("1. Try again later (in 10-30 minutes)")
        print("2. Check the network connection")
        print("3. Confirm you can reach https://llama.theaken.com")
        sys.exit(1)
    # Choose an endpoint
    if len(available) == 1:
        selected_endpoint = available[0]
        print(f"\nUsing endpoint: {selected_endpoint}")
    else:
        print(f"\nFound {len(available)} available endpoints:")
        for i, ep in enumerate(available, 1):
            print(f" {i}. {ep}")
        print("\nSelect an endpoint (default: 1): ", end="")
        choice = input().strip()
        if choice and choice.isdigit() and 1 <= int(choice) <= len(available):
            selected_endpoint = available[int(choice)-1]
        else:
            selected_endpoint = available[0]
    # Choose a model
    print("\nAvailable models:")
    for i, model in enumerate(MODELS, 1):
        print(f" {i}. {model}")
    print("\nSelect a model (default: 1): ", end="")
    choice = input().strip()
    if choice in ['1', '2', '3']:
        selected_model = MODELS[int(choice)-1]
    else:
        selected_model = MODELS[0]
    print("\n" + "="*60)
    print(f"Connected to: {working_endpoint['name']}")
    print("="*60)
    # Start the chat
    chat_session(selected_endpoint, selected_model)
    chat_session(working_endpoint)

if __name__ == "__main__":
    try:
@@ -193,4 +245,6 @@ if __name__ == "__main__":
print("\n\n程式已退出")
except Exception as e:
print(f"\n[錯誤] {e}")
import traceback
traceback.print_exc()
sys.exit(1)


@@ -16,40 +16,23 @@ API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
# API endpoint configuration
ENDPOINTS = {
    "internal": [
    "primary": [
        {
            "name": "Internal endpoint 1 (21180)",
            "url": "http://192.168.0.6:21180/v1",
            "name": "Llama general endpoint",
            "url": "https://llama.theaken.com/v1",
            "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
        },
        {
            "name": "Internal endpoint 2 (21181)",
            "url": "http://192.168.0.6:21181/v1",
            "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
        },
        {
            "name": "Internal endpoint 3 (21182)",
            "url": "http://192.168.0.6:21182/v1",
            "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
        }
    ],
    "external": [
        {
            "name": "External GPT-OSS-120B",
            "name": "GPT-OSS dedicated endpoint",
            "url": "https://llama.theaken.com/v1/gpt-oss-120b",
            "models": ["gpt-oss-120b"]
        },
        {
            "name": "External DeepSeek-R1-671B",
            "name": "DeepSeek dedicated endpoint",
            "url": "https://llama.theaken.com/v1/deepseek-r1-671b",
            "models": ["deepseek-r1-671b"]
        },
        {
            "name": "External general endpoint",
            "url": "https://llama.theaken.com/v1",
            "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
        }
    ],
}
def clean_response(text):
@@ -121,23 +104,13 @@ def test_all_endpoints():
    available_endpoints = []
    # Test internal endpoints
    print("\n[Internal endpoint tests]")
    for endpoint in ENDPOINTS["internal"]:
    # Test all endpoints
    print("\n[Endpoint tests]")
    for endpoint in ENDPOINTS["primary"]:
        print(f" Testing {endpoint['name']}...", end="", flush=True)
        if test_endpoint(endpoint):
            print(" [OK]")
            available_endpoints.append(("internal", endpoint))
        else:
            print(" [FAIL]")
    # Test external endpoints
    print("\n[External endpoint tests]")
    for endpoint in ENDPOINTS["external"]:
        print(f" Testing {endpoint['name']}...", end="", flush=True)
        if test_endpoint(endpoint):
            print(" [OK]")
            available_endpoints.append(("external", endpoint))
            available_endpoints.append(("primary", endpoint))
        else:
            print(" [FAIL]")

@@ -1,99 +0,0 @@
from openai import OpenAI
import sys

API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "https://llama.theaken.com/v1"

AVAILABLE_MODELS = [
    "gpt-oss-120b",
    "deepseek-r1-671b",
    "qwen3-embedding-8b"
]

def chat_with_llama(model_name="gpt-oss-120b"):
    client = OpenAI(
        api_key=API_KEY,
        base_url=BASE_URL
    )
    print(f"\nUsing model: {model_name}")
    print("-" * 50)
    print("Type 'exit' or 'quit' to end the conversation")
    print("-" * 50)
    messages = []
    while True:
        user_input = input("\nYou: ").strip()
        if user_input.lower() in ['exit', 'quit']:
            print("Conversation ended")
            break
        if not user_input:
            continue
        messages.append({"role": "user", "content": user_input})
        try:
            response = client.chat.completions.create(
                model=model_name,
                messages=messages,
                temperature=0.7,
                max_tokens=2000
            )
            assistant_reply = response.choices[0].message.content
            print(f"\nAI: {assistant_reply}")
            messages.append({"role": "assistant", "content": assistant_reply})
        except Exception as e:
            print(f"\nError: {str(e)}")
            print("Please check the network connection and API settings")

def test_connection():
    print("Testing the connection to the Llama API...")
    client = OpenAI(
        api_key=API_KEY,
        base_url=BASE_URL
    )
    try:
        response = client.chat.completions.create(
            model="gpt-oss-120b",
            messages=[{"role": "user", "content": "Hello, this is a test message."}],
            max_tokens=50
        )
        print("[OK] Connection succeeded!")
        print(f"Test response: {response.choices[0].message.content}")
        return True
    except Exception as e:
        print(f"[ERROR] Connection failed: {str(e)[:200]}")
        return False

def main():
    print("=" * 50)
    print("Llama model chat test program")
    print("=" * 50)
    print("\nAvailable models:")
    for i, model in enumerate(AVAILABLE_MODELS, 1):
        print(f" {i}. {model}")
    if test_connection():
        print("\nSelect a model to use (enter 1-3, default: 1):")
        choice = input().strip()
        if choice == "2":
            model = AVAILABLE_MODELS[1]
        elif choice == "3":
            model = AVAILABLE_MODELS[2]
        else:
            model = AVAILABLE_MODELS[0]
        chat_with_llama(model)

if __name__ == "__main__":
    main()

llama_universal.py

@@ -0,0 +1,337 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Llama API universal chat program
Supports choosing between internal and external endpoints
"""
from openai import OpenAI
import requests
import sys
import re
from datetime import datetime

# API key
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="

# Network environment configuration
NETWORK_CONFIG = {
    "internal": {
        "name": "Internal network",
        "description": "Company/school internal network, uses internal IP addresses",
        "endpoints": [
            {
                "name": "Internal endpoint 1",
                "url": "http://192.168.0.6:21180/v1",
                "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
            },
            {
                "name": "Internal endpoint 2",
                "url": "http://192.168.0.6:21181/v1",
                "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
            },
            {
                "name": "Internal endpoint 3",
                "url": "http://192.168.0.6:21182/v1",
                "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
            }
        ]
    },
    "external": {
        "name": "External network",
        "description": "Public internet, uses the external domain",
        "endpoints": [
            {
                "name": "General endpoint",
                "url": "https://llama.theaken.com/v1",
                "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
            },
            {
                "name": "GPT-OSS dedicated endpoint",
                "url": "https://llama.theaken.com/v1/gpt-oss-120b",
                "models": ["gpt-oss-120b"]
            },
            {
                "name": "DeepSeek dedicated endpoint",
                "url": "https://llama.theaken.com/v1/deepseek-r1-671b",
                "models": ["deepseek-r1-671b"]
            }
        ]
    }
}

def clean_response(text):
    """Strip special markers from the AI response"""
    if not text:
        return text
    # Remove thinking markers
    if "<think>" in text:
        text = re.sub(r'<think>.*?</think>', '', text, flags=re.DOTALL)
    # Remove channel markers
    if "<|channel|>" in text:
        parts = text.split("<|message|>")
        if len(parts) > 1:
            text = parts[-1]
    # Remove start/end markers
    text = text.replace("<|end|>", "").replace("<|start|>", "")
    # Trim extra whitespace
    text = text.strip()
    return text

def test_endpoint(endpoint_info, timeout=8):
    """Check whether an endpoint is usable"""
    url = endpoint_info["url"]
    model = endpoint_info["models"][0] if endpoint_info["models"] else "gpt-oss-120b"
    print(f" Testing {endpoint_info['name']}...", end="", flush=True)
    try:
        # Handle the model-specific endpoint URLs
        if url.endswith("/gpt-oss-120b") or url.endswith("/deepseek-r1-671b"):
            base_url = url.rsplit("/", 1)[0]
        else:
            base_url = url
        client = OpenAI(
            api_key=API_KEY,
            base_url=base_url,
            timeout=timeout
        )
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "test"}],
            max_tokens=5
        )
        print(" [OK]")
        return True
    except Exception as e:
        error_msg = str(e)
        if "502" in error_msg:
            print(" [502 - server offline]")
        elif "timeout" in error_msg.lower():
            print(" [timeout]")
        elif "connection" in error_msg.lower():
            print(" [connection failed]")
        else:
            print(" [error]")
        return False

def choose_network_environment():
    """Choose the network environment"""
    print("\n" + "="*60)
    print("Select Network Environment")
    print("="*60)
    print("\nAvailable environments:")
    environments = list(NETWORK_CONFIG.keys())
    for i, env in enumerate(environments, 1):
        config = NETWORK_CONFIG[env]
        print(f" {i}. {config['name']}")
        print(f"    Description: {config['description']}")
        print(f"    Endpoints: {len(config['endpoints'])}")
    while True:
        try:
            choice = input(f"\nSelect an environment (1-{len(environments)}), default 1: ").strip()
            if not choice:
                choice = "1"
            choice_num = int(choice)
            if 1 <= choice_num <= len(environments):
                selected_env = environments[choice_num - 1]
                print(f"\nSelected: {NETWORK_CONFIG[selected_env]['name']}")
                return selected_env
            else:
                print(f"Please enter a number between 1 and {len(environments)}")
        except ValueError:
            print("Please enter a valid number")
        except KeyboardInterrupt:
            print("\n\nProgram cancelled")
            sys.exit(0)

def find_working_endpoint(environment):
    """Find a usable endpoint in the chosen environment"""
    config = NETWORK_CONFIG[environment]
    print(f"\nTesting {config['name']} endpoints...")
    print("-" * 50)
    for endpoint in config['endpoints']:
        if test_endpoint(endpoint):
            return endpoint
    return None

def chat_session(endpoint_info, environment):
    """Main chat loop"""
    print("\n" + "="*60)
    print("Llama AI Chat System")
    print("="*60)
    print(f"Network environment: {NETWORK_CONFIG[environment]['name']}")
    print(f"Using endpoint: {endpoint_info['name']}")
    print(f"URL: {endpoint_info['url']}")
    print(f"Available models: {', '.join(endpoint_info['models'])}")
    print("\nCommands:")
    print(" exit/quit - end the conversation")
    print(" clear - clear the conversation history")
    print(" model - switch models")
    print(" switch - switch network environment")
    print("-"*60)
    # Normalize the URL
    url = endpoint_info["url"]
    if url.endswith("/gpt-oss-120b") or url.endswith("/deepseek-r1-671b"):
        base_url = url.rsplit("/", 1)[0]
    else:
        base_url = url
    client = OpenAI(api_key=API_KEY, base_url=base_url)
    # Choose a model
    if len(endpoint_info['models']) == 1:
        current_model = endpoint_info['models'][0]
    else:
        print("\nChoose a model:")
        for i, model in enumerate(endpoint_info['models'], 1):
            print(f" {i}. {model}")
        choice = input("Select (default: 1): ").strip()
        if choice.isdigit() and 1 <= int(choice) <= len(endpoint_info['models']):
            current_model = endpoint_info['models'][int(choice)-1]
        else:
            current_model = endpoint_info['models'][0]
    print(f"\nUsing model: {current_model}")
    messages = []
    while True:
        try:
            user_input = input("\nYou: ").strip()
            if not user_input:
                continue
            if user_input.lower() in ['exit', 'quit']:
                print("Goodbye!")
                break
            if user_input.lower() == 'clear':
                messages = []
                print("[System] Conversation history cleared")
                continue
            if user_input.lower() == 'switch':
                print("[System] Restarting to switch the network environment")
                return "switch"
            if user_input.lower() == 'model':
                if len(endpoint_info['models']) == 1:
                    print(f"[System] This endpoint only supports {endpoint_info['models'][0]}")
                else:
                    print("\nAvailable models:")
                    for i, m in enumerate(endpoint_info['models'], 1):
                        print(f" {i}. {m}")
                    choice = input("Select: ").strip()
                    if choice.isdigit() and 1 <= int(choice) <= len(endpoint_info['models']):
                        current_model = endpoint_info['models'][int(choice)-1]
                        print(f"[System] Switched to {current_model}")
                continue
            messages.append({"role": "user", "content": user_input})
            print("\nAI thinking...", end="", flush=True)
            try:
                response = client.chat.completions.create(
                    model=current_model,
                    messages=messages,
                    temperature=0.7,
                    max_tokens=1000
                )
                ai_response = response.choices[0].message.content
                ai_response = clean_response(ai_response)
                print("\r" + " "*20 + "\r", end="")
                print(f"AI: {ai_response}")
                messages.append({"role": "assistant", "content": ai_response})
            except Exception as e:
                print(f"\r[Error] {str(e)[:100]}")
                messages.pop()
        except KeyboardInterrupt:
            print("\n\n[Interrupted] Use the exit command to quit normally")
            continue
        except EOFError:
            print("\nGoodbye!")
            break
    return "exit"

def main():
    while True:
        print("="*60)
        print("Llama AI Universal Chat Program")
        print(f"Time: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
        print("="*60)
        # Choose the network environment
        environment = choose_network_environment()
        # Find a working endpoint
        working_endpoint = find_working_endpoint(environment)
        if not working_endpoint:
            print("\n" + "="*60)
            print(f"Error: could not connect to any {NETWORK_CONFIG[environment]['name']} endpoint")
            print("="*60)
            print("\nPossible causes:")
            if environment == "internal":
                print("1. Not on the internal network, or the IP settings are wrong")
                print("2. The internal API service is not running")
                print("3. A firewall is blocking the connection")
                print("\nSuggestion: try the external environment")
            else:
                print("1. The external API server is temporarily offline")
                print("2. Network connectivity problems")
                print("3. Firewall or proxy settings")
                print("\nSuggestion: try the internal environment, or try again later")
            retry = input("\nChoose another environment? (y/n, default: y): ").strip().lower()
            if retry in ['n', 'no']:
                break
            continue
        print("\n" + "="*60)
        print(f"Connected to: {working_endpoint['name']}")
        print("="*60)
        # Start the chat
        result = chat_session(working_endpoint, environment)
        if result == "exit":
            break
        elif result == "switch":
            continue  # restart environment selection

if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print("\n\nThe program has exited")
    except Exception as e:
        print(f"\n[Error] {e}")
        import traceback
        traceback.print_exc()
        sys.exit(1)


@@ -1,243 +0,0 @@
"""
內網 Llama API 測試程式
使用 OpenAI 相容格式連接到本地 API 端點
"""
from openai import OpenAI
import requests
import json
from datetime import datetime
# API 配置
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
# 內網端點列表
LOCAL_ENDPOINTS = [
"http://192.168.0.6:21180/v1",
"http://192.168.0.6:21181/v1",
"http://192.168.0.6:21182/v1",
"http://192.168.0.6:21183/v1"
]
# 可用模型
MODELS = [
"gpt-oss-120b",
"deepseek-r1-671b",
"qwen3-embedding-8b"
]
def test_endpoint_with_requests(endpoint, model="gpt-oss-120b"):
"""使用 requests 測試端點"""
print(f"\n[使用 requests 測試]")
print(f"端點: {endpoint}")
print(f"模型: {model}")
headers = {
"Authorization": f"Bearer {API_KEY}",
"Content-Type": "application/json"
}
data = {
"model": model,
"messages": [
{"role": "user", "content": "Say 'Hello, I am working!' if you can see this."}
],
"temperature": 0.7,
"max_tokens": 50
}
try:
response = requests.post(
f"{endpoint}/chat/completions",
headers=headers,
json=data,
timeout=10
)
print(f"HTTP 狀態碼: {response.status_code}")
if response.status_code == 200:
result = response.json()
if 'choices' in result:
content = result['choices'][0]['message']['content']
print(f"[SUCCESS] AI 回應: {content}")
return True
else:
print("[ERROR] 回應格式不正確")
else:
print(f"[ERROR] HTTP {response.status_code}")
if response.status_code != 502: # 避免顯示 HTML 錯誤頁
print(f"詳情: {response.text[:200]}")
except requests.exceptions.ConnectTimeout:
print("[TIMEOUT] 連接超時")
except requests.exceptions.ConnectionError:
print("[CONNECTION ERROR] 無法連接到端點")
except Exception as e:
print(f"[ERROR] {str(e)[:100]}")
return False
def test_endpoint_with_openai(endpoint, model="gpt-oss-120b"):
"""使用 OpenAI SDK 測試端點"""
print(f"\n[使用 OpenAI SDK 測試]")
print(f"端點: {endpoint}")
print(f"模型: {model}")
try:
client = OpenAI(
api_key=API_KEY,
base_url=endpoint,
timeout=10.0
)
response = client.chat.completions.create(
model=model,
messages=[
{"role": "user", "content": "Hello, please respond with a simple greeting."}
],
temperature=0.7,
max_tokens=50
)
content = response.choices[0].message.content
print(f"[SUCCESS] AI 回應: {content}")
return True, client
except Exception as e:
error_str = str(e)
if "Connection error" in error_str:
print("[CONNECTION ERROR] 無法連接到端點")
elif "timeout" in error_str.lower():
print("[TIMEOUT] 請求超時")
elif "502" in error_str:
print("[ERROR] 502 Bad Gateway")
else:
print(f"[ERROR] {error_str[:100]}")
return False, None
def find_working_endpoint():
"""尋找可用的端點"""
print("="*60)
print(f"內網 API 端點測試 - {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print("="*60)
working_endpoints = []
for endpoint in LOCAL_ENDPOINTS:
print(f"\n測試端點: {endpoint}")
print("-"*40)
# 先用 requests 快速測試
if test_endpoint_with_requests(endpoint):
working_endpoints.append(endpoint)
print(f"[OK] 端點 {endpoint} 可用!")
else:
# 再用 OpenAI SDK 測試
success, _ = test_endpoint_with_openai(endpoint)
if success:
working_endpoints.append(endpoint)
print(f"[OK] 端點 {endpoint} 可用!")
return working_endpoints
def interactive_chat(endpoint, model="gpt-oss-120b"):
"""互動式對話"""
print(f"\n連接到: {endpoint}")
print(f"使用模型: {model}")
print("="*60)
print("開始對話 (輸入 'exit' 結束)")
print("="*60)
client = OpenAI(
api_key=API_KEY,
base_url=endpoint
)
messages = []
while True:
user_input = input("\n你: ").strip()
if user_input.lower() in ['exit', 'quit']:
print("對話結束")
break
if not user_input:
continue
messages.append({"role": "user", "content": user_input})
try:
print("\nAI 思考中...")
response = client.chat.completions.create(
model=model,
messages=messages,
temperature=0.7,
max_tokens=1000
)
ai_response = response.choices[0].message.content
print(f"\nAI: {ai_response}")
messages.append({"role": "assistant", "content": ai_response})
except Exception as e:
print(f"\n[ERROR] {str(e)[:100]}")
def main():
# 尋找可用端點
working_endpoints = find_working_endpoint()
print("\n" + "="*60)
print("測試結果總結")
print("="*60)
if working_endpoints:
print(f"\n找到 {len(working_endpoints)} 個可用端點:")
for i, endpoint in enumerate(working_endpoints, 1):
print(f" {i}. {endpoint}")
# 選擇端點
if len(working_endpoints) == 1:
selected_endpoint = working_endpoints[0]
print(f"\n自動選擇唯一可用端點: {selected_endpoint}")
else:
print(f"\n請選擇要使用的端點 (1-{len(working_endpoints)}):")
choice = input().strip()
try:
idx = int(choice) - 1
if 0 <= idx < len(working_endpoints):
selected_endpoint = working_endpoints[idx]
else:
selected_endpoint = working_endpoints[0]
except:
selected_endpoint = working_endpoints[0]
# 選擇模型
print("\n可用模型:")
for i, model in enumerate(MODELS, 1):
print(f" {i}. {model}")
print("\n請選擇模型 (1-3, 預設: 1):")
choice = input().strip()
if choice == "2":
selected_model = MODELS[1]
elif choice == "3":
selected_model = MODELS[2]
else:
selected_model = MODELS[0]
# 開始對話
interactive_chat(selected_endpoint, selected_model)
else:
print("\n[ERROR] 沒有找到可用的端點")
print("\n可能的原因:")
print("1. 內網 API 服務未啟動")
print("2. 防火牆阻擋了連接")
print("3. IP 地址或端口設定錯誤")
print("4. 不在同一個網路環境")
if __name__ == "__main__":
main()


@@ -1,21 +1,26 @@
"""
快速測試內網 Llama API
快速測試 Llama API 外網連接
"""
from openai import OpenAI
import sys
# API 設定
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1" # 使用第一個可用端點
BASE_URL = "https://llama.theaken.com/v1" # 使用外網端點
def quick_test():
print("連接到內網 API...")
print(f"端點: {BASE_URL}")
print("="*50)
print("Llama API 快速測試")
print("="*50)
print(f"連接到: {BASE_URL}")
print("-" * 50)
try:
client = OpenAI(
api_key=API_KEY,
base_url=BASE_URL
base_url=BASE_URL,
timeout=15.0 # 15秒超時
)
# 測試對話
@@ -45,10 +50,27 @@ def quick_test():
if "<|channel|>" in answer:
answer = answer.split("<|message|>")[-1].strip()
print(f"答: {answer}")
print(f"答: {answer[:200]}") # 限制顯示長度
except Exception as e:
print(f"錯誤: {str(e)[:100]}")
error_msg = str(e)
if "502" in error_msg:
print("錯誤: 伺服器暫時無法使用 (502)")
elif "timeout" in error_msg.lower():
print("錯誤: 請求超時")
else:
print(f"錯誤: {error_msg[:100]}")
print("\n" + "="*50)
print("測試完成!")
except Exception as e:
print(f"\n連接失敗: {str(e)[:100]}")
print("\n建議:")
print("1. 檢查網路連接")
print("2. 確認可以訪問 https://llama.theaken.com")
print("3. 稍後再試(如果是 502 錯誤)")
sys.exit(1)
if __name__ == "__main__":
quick_test()


@@ -1,46 +0,0 @@
import requests
import json

API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "https://llama.theaken.com/v1/chat/completions"

def test_api():
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }
    data = {
        "model": "gpt-oss-120b",
        "messages": [
            {"role": "user", "content": "Hello, can you respond?"}
        ],
        "temperature": 0.7,
        "max_tokens": 100
    }
    print("Testing the API connection...")
    print(f"URL: {BASE_URL}")
    print(f"Model: gpt-oss-120b")
    print("-" * 50)
    try:
        response = requests.post(BASE_URL, headers=headers, json=data, timeout=30)
        if response.status_code == 200:
            result = response.json()
            print("[Success] API response:")
            print(result['choices'][0]['message']['content'])
        else:
            print(f"[Error] HTTP {response.status_code}")
            print(f"Response body: {response.text[:500]}")
    except requests.exceptions.Timeout:
        print("[Error] Request timed out")
    except requests.exceptions.ConnectionError:
        print("[Error] Unable to reach the server")
    except Exception as e:
        print(f"[Error] {str(e)}")

if __name__ == "__main__":
    test_api()


@@ -1,111 +0,0 @@
import requests
import json
from datetime import datetime

# API configuration
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "https://llama.theaken.com/v1"

def test_endpoints():
    """Test the different API endpoints and models"""
    print("="*60)
    print(f"Llama API test - {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("="*60)
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }
    # Test configuration
    tests = [
        {
            "name": "GPT-OSS-120B",
            "model": "gpt-oss-120b",
            "prompt": "Say hello in one word"
        },
        {
            "name": "DeepSeek-R1-671B",
            "model": "deepseek-r1-671b",
            "prompt": "Say hello in one word"
        },
        {
            "name": "Qwen3-Embedding-8B",
            "model": "qwen3-embedding-8b",
            "prompt": "Say hello in one word"
        }
    ]
    success_count = 0
    for test in tests:
        print(f"\n[Testing {test['name']}]")
        print("-"*40)
        data = {
            "model": test["model"],
            "messages": [
                {"role": "user", "content": test["prompt"]}
            ],
            "temperature": 0.5,
            "max_tokens": 20
        }
        try:
            # Use a shorter timeout
            response = requests.post(
                f"{BASE_URL}/chat/completions",
                headers=headers,
                json=data,
                timeout=15
            )
            print(f"HTTP status: {response.status_code}")
            if response.status_code == 200:
                result = response.json()
                if 'choices' in result:
                    content = result['choices'][0]['message']['content']
                    print(f"[SUCCESS] Response: {content}")
                    success_count += 1
                else:
                    print("[ERROR] Unexpected response format")
            elif response.status_code == 502:
                print("[ERROR] 502 Bad Gateway - the server is not responding")
            elif response.status_code == 401:
                print("[ERROR] 401 Unauthorized - the API key may be wrong")
            elif response.status_code == 404:
                print("[ERROR] 404 Not Found - the model or endpoint does not exist")
            else:
                print(f"[ERROR] Error {response.status_code}")
                if not response.text.startswith('<!DOCTYPE'):
                    print(f"Details: {response.text[:200]}")
        except requests.exceptions.Timeout:
            print("[TIMEOUT] Request timed out (15 seconds)")
        except requests.exceptions.ConnectionError as e:
            print(f"[CONNECTION ERROR] Unable to reach the server")
        except Exception as e:
            print(f"[UNKNOWN ERROR]: {str(e)[:100]}")
    # Summary
    print("\n" + "="*60)
    print(f"Test results: {success_count}/{len(tests)} succeeded")
    if success_count == 0:
        print("\nDiagnostics:")
        print("• Network connection: OK (ping succeeds)")
        print("• API endpoint: https://llama.theaken.com/v1")
        print("• Error type: 502 Bad Gateway")
        print("• Likely cause: the backend API service is temporarily offline")
        print("\nSuggested actions:")
        print("1. Try again later (in 10-30 minutes)")
        print("2. Contact the API administrator to confirm service status")
        print("3. Check for maintenance announcements")
    else:
        print(f"\n[OK] The API service is working!")
        print(f"[OK] Usable models: {success_count}")

if __name__ == "__main__":
    test_endpoints()


@@ -1,33 +0,0 @@
===========================================
Llama Model Chat Test Program - Usage Notes
===========================================

Installation:
---------
1. Make sure Python 3.7 or newer is installed
2. Install the dependencies:
   pip install -r requirements.txt

Run the program:
---------
python llama_test.py

Features:
---------
1. On startup the program automatically tests the API connection
2. Choose a model to use (1-3)
3. Start chatting with the AI
4. Type 'exit' or 'quit' to end the conversation

Available models:
---------
1. gpt-oss-120b (default)
2. deepseek-r1-671b
3. qwen3-embedding-8b

Notes:
---------
- Make sure the network connection is working
- The API key is built into the program
- If you hit errors, check the network connection or contact the administrator

@@ -9,20 +9,12 @@ paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo=
### Available Endpoints

#### Internal endpoints (tested successfully)
| Endpoint | URL | Status | Models |
|---------|-----|------|---------|
| Internal endpoint 1 | http://192.168.0.6:21180/v1 | ✅ working | gpt-oss-120b, deepseek-r1-671b, qwen3-embedding-8b |
| Internal endpoint 2 | http://192.168.0.6:21181/v1 | ✅ working | gpt-oss-120b, deepseek-r1-671b, qwen3-embedding-8b |
| Internal endpoint 3 | http://192.168.0.6:21182/v1 | ✅ working | gpt-oss-120b, deepseek-r1-671b, qwen3-embedding-8b |
| Internal endpoint 4 | http://192.168.0.6:21183/v1 | ❌ error | 500 Internal Server Error |

#### External endpoints (pending test)
| Endpoint | URL | Status | Models |
|---------|-----|------|---------|
| GPT-OSS dedicated | https://llama.theaken.com/v1/gpt-oss-120b | pending | gpt-oss-120b |
| DeepSeek dedicated | https://llama.theaken.com/v1/deepseek-r1-671b | pending | deepseek-r1-671b |
| General endpoint | https://llama.theaken.com/v1 | pending | all models |
#### Available external endpoints
| Endpoint | URL | Models |
|---------|-----|---------|
| General endpoint | https://llama.theaken.com/v1 | gpt-oss-120b, deepseek-r1-671b, qwen3-embedding-8b |
| GPT-OSS dedicated | https://llama.theaken.com/v1/gpt-oss-120b | gpt-oss-120b |
| DeepSeek dedicated | https://llama.theaken.com/v1/deepseek-r1-671b | deepseek-r1-671b |

## 2. Quick Start
@@ -33,13 +25,13 @@ pip install openai
### 2. Test the Connection (Python)

#### Internal connection example
#### External connection example
```python
from openai import OpenAI

# Configure the API
API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
BASE_URL = "http://192.168.0.6:21180/v1"  # use internal endpoint 1
BASE_URL = "https://llama.theaken.com/v1"  # use the external endpoint

# Create the client
client = OpenAI(
@@ -64,19 +56,19 @@ print(response.choices[0].message.content)
## 3. Using the Bundled Programs

### Program list
1. **llama_full_api.py** - full chat program (internal + external)
2. **llama_chat.py** - internal-network chat program
3. **local_api_test.py** - endpoint testing tool
4. **quick_test.py** - quick test script
1. **llama_chat.py** - main chat program (smart connection)
2. **llama_full_api.py** - full chat program (multi-endpoint support)
3. **quick_test.py** - quick test script
4. **test_all_models.py** - model testing tool

### Run a chat program
```bash
# Run the main program (smart chat)
python llama_chat.py
# Run the full version (tests all endpoints automatically)
python llama_full_api.py
# Run the internal-network version
python llama_chat.py
# Quick test
python quick_test.py
```
@@ -97,15 +89,15 @@ python quick_test.py
## 5. Troubleshooting

### Problem 1: 502 Bad Gateway
**Cause**: the external API server is offline
**Fix**: use the internal endpoints
**Cause**: the API server is temporarily offline
**Fix**: try again later; the program tests every endpoint automatically

### Problem 2: Connection Error
**Cause**: not on the internal network, or wrong IP
**Cause**: network connectivity problems
**Fix**:
1. Confirm you are on the same network
2. Check the firewall settings
3. Test reachability with ping 192.168.0.6
1. Confirm the network connection is working
2. Check firewall or proxy settings
3. Confirm you can reach https://llama.theaken.com

### Problem 3: Encoding errors
**Cause**: Windows terminal encoding issues
@@ -124,14 +116,16 @@ python quick_test.py
## 7. Test Result Summary

### Successful tests
✅ Internal endpoints 1-3 all working
### Test status
📡 API endpoint connectivity being tested
✅ OpenAI SDK standard format supported
✅ Conversations work normally
✅ Automatic endpoint selection

### To confirm
- External endpoints are waiting for the server to recover
- The DeepSeek and Qwen models need further testing
### Supported features
- Automatic endpoint selection
- Smart timeout control
- Complete error handling
- Multi-model support (GPT-OSS, DeepSeek, Qwen)

## 8. Technical Details


@@ -1,14 +0,0 @@
Connect to the Llama models (AI) for conversation.
Connection details:

External connections:
https://llama.theaken.com/v1
https://llama.theaken.com/v1/gpt-oss-120b/
https://llama.theaken.com/v1/deepseek-r1-671b/

External model paths:
1. /gpt-oss-120b/
2. /deepseek-r1-671b/
3. /qwen3-embedding-8b/

Key: paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo=