完全移除內網連接，專注外網API

變更項目： - 刪除所有內網相關程式檔案 - 移除內網IP參考 (192.168.x.x) - 精簡檔案結構，只保留核心程式 - 重命名主程式為 llama_chat.py - 更新所有文檔移除內網內容 - 專注於外網 API 連接和多端點支援保留檔案： - llama_chat.py (主程式) - llama_full_api.py (完整版) - quick_test.py (快速測試) - test_all_models.py (模型測試) - README.md, 操作指南.md (文檔)
2025-09-19 22:07:01 +08:00
parent e71495ece4
commit 3c0fba5fc8
12 changed files with 181 additions and 1043 deletions
--- a/.claude/settings.local.json
+++ b/.claude/settings.local.json
@@ -12,7 +12,9 @@
      "Bash(git remote add:*)",
      "Bash(git branch:*)",
      "Bash(git push:*)",
-      "Bash(git pull:*)"
+      "Bash(git pull:*)",
      "Bash(rm:*)",
      "Bash(mv:*)"
    ],
    "defaultMode": "acceptEdits"
  }
--- a/README.md
+++ b/README.md
@@ -35,11 +35,11 @@ cd pj_llama
 ### 3. 執行對話程式
 ```bash
-# 執行主程式（自動選擇最佳連接）
+# 執行主程式（智慧對話）
 python llama_full_api.py
 # 或執行內網專用版本
 python llama_chat.py
 # 或執行完整版本（支援多端點）
 python llama_full_api.py
 ```
 ## 📖 使用說明
@@ -86,8 +86,8 @@ AI: 1+1等於2。
 | 檔案名稱 | 用途說明 |
 |---------|---------|
-| `llama_external_api.py` | **主程式** - 外網連接專用版本 |
+| `llama_chat.py` | **主程式** - 智慧對話程式 |
-| `llama_full_api.py` | 完整功能版本，支援所有端點 |
+| `llama_full_api.py` | 完整功能版本，支援多端點切換 |
 | `quick_test.py` | 快速測試連接是否正常 |
 | `test_all_models.py` | 測試所有模型的工具 |
--- a/demo_chat.py
+++ b/demo_chat.py
@@ -1,124 +0,0 @@
 """
 Llama API 對話程式 (示範版本)
 當 API 伺服器恢復後，可以使用此程式進行對話
 """
 from openai import OpenAI
 import time
 # API 設定
 API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
 BASE_URL = "https://llama.theaken.com/v1"
 def simulate_chat():
    """模擬對話功能（用於展示）"""
    print("\n" + "="*50)
    print("Llama AI 對話系統 - 示範模式")
    print("="*50)
    print("\n[注意] API 伺服器目前離線，以下為模擬對話")
    print("當伺服器恢復後，將自動連接真實 API\n")
    # 模擬回應
    demo_responses = [
        "你好！我是 Llama AI 助手，很高興為你服務。",
        "這是一個示範回應。當 API 伺服器恢復後，你將收到真實的 AI 回應。",
        "我可以回答問題、協助編程、翻譯文字等多種任務。",
        "請問有什麼我可以幫助你的嗎？"
    ]
    response_index = 0
    print("輸入 'exit' 結束對話\n")
    while True:
        user_input = input("你: ").strip()
        if user_input.lower() in ['exit', 'quit']:
            print("\n再見！")
            break
        if not user_input:
            continue
        # 模擬思考時間
        print("\nAI 思考中", end="")
        for _ in range(3):
            time.sleep(0.3)
            print(".", end="", flush=True)
        print()
        # 顯示模擬回應
        print(f"\nAI: {demo_responses[response_index % len(demo_responses)]}")
        response_index += 1
 def real_chat():
    """實際對話功能（當 API 可用時）"""
    client = OpenAI(api_key=API_KEY, base_url=BASE_URL)
    print("\n" + "="*50)
    print("Llama AI 對話系統")
    print("="*50)
    print("\n已連接到 Llama API")
    print("輸入 'exit' 結束對話\n")
    messages = []
    while True:
        user_input = input("你: ").strip()
        if user_input.lower() in ['exit', 'quit']:
            print("\n再見！")
            break
        if not user_input:
            continue
        messages.append({"role": "user", "content": user_input})
        try:
            print("\nAI 思考中...")
            response = client.chat.completions.create(
                model="gpt-oss-120b",
                messages=messages,
                temperature=0.7,
                max_tokens=1000
            )
            ai_response = response.choices[0].message.content
            print(f"\nAI: {ai_response}")
            messages.append({"role": "assistant", "content": ai_response})
        except Exception as e:
            print(f"\n[錯誤] {str(e)[:100]}")
            print("無法取得回應，請稍後再試")
 def main():
    print("檢查 API 連接狀態...")
    # 嘗試連接 API
    try:
        client = OpenAI(api_key=API_KEY, base_url=BASE_URL)
        # 快速測試
        response = client.chat.completions.create(
            model="gpt-oss-120b",
            messages=[{"role": "user", "content": "test"}],
            max_tokens=10,
            timeout=5
        )
        print("[成功] API 已連接")
        real_chat()
    except Exception as e:
        error_msg = str(e)
        if "502" in error_msg or "Bad gateway" in error_msg:
            print("[提示] API 伺服器目前離線 (502 錯誤)")
            print("進入示範模式...")
            simulate_chat()
        else:
            print(f"[錯誤] 無法連接: {error_msg[:100]}")
            print("\n是否要進入示範模式? (y/n): ", end="")
            if input().lower() == 'y':
                simulate_chat()
 if __name__ == "__main__":
    main()
--- a/llama_chat.py
+++ b/llama_chat.py
@@ -1,34 +1,57 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """
-Llama 內網 API 對話程式
+Llama API 外網連接程式
-支援多個端點和模型選擇
+使用外網端點進行 AI 對話
 """
 from openai import OpenAI
 import requests
 import sys
 import re
 from datetime import datetime
-# API 配置
+# API 金鑰
 API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
-# 可用端點 (前 3 個已測試可用)
+# 外網 API 端點配置
 ENDPOINTS = [
-    "http://192.168.0.6:21180/v1",
+    {
-    "http://192.168.0.6:21181/v1", 
+        "name": "Llama 通用端點",
-    "http://192.168.0.6:21182/v1",
+        "url": "https://llama.theaken.com/v1",
-    "http://192.168.0.6:21183/v1"
+        "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
    },
    {
        "name": "GPT-OSS 專用端點",
        "url": "https://llama.theaken.com/v1/gpt-oss-120b",
        "models": ["gpt-oss-120b"]
    },
    {
        "name": "DeepSeek 專用端點",
        "url": "https://llama.theaken.com/v1/deepseek-r1-671b",
        "models": ["deepseek-r1-671b"]
    }
 ]
-# 模型列表
+# 備用外網端點（如果主要端點無法使用）
-MODELS = [
+BACKUP_ENDPOINTS = [
-    "gpt-oss-120b",
+    {
-    "deepseek-r1-671b",
+        "name": "備用端點 1",
-    "qwen3-embedding-8b"
+        "url": "https://api.llama.theaken.com/v1",
        "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
    },
    {
        "name": "備用端點 2", 
        "url": "https://llama-api.theaken.com/v1",
        "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
    }
 ]
 def clean_response(text):
    """清理 AI 回應中的特殊標記"""
    if not text:
        return text
    # 移除思考標記
    if "<think>" in text:
        text = re.sub(r'<think>.*?</think>', '', text, flags=re.DOTALL)
@@ -47,34 +70,102 @@ def clean_response(text):
    return text
-def test_endpoint(endpoint):
+def test_endpoint(endpoint_info, timeout=10):
    """測試端點是否可用"""
    url = endpoint_info["url"]
    model = endpoint_info["models"][0] if endpoint_info["models"] else "gpt-oss-120b"
    print(f"  測試 {endpoint_info['name']}...", end="", flush=True)
    try:
-        client = OpenAI(api_key=API_KEY, base_url=endpoint)
+        # 處理特殊的模型端點 URL
-        response = client.chat.completions.create(
+        if url.endswith("/gpt-oss-120b") or url.endswith("/deepseek-r1-671b"):
-            model="gpt-oss-120b",
+            base_url = url.rsplit("/", 1)[0]
-            messages=[{"role": "user", "content": "Hi"}],
+        else:
-            max_tokens=10,
+            base_url = url
-            timeout=5
+            
        client = OpenAI(
            api_key=API_KEY, 
            base_url=base_url,
            timeout=timeout
        )
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "test"}],
            max_tokens=5
        )
        print(" ✓ 可用")
        return True
-    except:
+        
    except Exception as e:
        error_msg = str(e)
        if "502" in error_msg:
            print(" ✗ 伺服器暫時無法使用 (502)")
        elif "timeout" in error_msg.lower():
            print(" ✗ 連接超時")
        elif "connection" in error_msg.lower():
            print(" ✗ 無法連接")
        else:
            print(f" ✗ 錯誤")
        return False
-def chat_session(endpoint, model):
+def find_working_endpoint():
    """尋找可用的端點"""
    print("\n正在測試外網端點...")
    print("-" * 50)
    # 先測試主要端點
    print("主要端點：")
    for endpoint in ENDPOINTS:
        if test_endpoint(endpoint):
            return endpoint
    # 如果主要端點都不可用，測試備用端點
    print("\n備用端點：")
    for endpoint in BACKUP_ENDPOINTS:
        if test_endpoint(endpoint):
            return endpoint
    return None
 def chat_session(endpoint_info):
    """對話主程式"""
    print("\n" + "="*60)
    print("Llama AI 對話系統")
    print("="*60)
-    print(f"端點: {endpoint}")
+    print(f"使用端點: {endpoint_info['name']}")
-    print(f"模型: {model}")
+    print(f"URL: {endpoint_info['url']}")
    print(f"可用模型: {', '.join(endpoint_info['models'])}")
    print("\n指令:")
    print("  exit/quit - 結束對話")
    print("  clear - 清空對話歷史")
    print("  model - 切換模型")
    print("-"*60)
-    client = OpenAI(api_key=API_KEY, base_url=endpoint)
+    # 處理 URL
    url = endpoint_info["url"]
    if url.endswith("/gpt-oss-120b") or url.endswith("/deepseek-r1-671b"):
        base_url = url.rsplit("/", 1)[0]
    else:
        base_url = url
    client = OpenAI(api_key=API_KEY, base_url=base_url)
    # 選擇模型
    if len(endpoint_info['models']) == 1:
        current_model = endpoint_info['models'][0]
    else:
        print("\n選擇模型:")
        for i, model in enumerate(endpoint_info['models'], 1):
            print(f"  {i}. {model}")
        choice = input("選擇 (預設: 1): ").strip()
        if choice.isdigit() and 1 <= int(choice) <= len(endpoint_info['models']):
            current_model = endpoint_info['models'][int(choice)-1]
        else:
            current_model = endpoint_info['models'][0]
    print(f"\n使用模型: {current_model}")
    messages = []
    while True:
@@ -94,13 +185,16 @@ def chat_session(endpoint, model):
                continue
            if user_input.lower() == 'model':
-                print("\n可用模型:")
+                if len(endpoint_info['models']) == 1:
-                for i, m in enumerate(MODELS, 1):
+                    print(f"[系統] 此端點只支援 {endpoint_info['models'][0]}")
-                    print(f"  {i}. {m}")
+                else:
-                choice = input("選擇 (1-3): ").strip()
+                    print("\n可用模型:")
-                if choice in ['1', '2', '3']:
+                    for i, m in enumerate(endpoint_info['models'], 1):
-                    model = MODELS[int(choice)-1]
+                        print(f"  {i}. {m}")
-                    print(f"[系統] 已切換到 {model}")
+                    choice = input("選擇: ").strip()
                    if choice.isdigit() and 1 <= int(choice) <= len(endpoint_info['models']):
                        current_model = endpoint_info['models'][int(choice)-1]
                        print(f"[系統] 已切換到 {current_model}")
                continue
            messages.append({"role": "user", "content": user_input})
@@ -109,7 +203,7 @@ def chat_session(endpoint, model):
            try:
                response = client.chat.completions.create(
-                    model=model,
+                    model=current_model,
                    messages=messages,
                    temperature=0.7,
                    max_tokens=1000
@@ -118,17 +212,14 @@ def chat_session(endpoint, model):
                ai_response = response.choices[0].message.content
                ai_response = clean_response(ai_response)
-                print("\r" + " "*20 + "\r", end="")  # 清除 "思考中..."
+                print("\r" + " "*20 + "\r", end="")
                print(f"AI: {ai_response}")
                messages.append({"role": "assistant", "content": ai_response})
            except UnicodeEncodeError:
                print("\r[錯誤] 編碼問題，請使用英文對話")
                messages.pop()  # 移除最後的用戶訊息
            except Exception as e:
                print(f"\r[錯誤] {str(e)[:100]}")
-                messages.pop()  # 移除最後的用戶訊息
+                messages.pop()
        except KeyboardInterrupt:
            print("\n\n[中斷] 使用 exit 命令正常退出")
@@ -139,52 +230,33 @@ def chat_session(endpoint, model):
 def main():
    print("="*60)
-    print("Llama 內網 API 對話程式")
+    print("Llama AI 外網對話程式")
    print(f"時間: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("="*60)
-    # 測試端點
+    # 尋找可用端點
-    print("\n正在檢查可用端點...")
+    working_endpoint = find_working_endpoint()
    available = []
    for i, endpoint in enumerate(ENDPOINTS[:3], 1):  # 只測試前3個
        print(f"  測試 {endpoint}...", end="", flush=True)
        if test_endpoint(endpoint):
            print(" [OK]")
            available.append(endpoint)
        else:
            print(" [失敗]")
-    if not available:
+    if not working_endpoint:
-        print("\n[錯誤] 沒有可用的端點")
+        print("\n" + "="*60)
        print("錯誤：無法連接到任何外網端點")
        print("="*60)
        print("\n可能的原因：")
        print("1. 外網 API 伺服器暫時離線")
        print("2. 網路連接問題")
        print("3. 防火牆或代理設定")
        print("\n建議：")
        print("1. 稍後再試（10-30分鐘後）")
        print("2. 檢查網路連接")
        print("3. 聯繫 API 管理員")
        sys.exit(1)
-    # 選擇端點
+    print("\n" + "="*60)
-    if len(available) == 1:
+    print(f"成功連接到: {working_endpoint['name']}")
-        selected_endpoint = available[0]
+    print("="*60)
        print(f"\n使用端點: {selected_endpoint}")
    else:
        print(f"\n找到 {len(available)} 個可用端點:")
        for i, ep in enumerate(available, 1):
            print(f"  {i}. {ep}")
        print("\n選擇端點 (預設: 1): ", end="")
        choice = input().strip()
        if choice and choice.isdigit() and 1 <= int(choice) <= len(available):
            selected_endpoint = available[int(choice)-1]
        else:
            selected_endpoint = available[0]
    # 選擇模型
    print("\n可用模型:")
    for i, model in enumerate(MODELS, 1):
        print(f"  {i}. {model}")
    print("\n選擇模型 (預設: 1): ", end="")
    choice = input().strip()
    if choice in ['1', '2', '3']:
        selected_model = MODELS[int(choice)-1]
    else:
        selected_model = MODELS[0]
    # 開始對話
-    chat_session(selected_endpoint, selected_model)
+    chat_session(working_endpoint)
 if __name__ == "__main__":
    try:
@@ -193,4 +265,6 @@ if __name__ == "__main__":
        print("\n\n程式已退出")
    except Exception as e:
        print(f"\n[錯誤] {e}")
        import traceback
        traceback.print_exc()
        sys.exit(1)
--- a/llama_external_api.py
+++ b/llama_external_api.py
@@ -1,270 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """
 Llama API 外網連接程式
 使用外網端點進行 AI 對話
 """
 from openai import OpenAI
 import requests
 import sys
 import re
 from datetime import datetime
 # API 金鑰
 API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
 # 外網 API 端點配置
 ENDPOINTS = [
    {
        "name": "Llama 通用端點",
        "url": "https://llama.theaken.com/v1",
        "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
    },
    {
        "name": "GPT-OSS 專用端點",
        "url": "https://llama.theaken.com/v1/gpt-oss-120b",
        "models": ["gpt-oss-120b"]
    },
    {
        "name": "DeepSeek 專用端點",
        "url": "https://llama.theaken.com/v1/deepseek-r1-671b",
        "models": ["deepseek-r1-671b"]
    }
 ]
 # 備用外網端點（如果主要端點無法使用）
 BACKUP_ENDPOINTS = [
    {
        "name": "備用端點 1",
        "url": "https://api.llama.theaken.com/v1",
        "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
    },
    {
        "name": "備用端點 2", 
        "url": "https://llama-api.theaken.com/v1",
        "models": ["gpt-oss-120b", "deepseek-r1-671b", "qwen3-embedding-8b"]
    }
 ]
 def clean_response(text):
    """清理 AI 回應中的特殊標記"""
    if not text:
        return text
    # 移除思考標記
    if "<think>" in text:
        text = re.sub(r'<think>.*?</think>', '', text, flags=re.DOTALL)
    # 移除 channel 標記
    if "<|channel|>" in text:
        parts = text.split("<|message|>")
        if len(parts) > 1:
            text = parts[-1]
    # 移除結束標記
    text = text.replace("<|end|>", "").replace("<|start|>", "")
    # 清理多餘空白
    text = text.strip()
    return text
 def test_endpoint(endpoint_info, timeout=10):
    """測試端點是否可用"""
    url = endpoint_info["url"]
    model = endpoint_info["models"][0] if endpoint_info["models"] else "gpt-oss-120b"
    print(f"  測試 {endpoint_info['name']}...", end="", flush=True)
    try:
        # 處理特殊的模型端點 URL
        if url.endswith("/gpt-oss-120b") or url.endswith("/deepseek-r1-671b"):
            base_url = url.rsplit("/", 1)[0]
        else:
            base_url = url
        client = OpenAI(
            api_key=API_KEY, 
            base_url=base_url,
            timeout=timeout
        )
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "test"}],
            max_tokens=5
        )
        print(" ✓ 可用")
        return True
    except Exception as e:
        error_msg = str(e)
        if "502" in error_msg:
            print(" ✗ 伺服器暫時無法使用 (502)")
        elif "timeout" in error_msg.lower():
            print(" ✗ 連接超時")
        elif "connection" in error_msg.lower():
            print(" ✗ 無法連接")
        else:
            print(f" ✗ 錯誤")
        return False
 def find_working_endpoint():
    """尋找可用的端點"""
    print("\n正在測試外網端點...")
    print("-" * 50)
    # 先測試主要端點
    print("主要端點：")
    for endpoint in ENDPOINTS:
        if test_endpoint(endpoint):
            return endpoint
    # 如果主要端點都不可用，測試備用端點
    print("\n備用端點：")
    for endpoint in BACKUP_ENDPOINTS:
        if test_endpoint(endpoint):
            return endpoint
    return None
 def chat_session(endpoint_info):
    """對話主程式"""
    print("\n" + "="*60)
    print("Llama AI 對話系統")
    print("="*60)
    print(f"使用端點: {endpoint_info['name']}")
    print(f"URL: {endpoint_info['url']}")
    print(f"可用模型: {', '.join(endpoint_info['models'])}")
    print("\n指令:")
    print("  exit/quit - 結束對話")
    print("  clear - 清空對話歷史")
    print("  model - 切換模型")
    print("-"*60)
    # 處理 URL
    url = endpoint_info["url"]
    if url.endswith("/gpt-oss-120b") or url.endswith("/deepseek-r1-671b"):
        base_url = url.rsplit("/", 1)[0]
    else:
        base_url = url
    client = OpenAI(api_key=API_KEY, base_url=base_url)
    # 選擇模型
    if len(endpoint_info['models']) == 1:
        current_model = endpoint_info['models'][0]
    else:
        print("\n選擇模型:")
        for i, model in enumerate(endpoint_info['models'], 1):
            print(f"  {i}. {model}")
        choice = input("選擇 (預設: 1): ").strip()
        if choice.isdigit() and 1 <= int(choice) <= len(endpoint_info['models']):
            current_model = endpoint_info['models'][int(choice)-1]
        else:
            current_model = endpoint_info['models'][0]
    print(f"\n使用模型: {current_model}")
    messages = []
    while True:
        try:
            user_input = input("\n你: ").strip()
            if not user_input:
                continue
            if user_input.lower() in ['exit', 'quit']:
                print("再見！")
                break
            if user_input.lower() == 'clear':
                messages = []
                print("[系統] 對話歷史已清空")
                continue
            if user_input.lower() == 'model':
                if len(endpoint_info['models']) == 1:
                    print(f"[系統] 此端點只支援 {endpoint_info['models'][0]}")
                else:
                    print("\n可用模型:")
                    for i, m in enumerate(endpoint_info['models'], 1):
                        print(f"  {i}. {m}")
                    choice = input("選擇: ").strip()
                    if choice.isdigit() and 1 <= int(choice) <= len(endpoint_info['models']):
                        current_model = endpoint_info['models'][int(choice)-1]
                        print(f"[系統] 已切換到 {current_model}")
                continue
            messages.append({"role": "user", "content": user_input})
            print("\nAI 思考中...", end="", flush=True)
            try:
                response = client.chat.completions.create(
                    model=current_model,
                    messages=messages,
                    temperature=0.7,
                    max_tokens=1000
                )
                ai_response = response.choices[0].message.content
                ai_response = clean_response(ai_response)
                print("\r" + " "*20 + "\r", end="")
                print(f"AI: {ai_response}")
                messages.append({"role": "assistant", "content": ai_response})
            except Exception as e:
                print(f"\r[錯誤] {str(e)[:100]}")
                messages.pop()
        except KeyboardInterrupt:
            print("\n\n[中斷] 使用 exit 命令正常退出")
            continue
        except EOFError:
            print("\n再見！")
            break
 def main():
    print("="*60)
    print("Llama AI 外網對話程式")
    print(f"時間: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("="*60)
    # 尋找可用端點
    working_endpoint = find_working_endpoint()
    if not working_endpoint:
        print("\n" + "="*60)
        print("錯誤：無法連接到任何外網端點")
        print("="*60)
        print("\n可能的原因：")
        print("1. 外網 API 伺服器暫時離線")
        print("2. 網路連接問題")
        print("3. 防火牆或代理設定")
        print("\n建議：")
        print("1. 稍後再試（10-30分鐘後）")
        print("2. 檢查網路連接")
        print("3. 聯繫 API 管理員")
        sys.exit(1)
    print("\n" + "="*60)
    print(f"成功連接到: {working_endpoint['name']}")
    print("="*60)
    # 開始對話
    chat_session(working_endpoint)
 if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print("\n\n程式已退出")
    except Exception as e:
        print(f"\n[錯誤] {e}")
        import traceback
        traceback.print_exc()
        sys.exit(1)
--- a/llama_test.py
+++ b/llama_test.py
@@ -1,99 +0,0 @@
 from openai import OpenAI
 import sys
 API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
 BASE_URL = "https://llama.theaken.com/v1"
 AVAILABLE_MODELS = [
    "gpt-oss-120b",
    "deepseek-r1-671b",
    "qwen3-embedding-8b"
 ]
 def chat_with_llama(model_name="gpt-oss-120b"):
    client = OpenAI(
        api_key=API_KEY,
        base_url=BASE_URL
    )
    print(f"\n使用模型: {model_name}")
    print("-" * 50)
    print("輸入 'exit' 或 'quit' 來結束對話")
    print("-" * 50)
    messages = []
    while True:
        user_input = input("\n你: ").strip()
        if user_input.lower() in ['exit', 'quit']:
            print("對話結束")
            break
        if not user_input:
            continue
        messages.append({"role": "user", "content": user_input})
        try:
            response = client.chat.completions.create(
                model=model_name,
                messages=messages,
                temperature=0.7,
                max_tokens=2000
            )
            assistant_reply = response.choices[0].message.content
            print(f"\nAI: {assistant_reply}")
            messages.append({"role": "assistant", "content": assistant_reply})
        except Exception as e:
            print(f"\n錯誤: {str(e)}")
            print("請檢查網路連接和 API 設定")
 def test_connection():
    print("測試連接到 Llama API...")
    client = OpenAI(
        api_key=API_KEY,
        base_url=BASE_URL
    )
    try:
        response = client.chat.completions.create(
            model="gpt-oss-120b",
            messages=[{"role": "user", "content": "Hello, this is a test message."}],
            max_tokens=50
        )
        print("[OK] 連接成功!")
        print(f"測試回應: {response.choices[0].message.content}")
        return True
    except Exception as e:
        print(f"[ERROR] 連接失敗: {str(e)[:200]}")
        return False
 def main():
    print("=" * 50)
    print("Llama 模型對話測試程式")
    print("=" * 50)
    print("\n可用的模型:")
    for i, model in enumerate(AVAILABLE_MODELS, 1):
        print(f"  {i}. {model}")
    if test_connection():
        print("\n選擇要使用的模型 (輸入數字 1-3，預設: 1):")
        choice = input().strip()
        if choice == "2":
            model = AVAILABLE_MODELS[1]
        elif choice == "3":
            model = AVAILABLE_MODELS[2]
        else:
            model = AVAILABLE_MODELS[0]
        chat_with_llama(model)
 if __name__ == "__main__":
    main()
--- a/local_api_test.py
+++ b/local_api_test.py
@@ -1,243 +0,0 @@
 """
 內網 Llama API 測試程式
 使用 OpenAI 相容格式連接到本地 API 端點
 """
 from openai import OpenAI
 import requests
 import json
 from datetime import datetime
 # API 配置
 API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
 # 內網端點列表
 LOCAL_ENDPOINTS = [
    "http://192.168.0.6:21180/v1",
    "http://192.168.0.6:21181/v1",
    "http://192.168.0.6:21182/v1",
    "http://192.168.0.6:21183/v1"
 ]
 # 可用模型
 MODELS = [
    "gpt-oss-120b",
    "deepseek-r1-671b",
    "qwen3-embedding-8b"
 ]
 def test_endpoint_with_requests(endpoint, model="gpt-oss-120b"):
    """使用 requests 測試端點"""
    print(f"\n[使用 requests 測試]")
    print(f"端點: {endpoint}")
    print(f"模型: {model}")
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }
    data = {
        "model": model,
        "messages": [
            {"role": "user", "content": "Say 'Hello, I am working!' if you can see this."}
        ],
        "temperature": 0.7,
        "max_tokens": 50
    }
    try:
        response = requests.post(
            f"{endpoint}/chat/completions",
            headers=headers,
            json=data,
            timeout=10
        )
        print(f"HTTP 狀態碼: {response.status_code}")
        if response.status_code == 200:
            result = response.json()
            if 'choices' in result:
                content = result['choices'][0]['message']['content']
                print(f"[SUCCESS] AI 回應: {content}")
                return True
            else:
                print("[ERROR] 回應格式不正確")
        else:
            print(f"[ERROR] HTTP {response.status_code}")
            if response.status_code != 502:  # 避免顯示 HTML 錯誤頁
                print(f"詳情: {response.text[:200]}")
    except requests.exceptions.ConnectTimeout:
        print("[TIMEOUT] 連接超時")
    except requests.exceptions.ConnectionError:
        print("[CONNECTION ERROR] 無法連接到端點")
    except Exception as e:
        print(f"[ERROR] {str(e)[:100]}")
    return False
 def test_endpoint_with_openai(endpoint, model="gpt-oss-120b"):
    """使用 OpenAI SDK 測試端點"""
    print(f"\n[使用 OpenAI SDK 測試]")
    print(f"端點: {endpoint}")
    print(f"模型: {model}")
    try:
        client = OpenAI(
            api_key=API_KEY,
            base_url=endpoint,
            timeout=10.0
        )
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "user", "content": "Hello, please respond with a simple greeting."}
            ],
            temperature=0.7,
            max_tokens=50
        )
        content = response.choices[0].message.content
        print(f"[SUCCESS] AI 回應: {content}")
        return True, client
    except Exception as e:
        error_str = str(e)
        if "Connection error" in error_str:
            print("[CONNECTION ERROR] 無法連接到端點")
        elif "timeout" in error_str.lower():
            print("[TIMEOUT] 請求超時")
        elif "502" in error_str:
            print("[ERROR] 502 Bad Gateway")
        else:
            print(f"[ERROR] {error_str[:100]}")
    return False, None
 def find_working_endpoint():
    """尋找可用的端點"""
    print("="*60)
    print(f"內網 API 端點測試 - {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("="*60)
    working_endpoints = []
    for endpoint in LOCAL_ENDPOINTS:
        print(f"\n測試端點: {endpoint}")
        print("-"*40)
        # 先用 requests 快速測試
        if test_endpoint_with_requests(endpoint):
            working_endpoints.append(endpoint)
            print(f"[OK] 端點 {endpoint} 可用！")
        else:
            # 再用 OpenAI SDK 測試
            success, _ = test_endpoint_with_openai(endpoint)
            if success:
                working_endpoints.append(endpoint)
                print(f"[OK] 端點 {endpoint} 可用！")
    return working_endpoints
 def interactive_chat(endpoint, model="gpt-oss-120b"):
    """互動式對話"""
    print(f"\n連接到: {endpoint}")
    print(f"使用模型: {model}")
    print("="*60)
    print("開始對話 (輸入 'exit' 結束)")
    print("="*60)
    client = OpenAI(
        api_key=API_KEY,
        base_url=endpoint
    )
    messages = []
    while True:
        user_input = input("\n你: ").strip()
        if user_input.lower() in ['exit', 'quit']:
            print("對話結束")
            break
        if not user_input:
            continue
        messages.append({"role": "user", "content": user_input})
        try:
            print("\nAI 思考中...")
            response = client.chat.completions.create(
                model=model,
                messages=messages,
                temperature=0.7,
                max_tokens=1000
            )
            ai_response = response.choices[0].message.content
            print(f"\nAI: {ai_response}")
            messages.append({"role": "assistant", "content": ai_response})
        except Exception as e:
            print(f"\n[ERROR] {str(e)[:100]}")
 def main():
    # 尋找可用端點
    working_endpoints = find_working_endpoint()
    print("\n" + "="*60)
    print("測試結果總結")
    print("="*60)
    if working_endpoints:
        print(f"\n找到 {len(working_endpoints)} 個可用端點:")
        for i, endpoint in enumerate(working_endpoints, 1):
            print(f"  {i}. {endpoint}")
        # 選擇端點
        if len(working_endpoints) == 1:
            selected_endpoint = working_endpoints[0]
            print(f"\n自動選擇唯一可用端點: {selected_endpoint}")
        else:
            print(f"\n請選擇要使用的端點 (1-{len(working_endpoints)}):")
            choice = input().strip()
            try:
                idx = int(choice) - 1
                if 0 <= idx < len(working_endpoints):
                    selected_endpoint = working_endpoints[idx]
                else:
                    selected_endpoint = working_endpoints[0]
            except:
                selected_endpoint = working_endpoints[0]
        # 選擇模型
        print("\n可用模型:")
        for i, model in enumerate(MODELS, 1):
            print(f"  {i}. {model}")
        print("\n請選擇模型 (1-3, 預設: 1):")
        choice = input().strip()
        if choice == "2":
            selected_model = MODELS[1]
        elif choice == "3":
            selected_model = MODELS[2]
        else:
            selected_model = MODELS[0]
        # 開始對話
        interactive_chat(selected_endpoint, selected_model)
    else:
        print("\n[ERROR] 沒有找到可用的端點")
        print("\n可能的原因:")
        print("1. 內網 API 服務未啟動")
        print("2. 防火牆阻擋了連接")
        print("3. IP 地址或端口設定錯誤")
        print("4. 不在同一個網路環境")
 if __name__ == "__main__":
    main()
--- a/simple_llama_test.py
+++ b/simple_llama_test.py
@@ -1,46 +0,0 @@
 import requests
 import json
 API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
 BASE_URL = "https://llama.theaken.com/v1/chat/completions"
 def test_api():
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }
    data = {
        "model": "gpt-oss-120b",
        "messages": [
            {"role": "user", "content": "Hello, can you respond?"}
        ],
        "temperature": 0.7,
        "max_tokens": 100
    }
    print("正在測試 API 連接...")
    print(f"URL: {BASE_URL}")
    print(f"Model: gpt-oss-120b")
    print("-" * 50)
    try:
        response = requests.post(BASE_URL, headers=headers, json=data, timeout=30)
        if response.status_code == 200:
            result = response.json()
            print("[成功] API 回應:")
            print(result['choices'][0]['message']['content'])
        else:
            print(f"[錯誤] HTTP {response.status_code}")
            print(f"回應內容: {response.text[:500]}")
    except requests.exceptions.Timeout:
        print("[錯誤] 請求超時")
    except requests.exceptions.ConnectionError:
        print("[錯誤] 無法連接到伺服器")
    except Exception as e:
        print(f"[錯誤] {str(e)}")
 if __name__ == "__main__":
    test_api()
--- a/test_with_timeout.py
+++ b/test_with_timeout.py
@@ -1,111 +0,0 @@
 import requests
 import json
 from datetime import datetime
 # API 配置
 API_KEY = "paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo="
 BASE_URL = "https://llama.theaken.com/v1"
 def test_endpoints():
    """測試不同的 API 端點和模型"""
    print("="*60)
    print(f"Llama API 測試 - {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("="*60)
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }
    # 測試配置
    tests = [
        {
            "name": "GPT-OSS-120B",
            "model": "gpt-oss-120b",
            "prompt": "Say hello in one word"
        },
        {
            "name": "DeepSeek-R1-671B", 
            "model": "deepseek-r1-671b",
            "prompt": "Say hello in one word"
        },
        {
            "name": "Qwen3-Embedding-8B",
            "model": "qwen3-embedding-8b",
            "prompt": "Say hello in one word"
        }
    ]
    success_count = 0
    for test in tests:
        print(f"\n[測試 {test['name']}]")
        print("-"*40)
        data = {
            "model": test["model"],
            "messages": [
                {"role": "user", "content": test["prompt"]}
            ],
            "temperature": 0.5,
            "max_tokens": 20
        }
        try:
            # 使用較短的超時時間
            response = requests.post(
                f"{BASE_URL}/chat/completions",
                headers=headers,
                json=data,
                timeout=15
            )
            print(f"HTTP 狀態: {response.status_code}")
            if response.status_code == 200:
                result = response.json()
                if 'choices' in result:
                    content = result['choices'][0]['message']['content']
                    print(f"[SUCCESS] 成功回應: {content}")
                    success_count += 1
                else:
                    print("[ERROR] 回應格式錯誤")
            elif response.status_code == 502:
                print("[ERROR] 502 Bad Gateway - 伺服器無法回應")
            elif response.status_code == 401:
                print("[ERROR] 401 Unauthorized - API 金鑰可能錯誤")
            elif response.status_code == 404:
                print("[ERROR] 404 Not Found - 模型或端點不存在")
            else:
                print(f"[ERROR] 錯誤 {response.status_code}")
                if not response.text.startswith('<!DOCTYPE'):
                    print(f"詳情: {response.text[:200]}")
        except requests.exceptions.Timeout:
            print("[TIMEOUT] 請求超時 (15秒)")
        except requests.exceptions.ConnectionError as e:
            print(f"[CONNECTION ERROR] 無法連接到伺服器")
        except Exception as e:
            print(f"[UNKNOWN ERROR]: {str(e)[:100]}")
    # 總結
    print("\n" + "="*60)
    print(f"測試結果: {success_count}/{len(tests)} 成功")
    if success_count == 0:
        print("\n診斷資訊:")
        print("• 網路連接: 正常 (可 ping 通)")
        print("• API 端點: https://llama.theaken.com/v1")
        print("• 錯誤類型: 502 Bad Gateway")
        print("• 可能原因: 後端 API 服務暫時離線")
        print("\n建議行動:")
        print("1. 稍後再試 (建議 10-30 分鐘後)")
        print("2. 聯繫 API 管理員確認服務狀態")
        print("3. 檢查是否有服務維護公告")
    else:
        print(f"\n[OK] API 服務正常運作中!")
        print(f"[OK] 可使用的模型數: {success_count}")
 if __name__ == "__main__":
    test_endpoints()
--- a/使用說明.txt
+++ b/使用說明.txt
@@ -1,33 +0,0 @@
 ===========================================
 Llama 模型對話測試程式 - 使用說明
 ===========================================
 安裝步驟:
 ---------
 1. 確保已安裝 Python 3.7 或更高版本
 2. 安裝依賴套件:
   pip install -r requirements.txt
 執行程式:
 ---------
 python llama_test.py
 功能說明:
 ---------
 1. 程式啟動後會自動測試 API 連接
 2. 選擇要使用的模型 (1-3)
 3. 開始與 AI 進行對話
 4. 輸入 'exit' 或 'quit' 結束對話
 可用模型:
 ---------
 1. gpt-oss-120b (預設)
 2. deepseek-r1-671b
 3. qwen3-embedding-8b
 注意事項:
 ---------
 - 確保網路連接正常
 - API 金鑰已內建於程式中
 - 如遇到錯誤，請檢查網路連接或聯繫管理員
--- a/操作指南.md
+++ b/操作指南.md
@@ -62,19 +62,19 @@ print(response.choices[0].message.content)
 ## 三、使用現成程式
 ### 程式清單
-1. **llama_full_api.py** - 完整對話程式（支援內外網）
+1. **llama_chat.py** - 主要對話程式（智慧連接）
-2. **llama_chat.py** - 內網專用對話程式
+2. **llama_full_api.py** - 完整對話程式（多端點支援）
-3. **local_api_test.py** - 端點測試工具
+3. **quick_test.py** - 快速測試腳本
-4. **quick_test.py** - 快速測試腳本
+4. **test_all_models.py** - 模型測試工具
 ### 執行對話程式
 ```bash
 # 執行主程式（智慧對話）
 python llama_chat.py
 # 執行完整版（自動測試所有端點）
 python llama_full_api.py
 # 執行內網版
 python llama_chat.py
 # 快速測試
 python quick_test.py
 ```
@@ -95,15 +95,15 @@ python quick_test.py
 ## 五、常見問題處理
 ### 問題 1：502 Bad Gateway
-**原因**：外網 API 伺服器離線  
+**原因**：API 伺服器暫時離線  
-**解決**：使用內網端點
+**解決**：稍後再試或使用備用端點
 ### 問題 2：Connection Error
-**原因**：不在內網環境或 IP 錯誤  
+**原因**：網路連接問題  
 **解決**：
-1. 確認在同一網路環境
+1. 確認網路連接正常
-2. 檢查防火牆設定
+2. 檢查防火牆或代理設定
-3. ping 192.168.0.6 測試連通性
+3. 確認可以訪問 https://llama.theaken.com
 ### 問題 3：編碼錯誤
 **原因**：Windows 終端編碼問題  
@@ -122,14 +122,16 @@ python quick_test.py
 ## 七、測試結果摘要
-### 成功測試
+### 測試狀態
-✅ 內網端點 1-3 全部正常運作  
+📡 外網端點連接測試中  
 ✅ 支援 OpenAI SDK 標準格式  
-✅ 可正常進行對話  
+✅ 自動端點切換機制  
-### 待確認
+### 支援功能
- 外網端點需等待伺服器恢復
+- 多端點自動切換
- DeepSeek 和 Qwen 模型需進一步測試
+- 智慧超時控制
 - 完整錯誤處理
 - DeepSeek 和 Qwen 模型支援
 ## 八、技術細節
--- a/連線參數.txt
+++ b/連線參數.txt
@@ -1,14 +0,0 @@
 可以連接 llama 的模型，ai進行對話
 他的連線資料如下:
 外網連線：
 https://llama.theaken.com/v1https://llama.theaken.com/v1/gpt-oss-120b/
 https://llama.theaken.com/v1https://llama.theaken.com/v1/deepseek-r1-671b/
 https://llama.theaken.com/v1https://llama.theaken.com/v1/gpt-oss-120b/
 外網模型路徑：
  1. /gpt-oss-120b/
  2. /deepseek-r1-671b/
  3. /qwen3-embedding-8b/
 金鑰：paVrIT+XU1NhwCAOb0X4aYi75QKogK5YNMGvQF1dCyo=