Setting and Reading Environment Variables in Python, with GEMINI_API_KEY as the Example; Calling the Gemini API; Passing Images to the Gemini / OpenAI APIs; MIME (Multipurpose Internet Mail Extensions)

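Python projects often need to store sensitive information such as API keys and database passwords.
Hard-coding a key in source code is insecure, and the key can easily be published to GitHub by accident.
A better approach is to use environment variables.

This article uses the Google Gemini API key (GEMINI_API_KEY) as an example to show how to set an environment variable in Windows 11 and read it from Python.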

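I. What is an environment variable?

An environment variable works like a "global setting" provided by the operating system.
An application can read these values when it starts, so sensitive information does not have to be hard-coded in its source.

Benefits:

Security: the API key is not hard-coded in the program.
Convenience: different projects or environments (development / testing / production) can each set their own values.
Cross-platform: Linux, macOS, and Windows all provide environment variables.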

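II. Setting an environment variable in Windows 11

1. Open the environment variable settings

Press the Start key and search for: Edit the system environment variables
Select it, and the "System Properties" window appears
Switch to the Advanced tab
Click Environment Variables… at the bottom

2. Add a new environment variable

In the User variables section (which affects only your own account):

Click New
Enter:
- Variable name: GEMINI_API_KEY
- Variable value: your Gemini API key (for example, AIzaSy...)

Click OK to confirm.

3. Apply and restart

Click OK and close all the settings windows
⚠️ Restart VS Code / Jupyter Notebook / cmd / PowerShell so that they pick up the new environment variable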
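
The restart is needed because a process receives a copy of the environment at the moment it starts. The minimal sketch below shows that inheritance with a child process. DEMO_SETTING is a made-up variable name used only for this demonstration.

```python
import os
import subprocess
import sys

# DEMO_SETTING is a hypothetical variable name used only for this demo.
os.environ["DEMO_SETTING"] = "set-by-parent"

# A child process started now inherits a copy of the parent's current environment.
child_code = "import os; print(os.environ.get('DEMO_SETTING'))"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=True,
)
child_value = result.stdout.strip()
print("child sees:", child_value)

# A program that was already running before the variable changed keeps its old
# copy, which is why an editor or terminal opened earlier must be restarted.
```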

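III. Reading the environment variable in Python

Once the variable is set, you can read it in a Python program:

import os

api_key = os.getenv("GEMINI_API_KEY")

if api_key:
    # Print only a short prefix so the full key never appears in the output
    print("Loaded GEMINI_API_KEY:", api_key[:10] + "...")
else:
    print("GEMINI_API_KEY not found; check that it has been set")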
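
os.getenv returns None when a variable is missing, while os.environ["NAME"] raises KeyError. One possible way to fail early with a clear message is a small helper like the sketch below. require_env, mask_secret, and the DEMO_ variable names are hypothetical names introduced here for illustration.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)  # None when the variable is missing
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def mask_secret(secret: str, visible: int = 4) -> str:
    """Keep only the first few characters of a secret for display."""
    return secret[:visible] + "..."


# Demo with made-up variable names (not real keys).
os.environ["DEMO_REQUIRED_KEY"] = "abcdefgh"
demo_value = require_env("DEMO_REQUIRED_KEY")
masked = mask_secret(demo_value)
print(masked)

try:
    require_env("DEMO_KEY_THAT_IS_NOT_SET")
    missing_raised = False
except RuntimeError:
    missing_raised = True
```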

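IV. Conclusion

Environment variables are a recommended place to keep sensitive information
Windows 11 can set them permanently through "Edit the system environment variables"
Python reads them with os.getenv("VARIABLE_NAME")

This way, your Gemini API key is stored in the system settings instead of being hard-coded in your source.

👉 Tip: to manage several API keys or projects, you can also use a .env file with the python-dotenv package, which loads the variables automatically.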
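
With python-dotenv the usual entry point is load_dotenv() from the dotenv module. To show roughly what such a loader does without requiring the package, here is a simplified standard-library sketch. load_env_file and DEMO_SERVICE_KEY are hypothetical names, and the parsing is far less complete than the real package.

```python
import os
import tempfile


def load_env_file(path: str, override: bool = False) -> dict:
    """Parse KEY=VALUE lines from a file and copy them into os.environ."""
    loaded = {}
    with open(path, "r", encoding="utf-8") as f:
        for raw_line in f:
            line = raw_line.strip()
            # Skip blank lines, comments, and lines without '='.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            key = key.strip()
            value = value.strip().strip("'\"")  # drop surrounding quotes
            loaded[key] = value
            # By default, variables that already exist are left untouched.
            if override or key not in os.environ:
                os.environ[key] = value
    return loaded


# Demo: write a temporary .env-style file and load it.
with tempfile.TemporaryDirectory() as tmp_dir:
    env_path = os.path.join(tmp_dir, ".env")
    with open(env_path, "w", encoding="utf-8") as f:
        f.write('# local secrets, never commit this file\nDEMO_SERVICE_KEY="abc123"\n')
    loaded = load_env_file(env_path)

print(loaded)
```

Whichever loader you use, keep the .env file out of version control (for example through .gitignore) so the keys are not published with the code.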

How do you call the Gemini API?

# pip install google-genai
import os

from google import genai

# Raises KeyError if GEMINI_API_KEY is not set
api_key = os.environ["GEMINI_API_KEY"]
client = genai.Client(api_key=api_key)
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Write a story about a magic backpack."
)
print(response.text)

Assigning roles:

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=[
        {"role": "user", "parts": [{"text": "Hello, who are you?"}]},
        {"role": "model", "parts": [{"text": "I'm Gemini, your AI assistant."}]},
        {"role": "user", "parts": [{"text": "Explain quantum computing in simple terms."}]}
    ]
)
print(response.text)
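
Writing the role/parts dictionaries by hand gets repetitive as a conversation grows. The sketch below builds the same contents list from simple (role, text) pairs. build_contents is a hypothetical helper, and it only covers the "user" and "model" roles used in the example above.

```python
def build_contents(history, new_user_text):
    """Turn (role, text) pairs plus a new user message into a contents list."""
    allowed_roles = {"user", "model"}
    contents = []
    for role, text in history:
        if role not in allowed_roles:
            raise ValueError(f"Unsupported role: {role!r}")
        contents.append({"role": role, "parts": [{"text": text}]})
    # The newest user message always goes last.
    contents.append({"role": "user", "parts": [{"text": new_user_text}]})
    return contents


history = [
    ("user", "Hello, who are you?"),
    ("model", "I'm Gemini, your AI assistant."),
]
contents = build_contents(history, "Explain quantum computing in simple terms.")
print(contents)
```

The resulting list has the same shape as the contents argument in the call above, so it could be passed to client.models.generate_content in its place.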

Passing an image

import json

from google import genai

# Load the API key from a local JSON file
with open(r"D:\Python code\api_key\api_key.json", "r", encoding="utf-8") as f:
    api_key = json.load(f)["Gemini"]["api_key"]

# Initialize the client
client = genai.Client(api_key=api_key)

# Read the image in binary mode
img_path = r"D:\Python code\vision\resized_leone-africano-2.jpg"
with open(img_path, 'rb') as f:
    img_bytes = f.read()  # the raw bytes are what the request needs

prompt = "Identify this image and list tags for every object, concept, and attribute you see, one tag per line."
# Send the prompt and the image together in a single user turn
response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents=[
        {"role": "user",
         "parts": [
             {"text": prompt},
             {"inline_data": {"mime_type": "image/jpeg",
                              "data": img_bytes}}
         ]}
    ]
)
print(response.text)
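
Before sending an image it can help to check that the file really looks like an image and is not too large to inline. The sketch below wraps those checks around the inline_data part used above. build_image_part is a hypothetical helper, and the 20 MB cap is an assumption chosen for illustration, so check the current API limits before relying on it.

```python
import mimetypes
import os
import tempfile

# Assumed size cap for illustration only; check the current API limits.
MAX_INLINE_BYTES = 20 * 1024 * 1024


def build_image_part(image_path: str, max_bytes: int = MAX_INLINE_BYTES) -> dict:
    """Return an inline_data part for an image, or raise ValueError."""
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image type: {image_path}")
    size = os.path.getsize(image_path)
    if size > max_bytes:
        raise ValueError(f"Image is {size} bytes, over the {max_bytes}-byte cap")
    with open(image_path, "rb") as f:
        return {"inline_data": {"mime_type": mime_type, "data": f.read()}}


# Demo with a tiny stand-in file (JPEG header bytes only, not a real photo).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.jpg")
    with open(demo_path, "wb") as f:
        f.write(b"\xff\xd8\xff\xe0")
    part = build_image_part(demo_path)

print(part["inline_data"]["mime_type"], len(part["inline_data"]["data"]))
```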

Comparing OpenAI, AzureOpenAI, and genai
A complete tutorial on three multimodal AI clients

1. AzureOpenAI Client

from openai import AzureOpenAI
import base64
import mimetypes

# ========== Create the client ==========
client = AzureOpenAI(
    api_key="your-azure-key",
    azure_endpoint="https://your-resource.openai.azure.com/",
    api_version='2025-01-01-preview'
)

# ========== Prepare the image and MIME type ==========
image_path = "diagram.png"
mime_type, _ = mimetypes.guess_type(image_path)  # 'image/png'

with open(image_path, "rb") as f:
    image_base64: str = base64.b64encode(f.read()).decode("utf-8")

# ========== API call ==========
response = client.chat.completions.create(
    model="gpt-4-vision",  # Azure deployment name
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Analyze this image"},
            {"type": "image_url",
             "image_url": {"url": f"data:{mime_type};base64,{image_base64}"}}
        ]
    }]
)

# ========== Get the result ==========
result = response.choices[0].message.content
print(f"Azure result: {result}")

2. OpenAI Client

from openai import OpenAI
import base64
import mimetypes

# ========== Create the client ==========
client = OpenAI(
    api_key="your-openai-key"
)

# ========== Prepare the image and MIME type ==========
image_path = "diagram.jpg"
mime_type, _ = mimetypes.guess_type(image_path)  # 'image/jpeg'

with open(image_path, "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode("utf-8")

# ========== API call ==========
response = client.chat.completions.create(
    model="gpt-4o",  # official OpenAI model name
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Analyze this image"},
            {"type": "image_url",
             "image_url": {"url": f"data:{mime_type};base64,{image_base64}"}}
        ]
    }]
)

# ========== Get the result ==========
result = response.choices[0].message.content
print(f"OpenAI result: {result}")

3. Google Gemini Client

from google import genai
import mimetypes

# ========== Create the client ==========
client = genai.Client(
    api_key="your-gemini-key"
)

# ========== Prepare the image and MIME type ==========
image_path = "diagram.tiff"
mime_type, _ = mimetypes.guess_type(image_path)  # 'image/tiff'

with open(image_path, "rb") as f:
    img_bytes = f.read()  # note: Gemini takes the raw bytes directly

# ========== API call ==========
response = client.models.generate_content(
    model="gemini-2.5-flash",  # Gemini model name
    contents=[{
        "role": "user",
        "parts": [
            {"text": "Analyze this image"},
            {"inline_data": {"mime_type": mime_type,
                             "data": img_bytes}}
        ]
    }]
)

# ========== Get the result ==========
result = response.text
print(f"Gemini result: {result}")

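MIME (Multipurpose Internet Mail Extensions)
History and uses of MIME
Original purpose
Initial design: to send non-text content (such as images, audio, and video) by email
Problem solved: early email could carry only plain text; MIME lets a message include attachments and multimedia
Modern uses
Although the name contains "Mail", MIME is now widely used in:

Web browsers: telling the browser how to handle different file types
The HTTP protocol: the Content-Type header uses MIME types
API communication: as in the scripts above, telling the AI model the format of an image
File handling: some operating systems use MIME types to decide which program opens a file

Common MIME type examples: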
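
A quick way to see the MIME types for familiar file extensions is to ask Python's standard mimetypes module. This is a small sketch with made-up file names. Note that guess_type looks only at the file name, not at the file's contents.

```python
import mimetypes

# Hypothetical file names; only the extensions matter here.
sample_files = [
    "photo.jpg",
    "diagram.png",
    "scan.tiff",
    "anim.gif",
    "report.pdf",
    "notes.txt",
    "page.html",
]

guessed = {name: mimetypes.guess_type(name)[0] for name in sample_files}
for name, mime in guessed.items():
    print(f"{name:12} -> {mime}")

# An extension the module does not know yields None instead of a MIME type.
unknown_type, _ = mimetypes.guess_type("data.unknownext")
print("data.unknownext ->", unknown_type)
```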

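Usage in a Data URI

import base64
import mimetypes

mime_type, _ = mimetypes.guess_type("diagram.png")  # 'image/png'
with open("diagram.png", "rb") as f:
    base64_string = base64.b64encode(f.read()).decode("utf-8")
data_uri = f"data:{mime_type};base64,{base64_string}"  # OpenAI/Azure usage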
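
A Data URI is just a header (MIME type plus ";base64") followed by a comma and the encoded bytes, so it can be taken apart again. The sketch below builds one and parses it back. make_data_uri and parse_data_uri are hypothetical helpers that handle only the base64 form shown above.

```python
import base64


def make_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 Data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def parse_data_uri(uri: str):
    """Split a base64 Data URI back into (mime_type, raw bytes)."""
    header, encoded = uri.split(",", 1)
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("Not a base64 Data URI")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(encoded)


sample = b"\x89PNG\r\n\x1a\n"  # the 8-byte PNG signature, used as stand-in data
uri = make_data_uri(sample, "image/png")
mime, decoded = parse_data_uri(uri)
print(uri)
```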

A complete utility function

import base64
import mimetypes


def analyze_image_universal(client, model_name: str, image_path: str,
                            prompt: str, client_type: str) -> str:
    """Analyze an image with any of the three clients.

    client must match client_type: an AzureOpenAI or OpenAI client for
    "azure" / "openai", and a genai.Client for "gemini".
    """

    # Shared step: derive the MIME type from the file name
    mime_type, _ = mimetypes.guess_type(image_path)

    if client_type in ["azure", "openai"]:
        # OpenAI family: base64 + Data URI
        with open(image_path, "rb") as f:
            image_base64 = base64.b64encode(f.read()).decode("utf-8")

        response = client.chat.completions.create(
            model=model_name,
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{image_base64}"}}
                ]
            }]
        )
        return response.choices[0].message.content

    elif client_type == "gemini":
        # Gemini: raw bytes + inline_data
        with open(image_path, "rb") as f:
            img_bytes = f.read()

        response = client.models.generate_content(
            model=model_name,
            contents=[{
                "role": "user",
                "parts": [
                    {"text": prompt},
                    {"inline_data": {"mime_type": mime_type, "data": img_bytes}}
                ]
            }]
        )
        return response.text

    # Fail loudly instead of silently returning None
    raise ValueError(f"Unknown client_type: {client_type!r}")

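Key takeaways:

Creating the client: Azure additionally needs an endpoint and api_version; OpenAI needs only api_key; Gemini uses genai.Client
MIME handling: all three derive the type with mimetypes.guess_type(), but pass it in different formats
Image encoding: the OpenAI family uses base64; Gemini takes raw bytes directly
Getting the result: the OpenAI family uses .choices[0].message.content; Gemini uses .text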
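
The last point, how the reply text is read, can be isolated in one small function. In this sketch extract_text is a hypothetical helper, and the SimpleNamespace objects are stand-ins that mimic only the attributes named above, not real API responses.

```python
from types import SimpleNamespace


def extract_text(response, client_type: str) -> str:
    """Read the reply text using the access path that fits the client."""
    if client_type in ("azure", "openai"):
        return response.choices[0].message.content
    if client_type == "gemini":
        return response.text
    raise ValueError(f"Unknown client_type: {client_type!r}")


# Stand-in objects with just enough structure for the demo.
fake_openai_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="reply from openai"))]
)
fake_gemini_response = SimpleNamespace(text="reply from gemini")

openai_text = extract_text(fake_openai_response, "openai")
gemini_text = extract_text(fake_gemini_response, "gemini")
print(openai_text, "|", gemini_text)
```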

Detailed comparison of request parameters across the three clients

1. OpenAI/Azure – messages structure

# OpenAI and Azure share the same messages format
{
    "model": "gpt-4o",
    "messages": [                    # 🔑 key: messages
        {
            "role": "user",          # 🔑 key: role
            "content": [             # 🔑 key: content
                {
                    "type": "text",          # 🔑 key: type
                    "text": "Analyze this image"   # 🔑 key: text
                },
                {
                    "type": "image_url",     # 🔑 key: type, with value image_url
                    "image_url": {           # 🔑 key: image_url (nested)
                        "url": "data:image/png;base64,..."  # 🔑 key: url
                    }
                }
            ]
        }
    ]
}

2. Gemini – contents structure

# Gemini uses a different structure and different keys
{
    "model": "gemini-2.5-flash",
    "contents": [                    # 🔑 key: contents (not messages)
        {
            "role": "user",          # 🔑 key: role (same as OpenAI)
            "parts": [               # 🔑 key: parts (not content)
                {
                    "text": "Analyze this image"   # 🔑 text used directly (no type)
                },
                {
                    "inline_data": {         # 🔑 key: inline_data (not image_url)
                        "mime_type": "image/png",  # 🔑 key: mime_type
                        "data": img_bytes          # 🔑 key: data (not url)
                    }
                }
            ]
        }
    ]
}

Comparing actual code

The complete OpenAI/Azure structure:

openai_request_data = {
    "model": "gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What is this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQ..."
                    }
                }
            ]
        }
    ],
    "max_tokens": 100
}

The complete Gemini structure:

genai_request_data = {
    "model": "gemini-2.5-flash",
    "contents": [
        {
            "role": "user",
            "parts": [
                {
                    "text": "What is this image?"
                },
                {
                    "inline_data": {
                        "mime_type": "image/jpeg",
                        "data": b'\xff\xd8\xff\xe0\x00\x10JFIF...'  # bytes
                    }
                }
            ]
        }
    ],
    # With the google-genai SDK, generation settings are passed through config
    "config": {
        "max_output_tokens": 100
    }
}

Partial code for comparison (AzureOpenAI vs Gemini):

Where the image data and mime_type are placed:

#openai: 
openai_request_data["messages"][0]['content'][1]['image_url']

#genai: 
genai_request_data["contents"][0]['parts'][1]['inline_data']

OpenAI embeds the mime_type in the value of url, and the body of that value is a base64 string.
google genai splits them into two keys, mime_type and data,
and the value of data is raw bytes.
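
The two representations carry the same image data. As a small illustration, encoding the JPEG header bytes from the Gemini example produces the start of the base64 string seen in the OpenAI example, and decoding restores the original bytes. The 300-byte payload at the end is arbitrary filler that shows the size overhead of base64.

```python
import base64

# The first bytes of a JPEG file, as in the Gemini example above.
jpeg_header = b"\xff\xd8\xff\xe0\x00\x10JFIF"

as_base64 = base64.b64encode(jpeg_header).decode("ascii")
print(as_base64)

# Decoding gives back exactly the same raw bytes.
restored = base64.b64decode(as_base64)

# base64 turns every 3 bytes into 4 characters, so the text form is larger.
payload = bytes(300)  # arbitrary filler data
encoded_len = len(base64.b64encode(payload))
print(len(payload), "bytes ->", encoded_len, "base64 characters")
```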

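Summary of key differences

Top-level structure
- OpenAI/Azure: messages → modeled on conversational messages
- Gemini: contents → a broader notion of "content"
Content organization
- OpenAI/Azure: content → the content container for a single message
- Gemini: parts → one content entry can consist of several parts (mixed text and images)
Text part
- OpenAI/Azure: requires {"type": "text", "text": "..."}
- Gemini: just {"text": "..."}, a leaner structure
Image part
- OpenAI/Azure: uses image_url, which accepts an HTTP[S] URL or a Data URI (base64)
- Gemini: uses inline_data, which needs mime_type plus the image data itself (raw bytes in the Python SDK, base64 in the underlying JSON request); inline_data does not accept a URL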
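
These differences are mechanical enough that one format can be translated into the other. The sketch below converts an OpenAI-style messages list into a Gemini-style contents list. openai_messages_to_gemini_contents is a hypothetical helper that covers only the "user" and "assistant" roles and Data URI images. Mapping "assistant" to "model" follows the role names used in the examples earlier.

```python
import base64

ROLE_MAP = {"user": "user", "assistant": "model"}


def openai_messages_to_gemini_contents(messages):
    """Convert OpenAI-style messages into Gemini-style contents."""
    contents = []
    for message in messages:
        role = ROLE_MAP[message["role"]]  # KeyError for roles not covered here
        items = message["content"]
        if isinstance(items, str):  # plain-string content becomes one text item
            items = [{"type": "text", "text": items}]
        parts = []
        for item in items:
            if item["type"] == "text":
                parts.append({"text": item["text"]})
            elif item["type"] == "image_url":
                url = item["image_url"]["url"]
                if not url.startswith("data:"):
                    raise ValueError("Only Data URIs can become inline_data")
                header, encoded = url.split(",", 1)
                mime_type = header[len("data:"):].split(";")[0]
                parts.append({"inline_data": {
                    "mime_type": mime_type,
                    "data": base64.b64decode(encoded),  # back to raw bytes
                }})
            else:
                raise ValueError(f"Unsupported item type: {item['type']!r}")
        contents.append({"role": role, "parts": parts})
    return contents


messages = [
    {"role": "user", "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBORw0KGgo="}},
    ]},
    {"role": "assistant", "content": "It starts with a PNG signature."},
]
contents = openai_messages_to_gemini_contents(messages)
print(contents)
```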

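google-genai parameters
Parameters such as max_output_tokens and temperature
must first be placed in genai.types.GenerateContentConfig(),
which is then passed to the config parameter of client.models.generate_content().
They cannot be passed directly to client.models.generate_content():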

import json
import os

from google import genai
from google.genai import types

dir_api = r"D:\test_3"
basename_api = "api.json"
path_api = os.path.join(dir_api, basename_api)

with open(path_api, "r", encoding="utf-8") as f:
    dic_api = json.load(f)
api_key = dic_api['Gemini api key']
print("api key:\t", api_key[:10] + "...")

client = genai.Client(api_key=api_key)
user_query = "Introduce yourself in one sentence"
# Open question: does the response returned by client.models.generate_content()
# offer a way to get the input/output token counts?

# Generation settings go into GenerateContentConfig
config = types.GenerateContentConfig(max_output_tokens=3000)
# config = genai.types.GenerateContentConfig(max_output_tokens=300)

model = "gemini-2.5-flash"  # alternative: "gemma-3n-e2b-it"

res = client.models.generate_content(
    model=model,
    contents=[
        {"role": "user", "parts": [{"text": "Hello, who are you?"}]},
        {"role": "model", "parts": [{"text": "I'm Gemini, your AI assistant."}]},
        {"role": "user", "parts": [{"text": user_query}]}
    ],
    config=config
)
# Two ways to read the reply: the first part of the first candidate, or the .text shortcut
text = res.candidates[0].content.parts[0].text
text2 = res.text
print(text)
print(text2)
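
The script above leaves open whether the response can report input and output token counts. One likely place to look is the response's usage_metadata attribute. The sketch below assumes the field names prompt_token_count, candidates_token_count, and total_token_count, which should be confirmed against the SDK version in use. summarize_usage is a hypothetical helper, and the SimpleNamespace object is a stand-in with made-up numbers, not a real response.

```python
from types import SimpleNamespace


def summarize_usage(response) -> dict:
    """Collect token counts from a response, tolerating missing fields."""
    usage = getattr(response, "usage_metadata", None)
    return {
        # Field names are assumptions; confirm them for your SDK version.
        "input_tokens": getattr(usage, "prompt_token_count", None),
        "output_tokens": getattr(usage, "candidates_token_count", None),
        "total_tokens": getattr(usage, "total_token_count", None),
    }


# Stand-in response with made-up numbers for the demo.
fake_res = SimpleNamespace(
    usage_metadata=SimpleNamespace(
        prompt_token_count=12,
        candidates_token_count=30,
        total_token_count=42,
    )
)
usage = summarize_usage(fake_res)
print(usage)

# A response without usage information yields None values instead of an error.
empty_usage = summarize_usage(SimpleNamespace())
```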