The Complete python-docx Image Extraction Guide: An Expedition from rId to Binary Data

Deep Dive into `extract_image_bytes`: How python-docx Handles Images

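When automating Word documents, one request comes up again and again:
"I want the original file of this image inside the document."
However, python-docx's high-level API (for example, the InlineShape returned by run.add_picture) offers no direct method such as .save().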

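This tutorial digs into the low-level structure of a Word document (Relationships and Parts)
and implements a function that pulls an image out of any .docx file losslessly, with no re-encoding.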

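We will go through the following steps:

Build a test scenario: use Python to generate a Word file containing an image in D:\Temp.
Explain the mechanism: how an rId (relationship ID) relates to a Part.
Implement the core function: write extract_image_bytes.
Verify the result: turn the extracted bytes back into an image to prove the extraction worked.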

# 1. Environment setup and imports
# docx handles the Word file, PIL generates and verifies the image,
# and io provides in-memory binary streams
from docx import Document
from docx.shared import Inches
from PIL import Image, ImageDraw
import io
import os

# Path of the test file
temp_dir = r"D:\Temp"
docx_path = os.path.join(temp_dir, "demo_extraction.docx")

# Make sure the directory exists
os.makedirs(temp_dir, exist_ok=True)
print(f"Working directory ready: {temp_dir}")

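2. Create the test DOCX file

So that everyone can follow along, we first generate a Word file that contains an image.
The image is a PNG with a red background and the text "SECRET IMAGE DATA" written on it.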

# 2. Generate the test docx file

# A. Draw an image with PIL
img = Image.new('RGB', (300, 100), color=(255, 100, 100)) # red background
d = ImageDraw.Draw(img)
d.text((10, 40), "SECRET IMAGE DATA", fill=(255, 255, 255))

# B. Save it to memory (BytesIO)
img_byte_arr = io.BytesIO()
img.save(img_byte_arr, format='PNG')
img_byte_arr.seek(0) # rewind to the start so python-docx can read it

# C. Write the Word file
doc = Document()
doc.add_paragraph("This is a confidential document with an image hidden below:")
run = doc.add_paragraph().add_run()
run.add_picture(img_byte_arr, width=Inches(3.0)) # insert the image
doc.add_paragraph("The image has been inserted.")

doc.save(docx_path)
print(f"Test document generated: {docx_path}")


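3. The core secret: the `extract_image_bytes` function

This is the heart of the tutorial. Inside the .docx package,
the image is not embedded next to the text in the document XML.
It is stored in a separate folder (word/media/)
and linked through an rId (Relationship ID).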

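The flow is:

Tag (<w:drawing> and <a:blip>):
- <w:drawing> is the outer container; it marks a "drawing object" (which may hold a picture, a chart, or a text box).
- The rId itself sits on the inner <a:blip r:embed="rId7"> tag (blip is short for "binary large image or picture").
- The code has to locate the drawing first, then dig down to the blip to learn which image is referenced.
Rels (.rels): the DocumentPart (main document) owns a lookup table (Relationships) that says which package part rId7 points to.
Part (ImagePart): the part found that way is an object holding binary data.
Blob (.blob): this property is the image's actual raw data. (A more concise version of the function appears later.)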
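
To make the rId → Rels → Part → Blob chain concrete without any library magic, here is a minimal standard-library sketch. It builds a toy zip package in memory (not a valid Word file, only the three members the lookup needs) and follows an rId by hand. The helper name image_bytes_from_package and the toy payload are illustrative assumptions; python-docx performs the equivalent lookup for you.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the .rels files of an OPC package
PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
IMAGE_RELTYPE = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"


def image_bytes_from_package(docx_bytes, rid):
    """Follow rId -> relationship target -> raw bytes using only the zip layer."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        rels_root = ET.fromstring(zf.read("word/_rels/document.xml.rels"))
        for rel in rels_root.findall(f"{{{PKG_REL_NS}}}Relationship"):
            if rel.get("Id") != rid or rel.get("TargetMode") == "External":
                continue
            # Relative targets are resolved against the folder of document.xml.
            member = posixpath.normpath(posixpath.join("word", rel.get("Target")))
            return zf.read(member.lstrip("/"))
    return None


# Toy package: a document, its relationship table, and one media member.
fake_png = b"\x89PNG\r\n\x1a\n" + b"toy payload"
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/document.xml", "<doc/>")
    zf.writestr(
        "word/_rels/document.xml.rels",
        f'<Relationships xmlns="{PKG_REL_NS}">'
        f'<Relationship Id="rId7" Type="{IMAGE_RELTYPE}" Target="media/image1.png"/>'
        "</Relationships>",
    )
    zf.writestr("word/media/image1.png", fake_png)

print(image_bytes_from_package(buf.getvalue(), "rId7") == fake_png)  # True
print(image_bytes_from_package(buf.getvalue(), "rId99"))             # None
```

The relationship table is just an XML file inside the zip, and the "part" is just another zip member; the blob is that member's bytes.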

from typing import Optional
from docx.document import Document as DocxDocument

def extract_image_bytes(doc: DocxDocument, rid: str) -> Optional[bytes]:
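    """
    Use an rId to look up the image part in the document part's
    relationships, then return its content.

    Args:
    - doc: a DocxDocument object
    - rid: relationship ID (e.g. 'rId4')

    Returns:
    - bytes: the raw binary data of the image, or None when the rId is
      unknown or does not point to an image
    """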
    
    # 1. Get the relationship and the part
    # doc.part.rels is an instance of `docx.opc.rel.Relationships`.
    # It is not a plain dict but a dict-like object (it subclasses dict),
    # so `in` tests for a key and `[]` fetches a value.
    # Printed, it looks like this:
    #   {'rId3': <docx.opc.rel._Relationship at 0x1ec41ac0250>,
    #    'rId4': <docx.opc.rel._Relationship at 0x1ec41ac2950>,
    #    'rId5': <docx.opc.rel._Relationship at 0x1ec41ac00d0>,
    #    'rId6': <docx.opc.rel._Relationship at 0x1ec41ac0150>,
    #    'rId7': <docx.opc.rel._Relationship at 0x1ec41ac01d0>,
    #    'rId8': <docx.opc.rel._Relationship at 0x1ec41ac0390>,
    #    'rId1': <docx.opc.rel._Relationship at 0x1ec41ac0490>,
    #    'rId2': <docx.opc.rel._Relationship at 0x1ec41ac0290>,
    #    'rId9': <docx.opc.rel._Relationship at 0x1ec41ac0690>}
    if rid not in doc.part.rels:
        return None

    rel = doc.part.rels[rid]
    # rel is a docx.opc.rel._Relationship.
    # vars(rel) or rel.__dict__ shows:
    #   {'_rId': 'rId9',
    #    '_reltype': 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/image',
    #    '_target': <docx.parts.image.ImagePart at 0x1ec41ab5810>,
    #    '_baseURI': '/word',
    #    '_is_external': False}
    
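    # Follow the relationship to its target part.
    # Here target_part is usually a docx.parts.image.ImagePart instance.
    # External relationships (hyperlinks, linked images) have no part inside
    # the package, and target_part raises ValueError for them, so skip those.
    if rel.is_external:
        return None
    target_part = rel.target_part
    # Why rel.target_part rather than rel._target?
    # - It hands back an object: .target_part is a public property that
    #   returns the target Part of an internal relationship.
    # - It avoids internals: _target is an implementation detail that may be
    #   renamed or change behavior, while .target_part is the public
    #   interface and the safest, most stable choice.
    # - dir(rel) lists target_part among the attributes.
    #
    # vars(target_part) or target_part.__dict__ shows:
    #   {'_partname': '/word/media/image1.png',
    #    '_content_type': 'image/png',
    #    '_blob': b'\x89PNG\r\n\x1a\n\x00\x00...',
    #    '_package': None,
    #    '_image': None,
    #    '_rels': {},
    #    'rels': {}}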
    
    # Print what we found (for teaching purposes)
    print(f"[Debug] rId={rid}")
    print(f"       -> PartName: {target_part.partname}")
    print(f"       -> ContentType: {target_part.content_type}")

    # [Important] Safety check: make sure this really is an image.
    # A relationship can also point to footnotes or the style sheet;
    # those parts hold XML, not image data.
    if "image" not in target_part.content_type:  # e.g. 'image/png'
        # type(target_part) would be docx.parts.image.ImagePart for an image
        print("       -> [Warning] Not an image, skipping extraction.")
        return None
    
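    # 2. Fetch the data
    # The .blob (Binary Large Object) property returns the binary data.
    #
    # Q: Why getattr(target_part, 'blob', None) instead of target_part.blob?
    # A: It is a defensive idiom.
    #    In theory an ImagePart always has .blob, but if part of the document
    #    is damaged or python-docx versions differ, a direct .blob access
    #    would raise AttributeError and crash the program when the attribute
    #    is missing. getattr() takes a default (None), so a missing attribute
    #    yields None, which the caller can handle gracefully.
    return getattr(target_part, 'blob', None)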

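📚 What is EAFP?

EAFP is a core design philosophy of the Python community. It stands for "Easier to Ask for Forgiveness than Permission".

The Chinese idiom that captures it best is:

先斬後奏 ("act first, report afterwards")
(Rather than asking for permission up front, go ahead and apologize if it goes wrong.)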

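The two styles compared:

LBYL (Look Before You Leap)
- The style commonly seen in languages such as C and Java.
- Trait: run a series of if checks before performing the action.
- Code:
  if rid in doc.part.rels:                 # 1. first confirm the key exists
      rel = doc.part.rels[rid]
      if hasattr(rel, 'target_part'):      # 2. then confirm the attribute exists
          ...                              # only now dare to proceed
EAFP ("act first, report afterwards")
- The Python style.
- Trait: assume the operation usually succeeds and just run it; if it fails, recover with try...except.
- Code:
  try:
      return doc.part.rels[rid].target_part.blob       # just grab it
  except (KeyError, AttributeError, ValueError):       # handle failure afterwards
      return None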

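Why does Python favor EAFP?
Entering a try block costs very little in Python when no exception is raised, and the code is usually cleaner: the reader sees the main logic directly instead of a wall of defensive if statements.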

⚡ Going further: an even more Pythonic EAFP version?

A bare try-except is EAFP in its purest form: assume success, run the code, and clean up in the except clause only if something actually fails. The resulting code is very compact.

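If we condense the original function of a dozen or so lines, it really can be written like this:

def extract_image_bytes_concise(doc, rid):
    try:
        # One chained expression: look up the table -> find the part -> take the content
        return doc.part.rels[rid].target_part.blob
    except Exception:
        return None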

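This code works. In practice, though, we keep the more verbose version above because of a hidden trap:

⚠️ Trap: images are not the only parts with .blob!
In the docx structure, the Header, the Footer, and even Styles are Part objects too, and each of them has a .blob property (whose content is XML text).

If we skip the content_type check and the rId happens to point to a header or the style sheet, this concise function will happily return a pile of XML source (as bytes), and the file you later save as .jpg will be corrupt and will not open.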

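Conclusion:

Debugging / exploration: grabbing the value quickly with try-except is convenient.
Production features / automation: keep the content_type check so that what you get really is an image.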
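
As a second line of defense against the trap above, the extracted bytes themselves can be inspected before saving. The sketch below is an assumption-level helper (the name sniff_image_extension and the short signature table are ours, not part of python-docx); it recognizes only a few common raster formats by their leading magic bytes and returns None for anything else, such as XML.

```python
from typing import Optional

# Leading "magic bytes" of a few common raster formats (deliberately not exhaustive).
_SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
    (b"BM", ".bmp"),
)


def sniff_image_extension(data: Optional[bytes]) -> Optional[str]:
    """Return a file extension if the bytes start with a known image signature."""
    if not data:
        return None
    for magic, ext in _SIGNATURES:
        if data.startswith(magic):
            return ext
    return None


print(sniff_image_extension(b"\x89PNG\r\n\x1a\n\x00\x00"))  # .png
print(sniff_image_extension(b"<?xml version='1.0'?>"))      # None
```

A caller could use the returned extension to name the output file and refuse to write anything when the result is None.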

from typing import Optional
from docx.document import Document as DocxDocument

def extract_image_bytes(doc: DocxDocument, rid: str) -> Optional[bytes]:
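    """
    Concise version: try to fetch the image bytes for an rId.
    Written in EAFP (Easier to Ask for Forgiveness than Permission) style.
    """
    try:
        # 1. Go straight to the target part
        part = doc.part.rels[rid].target_part
        # e.g. docx.parts.image.ImagePart

        # 2. To avoid mistaking XML for an image, check the class name.
        # (Checking the bytes header outside this function would be far more
        # work than checking here.) This way there is no need to remember
        # whether content_type is 'image/png' or 'image/jpeg'; confirming
        # that the part is an "ImagePart" is enough.
        # Alternative: if "image" not in part.content_type:  # 'image/png'
        if "ImagePart" not in type(part).__name__:  # 'ImagePart'
            # str(type(part)) -> "<class 'docx.parts.image.ImagePart'>"
            return None

        # 3. Return the data
        return part.blob

    except (KeyError, AttributeError, ValueError):
        # KeyError: the rId does not exist.
        # ValueError: the relationship is external, so it has no target part.
        # AttributeError: the target lacks the expected attribute.
        return None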

[Ultimate concise version]

from typing import Optional
from docx.document import Document as DocxDocument

def extract_image_bytes(doc: DocxDocument, rid: str) -> Optional[bytes]:
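    """
    Concise version: try to fetch the image bytes for an rId.
    Written in EAFP (Easier to Ask for Forgiveness than Permission) style.
    """
    try:
        # P.S. A reader asked: can this be written as
        #      part = ... if "ImagePart" in type(part) ... ?
        # Answer: no. Python evaluates the right-hand side of the assignment
        # first, and at that moment `part` has not been assigned yet (it does
        # not exist), so the statement raises an error.

        # 1. Fetch the object first
        part = doc.part.rels[rid].target_part

        # 2. [Ultimate concise version] Check and return in one line.
        # Logic: return .blob for an image part (ImagePart), otherwise None.
        return part.blob if "ImagePart" in type(part).__name__ else None
        # BLOB (Binary Large Object): literally a "blob", a shapeless lump
        # like a slime. A fitting name for raw bytes no human can read.

    except (KeyError, AttributeError, ValueError):
        # KeyError: the rId does not exist.
        # ValueError: the relationship is external, so it has no target part.
        # AttributeError: the target lacks the expected attribute.
        return None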
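
On the reader's question in the function above: a plain assignment cannot refer to the name it is still creating, but on Python 3.8+ an assignment expression (:=) can bind the name inside the condition, which a conditional expression evaluates before the value being returned. The sketch below uses hypothetical stand-in classes and a plain dict instead of python-docx objects, purely to show the evaluation order; the two-line form above is arguably easier to read.

```python
from typing import Optional


class ImagePart:      # hypothetical stand-in, NOT the python-docx class
    blob = b"\x89PNG..."


class StylesPart:     # hypothetical stand-in for a non-image part
    blob = b"<w:styles/>"


def blob_if_image(parts: dict, key: str) -> Optional[bytes]:
    try:
        # In `X if C else Y`, C runs first, so := binds `part` before
        # `part.blob` is evaluated.
        return part.blob if "ImagePart" in type(part := parts[key]).__name__ else None
    except KeyError:
        return None


parts = {"rId1": ImagePart(), "rId2": StylesPart()}
print(blob_if_image(parts, "rId1"))  # b'\x89PNG...'
print(blob_if_image(parts, "rId2"))  # None
print(blob_if_image(parts, "rId3"))  # None
```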

🕵️‍♂️ Addendum: which Part do you get for a Header or Footer?

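As mentioned, an rId sometimes points to an object that is not an image. Besides ImagePart, here are the most common "impostors":

Image: docx.parts.image.ImagePart
- Content-Type: image/png, image/jpeg ...
- Content: binary image data (.blob)
Header: docx.parts.hdrftr.HeaderPart
- Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.header+xml
- Content: an XML document (describing the header's text and layout)
Footer: docx.parts.hdrftr.FooterPart
- Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.footer+xml
- Content: an XML document

That is why, even though grabbing .blob EAFP-style looks slick, a check on the part class or content_type must always accompany it. Without the check, you may fetch the XML of a HeaderPart as if it were an image and save a broken .png.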

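🔍 In depth: `Relationships` vs `_Relationship`

The two class names differ only by an underscore and singular versus plural, but their architectural roles are entirely different:

docx.opc.rel.Relationships (plural, no underscore)
- Role: manager / container.
- Description: this is doc.part.rels itself. It is a public class that manages the whole group of relationships like a dictionary.
docx.opc.rel._Relationship (singular, leading underscore _)
- Role: the managed item.
- Description: this is the object we get when we write rel = doc.part.rels['rId1'].
- Why the underscore? By Python convention, a class whose name starts with _ usually signals an internal API.
  - It means the package author does not recommend that we construct a _Relationship() ourselves.
  - We may use an instance we are handed, but creating and destroying instances should be left entirely to the Relationships container.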

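4. Finding the rId of an image

Having an extraction function is not enough; we first need to know where the images are.
That means searching the XML nodes for <a:blip> tags.

The following code shows how to walk the body paragraphs of the document and collect the rId of every image found.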

from docx.oxml.ns import qn

# Load the document we just created
doc = Document(docx_path)

found_rids = []

# Walk every paragraph and run
for p in doc.paragraphs:
    for run in p.runs:
        # Check whether this run's XML contains <w:drawing> (images are usually wrapped in it)
        if 'w:drawing' in run._element.xml:
            # Use XPath to find the <a:blip> tags underneath
            # Namespace note: the 'a' prefix maps to the DrawingML main namespace
            blips = run._element.xpath(".//a:blip")
            for blip in blips:
                # The r:embed attribute holds the rId
                rid = blip.get(qn("r:embed"))
                if rid:
                    print(f"Found an image reference! rId: {rid}")
                    found_rids.append(rid)

print(f"Found {len(found_rids)} image reference(s) in total.")

5. Extraction and verification

As the last step, we pass the rId we found to extract_image_bytes and check whether the returned bytes restore the original "SECRET IMAGE DATA" picture.

# 5. Verify the result
if found_rids:
    target_rid = found_rids[0] # take the first one found

    # === Use our core function ===
    image_data = extract_image_bytes(doc, target_rid)
    # =============================

    if image_data:
        print(f"\nExtracted {len(image_data)} bytes of data!")

        # Load the bytes with PIL and check that it is the image we drew
        extracted_img = Image.open(io.BytesIO(image_data))

        print("Preview of the extracted image:")
        display(extracted_img) # shows the image in Jupyter (display is provided by IPython)
    else:
        print("Extraction failed: the function returned None")
else:
    print("No image rId found, nothing to test.")

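💡 Why does `xpath(".//a:blip")` work directly, with no dictionary passed in?

This is a very good observation.

With the standard lxml library, an XML document that uses namespaces normally forces us to write the following, which is tedious: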

# Standard lxml approach (you define the dictionary yourself)
namespaces = {
    'a': 'http://schemas.openxmlformats.org/drawingml/2006/main',
    'r': 'http://schemas.openxmlformats.org/officeDocument/2006/relationships'
}
# namespaces must be passed on every call
blips = element.xpath(".//a:blip", namespaces=namespaces)

Or we are forced to use local-name() to sidestep the namespace check:

# Namespace-agnostic approach (general, but longer to write)
blips = element.xpath(".//*[local-name()='blip']")

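In python-docx, however, every XML element we work with (such as run._element) is an instance of BaseOxmlElement.
That class overrides the .xpath() method and supplies a built-in, module-level namespace dictionary (covering common prefixes such as w, a, r, and wp).

So when you write a:blip, python-docx has already translated 'a' into 'http://schemas.openxmlformats.org/drawingml/2006/main' behind the scenes. That is why the syntax stays so compact even though these definitions may not show up in an element's nsmap attribute: they are registered in the library's Python code as a global setting, not declared on any single XML node.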
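
For comparison, here is what the same lookup looks like outside python-docx, where nothing supplies the namespace dictionary for you. This standard-library sketch runs on a hand-written XML snippet (SAMPLE is made up for illustration, far simpler than real Word output) and also spells out the "{namespace}name" form of the attribute key, which is the form the code above obtains from qn("r:embed").

```python
import xml.etree.ElementTree as ET

# Prefix -> namespace URI map that we must maintain ourselves here.
NS = {
    "w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "r": "http://schemas.openxmlformats.org/officeDocument/2006/relationships",
}

# Hand-written sample run containing one picture reference.
SAMPLE = f"""
<w:r xmlns:w="{NS['w']}" xmlns:a="{NS['a']}" xmlns:r="{NS['r']}">
  <w:drawing>
    <a:graphic><a:graphicData><a:blip r:embed="rId7"/></a:graphicData></a:graphic>
  </w:drawing>
</w:r>
""".strip()


def find_embed_rids(xml_text):
    """Return the r:embed value of every <a:blip> element in the XML text."""
    root = ET.fromstring(xml_text)
    embed_attr = f"{{{NS['r']}}}embed"  # Clark notation: {namespace-uri}localname
    return [blip.get(embed_attr) for blip in root.findall(".//a:blip", NS)]


print(find_embed_rids(SAMPLE))  # ['rId7']
```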
