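儲蓄保險王 - Page 6 of 164 - An inside look at the IRR of every insurer's savings policies, exposing shady insurance products and fake experts, and seeing both the beauty and the pitfalls of savings insurance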

Photography or 3C

Python DOCX Scalpel: Precise Slicing and Slimming (OOXML in Practice); from lxml import etree; doc_xml = zfin.read('word/document.xml'); doc_tree = etree.fromstring(doc_xml); used_rids = set(doc_tree.xpath("//@r:embed | //@r:link | //@r:id", namespaces=ns_map))  # collect the rIds that word/document.xml actually uses => read document.xml.rels to build the keep_files whitelist and the rels_to_remove blacklist of Relationship nodes to delete => remove the unused Relationship nodes from the XML tree => rewrite the zip (filter out orphan files, update document.xml.rels, copy everything else unchanged)

3 months ago

This tutorial takes you deep into Word (.doc...
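
The four steps in the title (collect used rIds, split relationships into keep and remove, prune the tree, rewrite the zip) can be sketched roughly as follows. This is only an illustration under stated assumptions: it swaps lxml for the standard library's xml.etree.ElementTree so it runs without extra installs, it treats only image-type relationships as removable, and it works on a toy archive built in memory rather than a complete .docx. The helper names slim_docx and build_toy_archive are made up for this sketch.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
DOC = "word/document.xml"
RELS = "word/_rels/document.xml.rels"


def slim_docx(src: bytes) -> bytes:
    """Return a copy of the archive without image parts document.xml never references."""
    ET.register_namespace("", PKG_NS)  # keep the .rels output free of ns0: prefixes
    with zipfile.ZipFile(io.BytesIO(src)) as zin:
        doc_tree = ET.fromstring(zin.read(DOC))
        # 1. Collect every rId used through r:embed, r:link, or r:id.
        used_rids = {
            el.get(f"{{{R_NS}}}{attr}")
            for el in doc_tree.iter()
            for attr in ("embed", "link", "id")
        } - {None}

        # 2. Split relationships into a whitelist of files and a removal list.
        rels_tree = ET.fromstring(zin.read(RELS))
        keep_files, rels_to_remove = set(), []
        for rel in rels_tree:
            target = posixpath.normpath(posixpath.join("word", rel.get("Target")))
            # Assumption: only image relationships are candidates for removal.
            if rel.get("Type", "").endswith("/image") and rel.get("Id") not in used_rids:
                rels_to_remove.append((rel, target))
            else:
                keep_files.add(target)

        # 3. Drop the unused Relationship nodes from the tree.
        for rel, _ in rels_to_remove:
            rels_tree.remove(rel)
        orphans = {target for _, target in rels_to_remove} - keep_files

        # 4. Rewrite the zip: skip orphans, replace the .rels part, copy the rest.
        out = io.BytesIO()
        with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zout:
            for name in zin.namelist():
                if name in orphans:
                    continue
                if name == RELS:
                    data = ET.tostring(rels_tree, encoding="utf-8", xml_declaration=True)
                else:
                    data = zin.read(name)
                zout.writestr(name, data)
    return out.getvalue()


def build_toy_archive() -> bytes:
    """Hypothetical test input: one used image, one orphan image, one styles part."""
    image = R_NS + "/image"
    rels = (
        f'<Relationships xmlns="{PKG_NS}">'
        f'<Relationship Id="rId1" Type="{image}" Target="media/image1.png"/>'
        f'<Relationship Id="rId2" Type="{image}" Target="media/image2.png"/>'
        f'<Relationship Id="rId3" Type="{R_NS}/styles" Target="styles.xml"/>'
        "</Relationships>"
    )
    doc = f'<doc xmlns:r="{R_NS}"><pic r:embed="rId1"/></doc>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr(DOC, doc)
        z.writestr(RELS, rels)
        z.writestr("word/styles.xml", "<styles/>")
        z.writestr("word/media/image1.png", b"used")
        z.writestr("word/media/image2.png", b"orphan")
    return buf.getvalue()


slimmed = slim_docx(build_toy_archive())
remaining = zipfile.ZipFile(io.BytesIO(slimmed)).namelist()
```

Restricting removal to image relationships is a deliberate safety choice in this sketch: parts such as styles.xml are listed in document.xml.rels but are never referenced by an rId inside document.xml, so a blanket "unused rId" rule would delete them.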

Photography or 3C

Python Regex in Practice: Precisely Extracting XML Attribute Values (findall vs finditer and the Subtleties of Groups)

3 months ago

When working with XML or HTML strings, I...
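
As a small illustration of the findall vs finditer contrast named in the title, here is a sketch using the standard re module. The XML fragment and the pattern are made up for the example and are not taken from the post.

```python
import re

# Hypothetical input: two image references as they might appear in OOXML.
xml = '<a:blip r:embed="rId4"/><a:blip r:embed="rId5"/>'
pattern = re.compile(r'r:embed="([^"]+)"')

# findall: with exactly one capturing group, it returns only that group's text.
ids = pattern.findall(xml)

# finditer: yields Match objects, so the whole match, each group,
# and the position in the string are all available.
matches = [(m.group(0), m.group(1), m.start()) for m in pattern.finditer(xml)]
```

findall is the shorter route when only the captured value matters; finditer is the better fit once positions or several groups per match are needed.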

Photography or 3C

Python-docx Advanced Operating Room: From Low-Level XML to the Mixed Hyperlink Technique; from docx.opc.constants import RELATIONSHIP_TYPE as RT; part = doc.part; r_id = part.relate_to(url, RT.HYPERLINK, is_external=True)

3 months ago

When using python-docx...

Photography or 3C

python-docx Advanced Operating Room: From the High-Level API to the Underlying XML (w:p, w:r, w:t), Fully Explained; from docx.oxml import OxmlElement; from docx.oxml.ns import qn

3 months ago

When using python-docx...
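
To make the w:p > w:r > w:t nesting concrete, here is a minimal sketch that builds one paragraph with the standard library's xml.etree.ElementTree instead of python-docx. The clark helper is a hypothetical stand-in for what docx.oxml.ns.qn does (turning a prefixed name such as 'w:p' into '{namespace}p'); it is not the library's implementation.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NSMAP = {"w": W_NS}


def clark(tag: str) -> str:
    """Hypothetical stand-in for qn: 'w:p' -> '{namespace-uri}p'."""
    prefix, local = tag.split(":")
    return f"{{{NSMAP[prefix]}}}{local}"


ET.register_namespace("w", W_NS)      # serialize with the familiar w: prefix

p = ET.Element(clark("w:p"))          # paragraph
r = ET.SubElement(p, clark("w:r"))    # run inside the paragraph
t = ET.SubElement(r, clark("w:t"))    # text inside the run
t.text = "Hello"

xml_out = ET.tostring(p, encoding="unicode")
```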

Photography or 3C

Python hashlib Quick Start: Creating a Unique Fingerprint for Your Data; import hashlib; data = "文字".encode('utf-8'); hash_obj = hashlib.sha256(data); result = hash_obj.hexdigest()  # digest: "get the hash digest" or "get the digest value"

3 months ago

What is hashlib? hashlib ...
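
The snippet in the title can be wrapped into a tiny helper. A minimal sketch, where sha256_hex is a name invented for this example:

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Encode the text as UTF-8 and return its SHA-256 digest as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


fingerprint = sha256_hex("文字")
same_again = sha256_hex("文字")     # same input, same digest
different = sha256_hex("文字!")     # one extra character changes the digest
```

The string has to be encoded first because hashlib works on bytes, not on str.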

Photography or 3C

Goodbye Messy XML! Using Python lxml to Get "Perfect Indentation" on Par with VS Code (Shift+Alt+F); from lxml import etree; root = etree.fromstring(xml_bytes)  # equivalent to root = etree.fromstring(xml_str.encode("utf-8")); clean_xml_str = etree.tostring(root, pretty_print=True, encoding='unicode', xml_declaration=False); import xml.etree.ElementTree as ET

3 months ago

When working with Word (.docx) or E...
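
The title also imports xml.etree.ElementTree, so here is a sketch of indentation using only the standard library. Note the swap: this uses ET.indent (available from Python 3.9), not lxml's pretty_print=True, and the sample XML string is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-line XML, the way it often comes out of a .docx part.
xml_str = "<root><item><name>a</name></item><item/></root>"

root = ET.fromstring(xml_str)
ET.indent(root, space="  ")                    # rewrites whitespace in place
pretty = ET.tostring(root, encoding="unicode")
lines = pretty.splitlines()
```

Each nesting level is indented by the string passed as space, so switching to a tab or four spaces is a one-argument change.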

Photography or 3C

Python tempfile Module Complete Guide: Best Practices for Safely Managing Temporary Files; import tempfile; tempfile.gettempdir(); tempfile.template; os.access(temp_dir, os.W_OK); with tempfile.NamedTemporaryFile() as tmp: tmp_path = tmp.name  # named temporary file; with tempfile.TemporaryDirectory() as tmpdir:  # temporary directory; with tempfile.TemporaryFile() as tmp:  # unnamed temporary file

3 months ago

Why tempfile? When working with files...
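
The calls listed in the title fit together roughly like this. A minimal sketch; the file name draft.txt and the variable names are invented for the example.

```python
import os
import tempfile
from pathlib import Path

temp_root = tempfile.gettempdir()            # where temporary files are created
writable = os.access(temp_root, os.W_OK)     # can this process write there?

# Temporary directory: removed with everything inside when the block exits.
with tempfile.TemporaryDirectory() as tmpdir:
    work_file = Path(tmpdir) / "draft.txt"
    work_file.write_text("hello", encoding="utf-8")
    existed_inside = work_file.exists()
existed_after = work_file.exists()

# Unnamed temporary file: handy as scratch space that nothing else needs to open.
with tempfile.TemporaryFile(mode="w+", encoding="utf-8") as tmp:
    tmp.write("scratch")
    tmp.seek(0)
    round_trip = tmp.read()

# Named temporary file: has a path on disk while it is open.
with tempfile.NamedTemporaryFile(suffix=".txt") as named:
    named_path = named.name
```

In all three cases the context manager handles cleanup, which is the main reason to prefer tempfile over hand-built paths.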

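Photography or 3C

Reading DOCX Image Relationships in Python: qn + find/findall vs XPath, a Practical Comparison; from lxml import etree; from docx.oxml.ns import qn; lxml.etree._Element.findall(f".//{qn('a:blip')}"); .get(qn("r:embed"))  # returns the value of the 'r:embed' attribute (e.g. 'rId4'); lxml.etree._Element.xpath("//a:blip/@r:embed", namespaces=NS)  # /@r:embed selects the value of the 'r:embed' attribute (e.g. 'rId4'). With .findall() you first get a List[_Element], then loop and call _Element.get() on each one for the attribute value; with .xpath(), passing "//a:blip/@r:embed" as the path returns the attribute values directly (a List[str] such as ['rId4', 'rId5']); how do you truly remove images from a docx to slim it down?

4 months ago

sample.docx contains only one paragraph of text...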
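
The findall-then-get route from the title can be tried without lxml or python-docx. The sketch below uses the standard library's xml.etree.ElementTree on a made-up fragment; in a real document.xml the a:blip elements sit much deeper in the tree. The one-step "//a:blip/@r:embed" form is not shown running here because ElementTree's limited XPath cannot return attribute values; that form needs lxml's .xpath().

```python
import xml.etree.ElementTree as ET

NS = {
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "r": "http://schemas.openxmlformats.org/officeDocument/2006/relationships",
}

# Hypothetical fragment with two image references.
fragment = (
    f'<root xmlns:a="{NS["a"]}" xmlns:r="{NS["r"]}">'
    '<a:blip r:embed="rId4"/><a:blip r:embed="rId5"/>'
    "</root>"
)
root = ET.fromstring(fragment)

# Step 1: findall returns a list of elements.
blips = root.findall(".//a:blip", NS)

# Step 2: loop and read the namespaced attribute from each element.
embed_attr = f'{{{NS["r"]}}}embed'    # same '{uri}name' form that qn("r:embed") yields
embed_ids = [blip.get(embed_attr) for blip in blips]
```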

Photography or 3C

Python Beginner Tutorial: Reading File Information with Path.stat() and os.stat(); from pathlib import Path; type(p).__name__  # 'WindowsPath'; p.stat().st_size == os.stat(p).st_size == os.path.getsize(p)

4 months ago

The explanation focuses on .stat(); all examples...
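
The three-way equality in the title is easy to check on a throwaway file. A minimal sketch; the file name and its five-byte content are invented for the example.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmpdir:
    p = Path(tmpdir) / "sample.bin"
    p.write_bytes(b"12345")

    # Three ways to ask for the same number: the file size in bytes.
    sizes = (p.stat().st_size, os.stat(p).st_size, os.path.getsize(p))

    # Path() picks a concrete class for the current OS:
    # 'WindowsPath' on Windows, 'PosixPath' elsewhere.
    class_name = type(p).__name__
```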

Photography or 3C

Python × lxml.etree: Practical Notes on Reading, Querying, Modifying, and Writing Word OOXML; from lxml import etree; parser = etree.XMLParser(); root = etree.fromstring(xml_str.encode("utf-8"), parser=parser)  # (root node); print(etree.tostring(root, encoding="unicode", pretty_print=True))  # encoding="unicode" returns str; "utf-8" returns bytes; pretty-printed output

4 months ago

sample.docx content (1021 ...
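
The read, query, modify, write cycle from the title looks roughly like this. To keep the sketch dependency-free it uses the standard library's xml.etree.ElementTree rather than lxml.etree (so there is no XMLParser or pretty_print here), and the one-paragraph document string is made up.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}
ET.register_namespace("w", W_NS)   # keep the w: prefix when writing back out

# Hypothetical stand-in for the contents of word/document.xml.
xml_str = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>old text</w:t></w:r></w:p>"
    "</w:body></w:document>"
)

root = ET.fromstring(xml_str.encode("utf-8"))       # read: bytes -> root node
texts = root.findall(".//w:t", NS)                  # query: every text node
texts[0].text = "new text"                          # modify in place

as_str = ET.tostring(root, encoding="unicode")      # write: str
as_bytes = ET.tostring(root, encoding="utf-8")      # write: bytes
```

The str form is convenient for printing and inspection; the bytes form is what goes back into the zip.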
