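Meta for Developers:
https://developers.facebook.com
My Apps >
Create App >
Other > Next
Consumer > Next
App name: 爬蟲 (crawler) >
Create app
Enter your own Facebook password > Submit
Tools > Graph API Explorer
fields=posts > Submit
(at minimum, the user_posts permission must be enabled)
Get Code > cURL
Copy the URL from the cURL snippet:
"https://graph.facebook.com/v18.0/me?fields=posts&access_token=EAA…"
Change it to:
https://graph.facebook.com/v18.0/me/posts?access_token=EAA…
# This removes one level of nesting (the "posts" key) from the response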
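To illustrate what the rewritten URL saves, here is a minimal offline sketch comparing the two response shapes. The sample dictionaries are made up (the IDs and messages are hypothetical, and real responses carry more fields), and `extract_posts` is a hypothetical helper, not part of the Graph API.

```python
# Hypothetical sample responses for illustration only.
nested = {  # assumed shape for /me?fields=posts
    "posts": {"data": [{"message": "hello", "id": "1_1"}], "paging": {}},
    "id": "1",
}
flat = {  # assumed shape for /me/posts
    "data": [{"message": "hello", "id": "1_1"}],
    "paging": {},
}


def extract_posts(json_data):
    """Return the list of post dicts from either response shape."""
    if "posts" in json_data:
        # Unwrap the extra "posts" level present in the nested shape
        json_data = json_data["posts"]
    return json_data.get("data", [])


# Both shapes yield the same list of posts once the extra key is unwrapped
print(extract_posts(nested) == extract_posts(flat))
```

With the `/me/posts` form the unwrapping step is unnecessary, which is why the code that follows reads `json_data['data']` directly.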
code:
# -*- coding: utf-8 -*-
"""
Created on Sun Jan 14 10:44:01 2024
@author: SavingKing
"""
import requests as req
import json

curl = """
https://graph.facebook.com/v18.0/me/posts?access_token=EAA...
"""
curl = curl.strip()
response = req.get(curl)
# response: <Response [200]>, type requests.models.Response
json_data = json.loads(response.text)
# json_data is a dict
# print(json_data)
lis_json_data = json_data['data']
lis_msg = []
for dic in lis_json_data:
    if 'message' in dic:
        msg = dic.get('message')
        print(msg)
        lis_msg.append(msg)
Output:
json_data:
Drilling into json_data one level at a time:
json_data['paging']['next'] can be assigned to url_next:
url_next = json_data['paging']['next']
Wrap the logic in a function that ends with:
return lis_msg, url_next
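The extraction described above can be tried offline first. This is a small sketch using a made-up page so that it runs without an access token; `parse_page` is a hypothetical name and the URL is a placeholder. It uses `.get` so that a page lacking `paging` or `next` yields `None` instead of raising `KeyError`, which is an assumption about how one might want to handle that case.

```python
def parse_page(json_data):
    """Return (messages, next_url) for one page of results."""
    # Keep only posts that actually carry a 'message' key
    messages = [post["message"] for post in json_data.get("data", [])
                if "message" in post]
    # None when the page has no 'paging' or no 'next' entry
    next_url = json_data.get("paging", {}).get("next")
    return messages, next_url


# Made-up page for illustration; not real API output.
sample_page = {
    "data": [
        {"message": "first post", "id": "1_1"},
        {"story": "shared a link", "id": "1_2"},  # no 'message' key
    ],
    "paging": {"next": "https://example.invalid/page2"},
}

lis_msg, url_next = parse_page(sample_page)
print(lis_msg)
print(url_next)
```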
code:
# -*- coding: utf-8 -*-
"""
Created on Sun Jan 14 10:44:01 2024
@author: SavingKing
"""
import requests as req
import json

curl = """
https://graph.facebook.com/v18.0/me/posts?access_token=EAAap...
"""
curl = curl.strip()

def url2lis_msg(curl):
    response = req.get(curl)
    # response: <Response [200]>, type requests.models.Response
    json_data = json.loads(response.text)
    # json_data is a dict
    # print(json_data)
    url_next = json_data['paging']['next']
    lis_json_data = json_data['data']
    lis_msg = []
    for dic in lis_json_data:
        if 'message' in dic:
            msg = dic.get('message')
            print(msg)
            lis_msg.append(msg)
    return lis_msg, url_next

lis_msg, url_next = url2lis_msg(curl)
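Output:
Recommended: learn Python online with Hahow: https://igrape.net/30afN
The last page returned by the Meta API:
The second-to-last page returned by the Meta API (same format as every other page):
So that json_data itself can serve as the loop's stop condition,
the original function is split into two: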
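The stop condition can be rehearsed offline before running it against the real API. The sketch below replaces the HTTP call with a dictionary of made-up pages; `FAKE_PAGES`, `fetch`, and `collect_messages` are hypothetical names, and the page contents are invented. It assumes, as described above, that the last page has no `paging` key. It also stops when `paging` exists but has no `next`, and caps the number of pages; both are extra safeguards, not something the original notes require.

```python
# Made-up pages standing in for Graph API responses (hypothetical data).
FAKE_PAGES = {
    "page1": {"data": [{"message": "a"}], "paging": {"next": "page2"}},
    "page2": {"data": [{"message": "b"}, {"story": "no text"}],
              "paging": {"next": "page3"}},
    "page3": {"data": []},  # assumed last page: no 'paging' key
}


def fetch(url):
    """Stand-in for an HTTP GET that returns parsed JSON."""
    return FAKE_PAGES[url]


def collect_messages(start_url, fetch_page=fetch, max_pages=100):
    """Follow 'next' links until a page has no 'paging'/'next' entry."""
    messages = []
    url = start_url
    for _ in range(max_pages):  # hard cap guards against an endless loop
        page = fetch_page(url)
        messages.extend(p["message"] for p in page.get("data", [])
                        if "message" in p)
        url = page.get("paging", {}).get("next")
        if url is None:
            break
    return messages


print(collect_messages("page1"))
```

Passing the fetch function as a parameter keeps the loop testable without network access; a real run would pass a function that calls `requests.get` and parses the JSON.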
code:
# -*- coding: utf-8 -*-
"""
Created on Sun Jan 14 10:44:01 2024
@author: SavingKing
"""
import requests as req
import json

# token = "nc..."
curl = """
https://graph.facebook.com/v18.0/me/posts?access_token=EAA...
"""
curl = curl.strip()

def url2json_data(curl):
    response = req.get(curl)
    # response: <Response [200]>, type requests.models.Response
    json_data = json.loads(response.text)
    # json_data is a dict
    # print(json_data)
    return json_data

def json_data2lis(json_data):
    url_next = json_data['paging']['next']
    lis_json_data = json_data['data']
    lis_msg = []
    for dic in lis_json_data:
        if 'message' in dic:
            msg = dic.get('message')
            print(msg)
            lis_msg.append(msg)
    return lis_msg, url_next

lis_msg = []
while True:
    json_data = url2json_data(curl)
    # The last page has no 'paging' key, so that ends the loop
    if 'paging' not in json_data.keys():
        break
    lis_msg_tmp, curl = json_data2lis(json_data)
    lis_msg.extend(lis_msg_tmp)

dic_msg = {"my lis_msg": lis_msg}
# encoding="UTF-8" is needed; without it, open() here defaulted to cp950
with open("my_lis_msg.json", "w", encoding="UTF-8") as f:
    json.dump(dic_msg, f, ensure_ascii=False, indent=4)
When the file was opened without an explicit encoding, the output JSON contained part of the data,
but the write failed partway through:
dic_msg={"my lis_msg":lis_msg}
with open("my_lis_msg.json","w") as f:
    json.dump(dic_msg,f, ensure_ascii=False,indent=4)
Traceback (most recent call last):
  Cell In[123], line 3
    json.dump(dic_msg,f, ensure_ascii=False,indent=4)
  File C:\Python311\Lib\json\__init__.py:180 in dump
    fp.write(chunk)
UnicodeEncodeError: 'cp950' codec can't encode character '\u654e' in position 101: illegal multibyte sequence
Fix: pass encoding="UTF-8" to open(), since the default codec on this machine was cp950:
with open("my_lis_msg.json","w",encoding="UTF-8") as f:
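The fix can be checked in isolation with a short round trip. This sketch writes the character from the traceback (`'\u654e'`) to a file opened with `encoding="utf-8"` and reads it back. The sample list is made up, and a temporary directory is used (an assumption, to avoid overwriting any existing `my_lis_msg.json`).

```python
import json
import os
import tempfile

# '\u654e' is the character reported in the traceback; the list is made up.
dic_msg = {"my lis_msg": ["\u654e", "plain ascii"]}

path = os.path.join(tempfile.mkdtemp(), "my_lis_msg.json")

# An explicit encoding avoids depending on the platform default codec
with open(path, "w", encoding="utf-8") as f:
    json.dump(dic_msg, f, ensure_ascii=False, indent=4)

# Read with the same encoding to confirm nothing was lost
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded == dic_msg)
```

Keeping `ensure_ascii=False` leaves the Chinese text readable in the file; with the default `ensure_ascii=True` the characters would be written as `\uXXXX` escapes instead, which also sidesteps the codec error but is harder to read.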
Output:
my_lis_msg.json