Using BeautifulSoup with Python to get all the data inside a tag

2022-02-26 10:26:33

# -*- coding:utf-8 -*-
# Python 3 (urllib.request and print() are Python 3 only)
#XiaoDeng
#http://tieba.baidu.com/p/2460150866
#Tag manipulation


from bs4 import BeautifulSoup
import urllib.request
import re


# To parse a live page instead of the inline string below, fetch it first:
#url = "http://tieba.baidu.com/p/2460150866"
#req = urllib.request.Request(url)
#webpage = urllib.request.urlopen(req)
#html = webpage.read()



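html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>
"""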
soup = BeautifulSoup(html, 'html.parser') # document object


# soup.a returns only the first <a> tag in the document
#print(soup.a)


# find_all('a') returns every <a> tag
for k in soup.find_all('a'):
    print(k)
    print(k['class'])  # class attribute of the <a> tag (a list, because class is multi-valued)
    print(k['id'])     # id attribute of the <a> tag
    print(k['href'])   # href attribute of the <a> tag
    print(k.string)    # text inside the <a> tag
    # k.get('class') returns the same value, but gives None instead of raising KeyError if the attribute is missing
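
The loop above indexes attributes with k['class'], which raises KeyError when a tag lacks that attribute, and k.string is None whenever a tag has more than one child. A minimal sketch of a more forgiving variant follows, assuming you would rather collect None for missing values. The `sample` markup and the `sample_soup` and `rows` names are made up for illustration and are not part of the original script.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration: the second link has no class
# attribute and contains a nested <b> tag plus trailing text.
sample = """
<p class="story">Names:
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" id="link2"><b>Lacie</b> Smith</a>
</p>
"""

sample_soup = BeautifulSoup(sample, "html.parser")

rows = []
for a in sample_soup.find_all("a"):
    rows.append({
        "class": a.get("class"),   # list of class names, or None if absent
        "id": a.get("id"),
        "href": a.get("href"),
        "string": a.string,        # None when the tag has several children
        "text": a.get_text(" ", strip=True),  # all nested text, joined by spaces
    })

for row in rows:
    print(row)
```

Here get_text() gathers the text of every descendant, so it is usually the safer choice when the goal is all of the data inside a tag, while .string only helps for tags with a single text child.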
    


