BeautifulSoup: stripping <br/> from a <p> to get its text content

2022-02-26 05:02:43


Data

>>> type(ips)
<class 'bs4.element.Tag'>
>>> print(ips)
<p>64.158.31.142:3128 United States Bloomfield, Colorado Level3 Communications<br/>42.104.84.107:8080 India Non-Continental<br/>110.37.216.6:8080 Pakistan Non-Continental<br/>54.70.50.55:3128 United States New Jersey ( Merck)<br/>182.253.121.33:8080 Indonesia Non-Continental<br/></p>
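To reproduce the session, `ips` can be rebuilt from the markup above. This is only a minimal sketch: it assumes the data sits in a plain `<p>` with no attributes, uses a hard-coded, shortened copy of the sample instead of the original page (which is not given here), and picks Python's built-in `html.parser` as the parser.

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample markup (three entries); the wrapping <p>
# without attributes is an assumption.
html = (
    "<p>64.158.31.142:3128 United States Bloomfield, Colorado Level3 Communications<br/>"
    "42.104.84.107:8080 India Non-Continental<br/>"
    "110.37.216.6:8080 Pakistan Non-Continental<br/></p>"
)

soup = BeautifulSoup(html, "html.parser")
ips = soup.find("p")  # a bs4.element.Tag, like `ips` in the session

br_count = len(ips.find_all("br"))  # number of <br/> separators in the <p>
print(type(ips))
print(br_count)
```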

Code

>>> type(ips.find_all(text=True))
<class 'bs4.element.ResultSet'>
>>> res = ips.find_all(text=True)  # text nodes only; the <br/> tags are left out
>>> for s in res:
    print(s)

117.4.136.145:8080 Vietnam Non-continental
188.166.83.6:1080 Russia Non-Continental
138.197.157.44:1080 United States Non-Continental
83.56.123.0:3128 Spain Non-Continental
183.89.210.22:8080 Thailand Non-Continental
111.62.243.64:80 China Mobile

Or

>>> for node in ips.descendants:
    if not isinstance(node, type(ips)):  # skip Tag nodes such as <br/>
        print(node)

117.4.136.145:8080 Vietnam Non-continental
188.166.83.6:1080 Russia Non-Continental
138.197.157.44:1080 United States Non-Continental
83.56.123.0:3128 Spain Non-Continental
183.89.210.22:8080 Thailand Non-Continental
111.62.243.64:80 China Mobile
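Both loops above walk the text nodes by hand. Beautiful Soup also offers `stripped_strings` and `get_text()` for the same job. The sketch below is one possible shorter variant, not the method used in the session. It again assumes a hard-coded, shortened copy of the sample markup and the `html.parser` parser.

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample markup; the plain <p> wrapper is an assumption.
html = (
    "<p>64.158.31.142:3128 United States Bloomfield, Colorado Level3 Communications<br/>"
    "42.104.84.107:8080 India Non-Continental<br/>"
    "110.37.216.6:8080 Pakistan Non-Continental<br/></p>"
)
ips = BeautifulSoup(html, "html.parser").p

# stripped_strings yields every text node with surrounding whitespace removed
# and skips nodes that contain only whitespace.
lines = list(ips.stripped_strings)

# get_text joins the text nodes with the given separator, so each proxy
# entry ends up on its own line where a <br/> used to be.
text = ips.get_text(separator="\n", strip=True)

for line in lines:
    print(line)
```

`lines` is convenient when each entry is processed separately, while `text` is enough when the result is only printed or written to a file.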
