In Python, the simplest way to get a page's description with BeautifulSoup is to look up its description meta tags:
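from bs4 import BeautifulSoup
# htmlStr is assumed to hold the page's HTML source as a string;
# the "html5lib" parser requires the html5lib package to be installed
soup = BeautifulSoup(htmlStr, features="html5lib")
# Find description in meta <meta name="description"...
description = soup.find("meta", attrs={"name": "description"})
if description:  # find() returns None when the tag is missing
    print(description.get("content"))
# Find description in meta <meta property="og:description" ...
description = soup.find("meta", property="og:description")
if description:
    print(description.get("content"))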
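The two lookups above can be combined into one helper that tries the standard description first and falls back to the Open Graph one. This is only a minimal sketch: the function name `extract_description` and the sample HTML are made up for illustration, and it uses the built-in `html.parser` so that no extra parser package is needed.

```python
from bs4 import BeautifulSoup


def extract_description(html):
    """Return the page description, or None if no usable meta tag exists."""
    soup = BeautifulSoup(html, "html.parser")
    # Try the standard description first, then the Open Graph variant.
    candidates = (
        {"name": "description"},
        {"property": "og:description"},
    )
    for attrs in candidates:
        tag = soup.find("meta", attrs=attrs)
        # find() returns None when no tag matches; get() returns None
        # when the tag has no content attribute.
        content = tag.get("content") if tag else None
        if content and content.strip():
            return content.strip()
    return None


# Hypothetical sample document with only an Open Graph description
sample = '<html><head><meta property="og:description" content=" A demo page. "></head></html>'
print(extract_description(sample))  # prints: A demo page.
```

Returning None instead of raising lets the caller decide what to do when a page has no description at all.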
To get the description of an HTML page using BeautifulSoup and Python, you typically look for the <meta> tag with the name attribute set to "description". Here’s a step-by-step guide on how to do this:
If you haven't already, you need to install the BeautifulSoup and Requests libraries. You can do this using pip:
pip install beautifulsoup4 requests
Here’s a sample code snippet that demonstrates how to fetch an HTML page and extract the description:
import requests
from bs4 import BeautifulSoup

# URL of the page you want to scrape
url = 'https://example.com'

# Send a GET request to the URL
response = requests.get(url)

# Check if the request was successful
if response.status_code == 200:
    # Parse the content of the page
    soup = BeautifulSoup(response.content, 'html.parser')

    # Find the meta description tag
    description_tag = soup.find('meta', attrs={'name': 'description'})

    # Extract the content attribute if the tag is found
    if description_tag and 'content' in description_tag.attrs:
        description = description_tag['content']
        print("Description:", description)
    else:
        print("No description found.")
else:
    print("Failed to retrieve the page. Status code:", response.status_code)
The code uses the requests library to fetch the content of the HTML page, then looks for the <meta> tag with name="description" and extracts the text from its content attribute.
Make sure to replace 'https://example.com' with the actual URL of the page you want to scrape. Also, be mindful of the website's robots.txt and scraping policies.
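One way to respect robots.txt is to check a URL against the site's rules before requesting it. The sketch below uses `urllib.robotparser` from the standard library; the helper name `is_allowed` and the sample rules are assumptions made for illustration, and the rules are passed in as text so the example runs without network access.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits fetching url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no HTTP request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules, standing in for a real site's robots.txt
rules = """User-agent: *
Disallow: /private/
"""

print(is_allowed(rules, "https://example.com/index.html"))         # True
print(is_allowed(rules, "https://example.com/private/page.html"))  # False
```

For a live site, the parser can load the file itself with `set_url()` followed by `read()`. Note that robots.txt covers crawler access only; a site's terms of use may impose further limits.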