Beautiful Soup: How to find an element by class

Beautiful Soup, or bs4, is a versatile Python package that makes it easy to parse useful information out of web pages and other HTML and XML files. To harvest publicly available data effectively, you'll need to learn how to use Python libraries such as requests and Beautiful Soup properly.

In this article, we will provide hands-on examples and explanations on how to find an element by class in Beautiful Soup.

Scenario: You're trying to scrape quotes from https://quotes.toscrape.com/. To do that, you need to locate every div tag that has quote in its class. Below are three different ways to do that in Beautiful Soup.

Find element by class using Beautiful Soup native method

Beautiful Soup has a built-in method that allows you to search for HTML tags by class name. As of Beautiful Soup 4.1.2, you can search by a tag's class using the keyword argument class_, like so:

from bs4 import BeautifulSoup
import requests

# Fetch the page and create a Beautiful Soup object
page = requests.get("https://quotes.toscrape.com/")
soup = BeautifulSoup(page.text, "lxml")

# Locate every div tag that has the class name "quote"
soup.find_all("div", class_="quote")
Code language: Python (python)

Because class is a reserved keyword in Python, you have to use the class_ argument instead.
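To try the class_ argument without hitting the network, here is a minimal sketch that runs against a small inline HTML string. The snippet, its class names ("featured", "author") and the variable names are made up for illustration, and it uses Python's built-in "html.parser" instead of "lxml" so that no extra parser is needed. It also shows the equivalent attrs dictionary form, which avoids the trailing underscore.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the real page
html = """
<div class="quote">First</div>
<div class="quote featured">Second</div>
<div class="author">Someone</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Keyword form: class_ sidesteps Python's reserved word "class"
by_keyword = soup.find_all("div", class_="quote")

# Dictionary form: the key is a plain string, so no underscore is needed
by_attrs = soup.find_all("div", attrs={"class": "quote"})

keyword_texts = [tag.get_text() for tag in by_keyword]
attrs_texts = [tag.get_text() for tag in by_attrs]

print(keyword_texts)  # ['First', 'Second']
print(attrs_texts)    # ['First', 'Second']
```

Both calls return the same two div tags, including the one that carries a second class.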

Find element by class using CSS Selector

Alternatively, you can search for HTML tags by class name using a CSS selector with Beautiful Soup's select() method.

from bs4 import BeautifulSoup
import requests

# Fetch the page and create a Beautiful Soup object
page = requests.get("https://quotes.toscrape.com/")
soup = BeautifulSoup(page.text, "lxml")

# Locate every div tag that has "quote" in its class
soup.select("div.quote")
Code language: Python (python)

The selector div.quote matches any div whose class list includes "quote", even if the tag has other CSS classes as well. For example, a div tag with the class name "quote toscrape" would be included in the results, too. The class_ argument of find_all() behaves the same way.
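The multi-class behaviour is easy to check on a small inline document. The sketch below is only an illustration: the HTML and variable names are invented, and it uses "html.parser" rather than "lxml". It also shows two related options, chaining classes in the selector to require all of them, and select_one() to get only the first match.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the real page
html = """
<div class="quote">A</div>
<div class="quote toscrape">B</div>
<p class="quote">C</p>
"""
soup = BeautifulSoup(html, "html.parser")

# Any div that has "quote" among its classes
any_quote = [tag.get_text() for tag in soup.select("div.quote")]

# Only divs that have both "quote" and "toscrape"
both_classes = [tag.get_text() for tag in soup.select("div.quote.toscrape")]

# First matching div only (None if nothing matches)
first_quote = soup.select_one("div.quote")

print(any_quote)               # ['A', 'B']
print(both_classes)            # ['B']
print(first_quote.get_text())  # A
```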

Find element by class using a function

Beautiful Soup allows you to use a function as a filter to search the HTML tree for the elements you want. We can leverage this to write a lambda function that returns True for every tag that is named "div", has a "class" attribute, and has "quote" among its classes.

from bs4 import BeautifulSoup
import requests

# Fetch the page and create a Beautiful Soup object
page = requests.get("https://quotes.toscrape.com/")
soup = BeautifulSoup(page.text, "lxml")

# Locate every div tag that has "quote" in its class.
# tag.has_attr() checks HTML attributes; Python's built-in hasattr() does not,
# so it cannot guard against a KeyError on divs without a class attribute.
soup.find_all(lambda tag: tag.name == "div" and tag.has_attr("class") and "quote" in tag["class"])
Code language: Python (python)

Searching with a function is particularly useful when the HTML tree contains many different elements or when you have complex requirements. Please note that Beautiful Soup calls the function once for every tag it visits, so this approach can be noticeably slower than the other two and should only be used when there's no better option.
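When the condition grows beyond a one-liner, a named function can be easier to read than a lambda. The following is a minimal sketch on an invented inline document, again using "html.parser". The helper name is_quote_div is hypothetical, and tag.get("class", []) is used so that tags without a class attribute are skipped instead of raising an error.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the real page
html = """
<div>no class here</div>
<div class="quote">A</div>
<div class="quote toscrape">B</div>
<span class="quote">C</span>
"""
soup = BeautifulSoup(html, "html.parser")


def is_quote_div(tag):
    """Return True for div tags that have "quote" among their classes."""
    # get() falls back to an empty list when the tag has no class attribute
    return tag.name == "div" and "quote" in tag.get("class", [])


matches = [tag.get_text() for tag in soup.find_all(is_quote_div)]
print(matches)  # ['A', 'B']
```

The div without a class and the span are both filtered out, and no exception is raised for the class-less tag.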

We hope that the information above is useful to you. You may also be interested in our guides on fixing the “pip: command not found” error, “[Errno 32] Broken pipe” in Python, and “Shadows name from outer scope” in PyCharm.
