'
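Select Elements within Elements:

.. code-block:: pycon

    >>> about.find('a')
    [<Element 'a' ...>, <Element 'a' ...>, <Element 'a' ...>, <Element 'a' ...>, <Element 'a' ...>, <Element 'a' ...>]
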
Search for links within an element:

.. code-block:: pycon

    >>> about.absolute_links
    {'http://brochure.getpython.info/', 'https://www.python.org/about/gettingstarted/', 'https://www.python.org/about/', 'https://www.python.org/about/quotes/', 'https://www.python.org/about/help/', 'https://www.python.org/about/apps/'}

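
As a rough illustration of what "absolute" means here, the sketch below resolves ``href`` values against a base URL with the standard library's ``urllib.parse.urljoin``. The ``to_absolute`` helper and the sample hrefs are hypothetical, and this is not necessarily how the library builds ``absolute_links`` internally.

```python
from urllib.parse import urljoin


def to_absolute(base_url, hrefs):
    """Resolve each href against base_url and return the results as a set."""
    return {urljoin(base_url, href) for href in hrefs}


# Hypothetical sample: two relative hrefs and one that is already absolute
hrefs = ['/about/', '/about/apps/', 'http://brochure.getpython.info/']
links = to_absolute('https://www.python.org/', hrefs)
print(sorted(links))
```

Using a set mirrors the output shown above: duplicate links collapse into one entry, and the order is not significant.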
Search for text on the page:

.. code-block:: pycon

    >>> r.html.search('Python is a {} language')[0]
    'programming'

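
To make the ``{}`` placeholder concrete, here is a minimal stand-in written with the standard library's ``re`` module. The ``search_template`` helper and the sample text are hypothetical; the sketch only mimics the behaviour shown above and is not the library's implementation.

```python
import re


def search_template(template, text):
    """Return the text captured by each '{}' in template, or None if no match."""
    # Escape the literal parts and turn every '{}' into a non-greedy capture group
    parts = [re.escape(part) for part in template.split('{}')]
    pattern = '(.+?)'.join(parts)
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.groups() if match else None


# Hypothetical sample text standing in for the page content
text = 'Python is a programming language that lets you work quickly.'
result = search_template('Python is a {} language', text)
print(result[0])
```

Each ``{}`` becomes one captured group, which is why the example above indexes the result with ``[0]`` to get the first captured value.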
More complex CSS Selector example (copied from Chrome dev tools):

.. code-block:: pycon

    >>> r = session.get('https://github.com/')
    >>> sel = 'body > div.application-main > div.jumbotron.jumbotron-codelines > div > div > div.col-md-7.text-center.text-md-left > p'
    >>> print(r.html.find(sel, first=True).text)
    GitHub is a development platform inspired by the way you work. From open source to business, you can host and review code, manage projects, and build software alongside millions of other developers.

XPath is also supported:

.. code-block:: pycon

    >>> r.html.xpath('/html/body/div[1]/a')
    []

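
For readers new to path expressions such as ``/html/body/div[1]/a``, the sketch below evaluates a similar path with the standard library's ``xml.etree.ElementTree``, which supports only a limited XPath subset. The sample markup is hypothetical, and this is not the engine the library itself uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup
doc = ET.fromstring(
    '<html><body>'
    '<div><a href="/about/">About</a></div>'
    '<div><a href="/help/">Help</a></div>'
    '</body></html>'
)

# ElementTree paths are relative to the root element (<html>), so
# './body/div[1]/a' plays the role of '/html/body/div[1]/a'.
# Positions are 1-based: div[1] is the first <div> child of <body>.
matches = doc.findall('./body/div[1]/a')
print([a.get('href') for a in matches])
```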
JavaScript Support
==================
Let's grab some text that's rendered by JavaScript. Until 2020, the Python 2.7 countdown clock (https://pythonclock.org) will serve as a good test page:

.. code-block:: pycon

    >>> r = session.get('https://pythonclock.org')

Let's try to see the dynamically rendered content (the countdown clock). As a quick first attempt, we'll search between the last text we see before it ('Python 2.7 will retire in...') and the first text we see after it ('Enable Guido Mode').
.. code-block:: pycon
>>> r.html.search('Python 2.7 will retire in...{}Enable Guido Mode')[0]
'\n \n \n