
Is It Possible To Scrape Webpage Without Using Third-party Libraries In Python?

I am trying to understand how Beautiful Soup works in Python. I have used Beautiful Soup and lxml in the past, but now I am trying to implement a script that can read data from a given webpage wi

Solution 1:

Third party libraries exist to make your life easier. Yes, of course you could write a program without them (the authors of the libraries had to). However, why reinvent the wheel?

Your best options are Beautiful Soup and Scrapy. However, if you're having trouble with Beautiful Soup, I wouldn't try Scrapy.

Perhaps you can get by with just the plain text from the website?

from bs4 import BeautifulSoup

# html_doc is assumed to be a string holding the page's HTML
soup = BeautifulSoup(html_doc, 'html.parser')
pagetxt = soup.get_text()

Then you can be done with external libraries and just work with plain text. However, if you need to do anything more complicated, HTML is something you really should use a library to manipulate. There is just too much that can go wrong.
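If you do want to see what a library-free version looks like, here is a minimal sketch that uses only the standard library (html.parser and urllib.request). The names TextExtractor, html_to_text, and fetch_html are made up for illustration, and the sample HTML is invented. It only pulls out text and skips script and style contents. It does not try to handle everything Beautiful Soup does.

```python
import urllib.request
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes, skipping <script>/<style>."""

    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []      # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self.chunks.append(text)


def html_to_text(html_doc):
    """Return the text fragments of html_doc joined by newlines."""
    parser = TextExtractor()
    parser.feed(html_doc)
    parser.close()
    return "\n".join(parser.chunks)


def fetch_html(url):
    """Download a page and decode it (falls back to UTF-8 if no charset is given)."""
    with urllib.request.urlopen(url) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


# Invented sample document, so the sketch runs without network access.
sample = (
    "<html><head><title>Demo</title><style>p {color: red}</style></head>"
    "<body><h1>Hello</h1><p>Plain <b>text</b> only.</p></body></html>"
)
print(html_to_text(sample))
```

For a real page you would call html_to_text(fetch_html(url)) with a URL of your own. Note that each text fragment lands on its own line, so inline tags such as <b> split a sentence. That is one of the many details a library like Beautiful Soup already handles for you.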
