Using grep to scrape web pages

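In preparation for scraping a number of web pages, I used grep to build a list of the URLs I needed to scrape. The URLs were in an RSS file, each one wrapped in a CDATA section inside a <link> element. grep -oP '<link><!\[CDATA\[\K.*?(?=\]\]>)' hawkeye_stories.xml > hawkeye_stories_links.txt (The quotes must be plain ASCII quotes, and both square brackets need escaping so they are not read as a character class. -o prints only the matched text, and \K drops the <link><![CDATA[ prefix, so the output file contains just the URLs.)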
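To try the pattern before pointing it at the real feed, here is a minimal sketch that runs it against a tiny sample file. The file names sample_feed.xml and sample_links.txt and the example.com URLs are made up for illustration, and it assumes GNU grep built with PCRE support (the -P option).

```shell
# Build a tiny sample feed (hypothetical contents) to test the pattern on.
cat > sample_feed.xml <<'EOF'
<rss><channel>
<item><title>One</title><link><![CDATA[http://example.com/story/1]]></link></item>
<item><title>Two</title><link><![CDATA[http://example.com/story/2]]></link></item>
</channel></rss>
EOF

# -o prints only the matched text.
# \K discards the "<link><![CDATA[" prefix from the reported match.
# The lookahead (?=\]\]>) stops the match before the closing "]]>".
grep -oP '<link><!\[CDATA\[\K.*?(?=\]\]>)' sample_feed.xml > sample_links.txt

# Show the extracted URLs, one per line.
cat sample_links.txt
```

With one URL per line, the resulting file can be handed to a downloader that reads a URL list, for example wget with its -i option.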
