<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Matt Thiessen &#187; scrape</title>
	<atom:link href="http://matt.thiessen.us/tag/scrape/feed/" rel="self" type="application/rss+xml" />
	<link>http://matt.thiessen.us</link>
	<description>Learning to make the hard things easy</description>
	<lastBuildDate>Sun, 05 Feb 2012 04:06:14 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Using grep to scrape web pages</title>
		<link>http://matt.thiessen.us/2009/09/using-grep-to-scrape-web-pages/</link>
		<comments>http://matt.thiessen.us/2009/09/using-grep-to-scrape-web-pages/#comments</comments>
		<pubDate>Fri, 18 Sep 2009 15:38:20 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[grep]]></category>
		<category><![CDATA[scrape]]></category>

		<guid isPermaLink="false">http://matt.thiessen.us/?p=294</guid>
		<description><![CDATA[In preparation for scraping a number of web pages, I used grep to build a list of the URLs I needed.  The URLs were in an RSS file. grep -P "\&#60;link&#62;&#60;\![CDATA\[(.*?)]" hawkeye_stories.xml &#62; hawkeye_stories_links.txt]]></description>
			<content:encoded><![CDATA[<p>In preparation for scraping a number of web pages, I used grep to build a list of the URLs I needed.  The URLs were in an RSS file.</p>
<p>grep -P "\&lt;link&gt;&lt;\![CDATA\[(.*?)]" hawkeye_stories.xml &gt; hawkeye_stories_links.txt</p>
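<p>The command above writes every matching line in full, markup included. If only the bare URLs are wanted, a variation along these lines might work. It is a minimal sketch, assuming GNU grep built with PCRE support; the function name extract_links is hypothetical, and the pattern assumes each URL sits on a single line and contains no closing square bracket.</p>

```shell
# extract_links FILE: print only the URL found inside each <link> CDATA block.
# -o prints just the matched text; \K drops everything matched before it,
# so the <link> and CDATA prefix are required but not printed.
extract_links() {
    grep -oP '<link><!\[CDATA\[\K[^\]]+' "$1"
}
```

<p>Usage would then look like: extract_links hawkeye_stories.xml &gt; hawkeye_stories_links.txt</p>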
]]></content:encoded>
			<wfw:commentRss>http://matt.thiessen.us/2009/09/using-grep-to-scrape-web-pages/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

