So why use Yahoo Pipes for your app? Well, there are several reasons. Yahoo Pipes lets you parse websites and RSS feeds, combine them, etc. You can take a
website that has no RSS feeds and make custom ones for it, which is what we will be doing today. You can also modify RSS feeds to show up how you want.
Yahoo Pipes are also nice as they are constantly updated. You can make a manual RSS feed, but it kinda defeats the purpose of RSS altogether.
Before we get started, I should state that I am by no means an expert in Yahoo Pipes and am still learning myself. I am sure there are a lot of easier ways to do things, but these do work, even if they take a little hacking together.
So let's get started. Browse to pipes.yahoo.com. Register for an account if you don't already have one. Click the big blue 'Create a Pipe' button. You'll be greeted with a blank grid page.
Lovely. So what we're going to start with is a site I recently made some feeds for. On the left side under 'Sources' grab the 'Fetch Page' Module and drag it to your grid.
You'll notice that another module showed up on our grid at the same time called "Pipe Output." This is basically the final destination of our pipe, and we won't mess with it much till we are almost done.
So next we will be putting in the URL of the page we want to parse. In this case it will be:
We'll go ahead and drop that in the Fetch Page module under URL.
At the bottom of the screen in the gray box is the 'debugger.' This lets you see your feed as you build it, so you can make adjustments along the way. Drop the URL in and click refresh.
You should get a page in the debugger that looks like below. If you click the arrow next to the '0' and then next to the content, you can see the page we're grabbing. It doesn't look exactly the same, because of the way the debugger is displaying it.
What we want to do at this point is two things:
1. Narrow down the page we are grabbing so it's only the content we want.
2. Tell pipes how to split up the items in the content.
Now this part will be a lot easier if you have some HTML experience. What we're going to do is look at the HTML inside the debugger and try to find the first media clip.
We know our first show on that page is 'With The Weight of The World' (this might have changed depending on when you're doing this tutorial). So let's start by searching for that (ctrl+f).
OK. So what we're looking for is something that hasn't shown up above this point but makes a good starting point to begin grabbing info. I decided to use '<div class="title_date">'. I use this because it sits between each piece of media, so I can use it to split the media as well. This makes it harder to get junk results which you have to filter out later.
Alright, so let's drop this into our Fetch Page module under 'Cut content from' and under 'Split using delimiter.'
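If it helps to picture what those two boxes are doing, here is a rough Python sketch of the same idea. This is not how Yahoo Pipes is actually implemented, just my approximation: the function name `cut_and_split` and the sample HTML are made up for illustration, and I'm assuming the marker itself gets dropped from the results.

```python
def cut_and_split(html, cut_from, delimiter):
    """Roughly mimic 'Cut content from' followed by 'Split using delimiter'."""
    start = html.find(cut_from)
    if start == -1:
        return []  # marker not found, so there is nothing to cut
    # Keep everything after the start marker
    # (assumption: the marker itself is dropped).
    body = html[start + len(cut_from):]
    # Split what is left on the delimiter and drop empty pieces.
    return [piece.strip() for piece in body.split(delimiter) if piece.strip()]


# Made-up sample page: some navigation junk, then two media items.
sample_html = (
    '<html><body><div id="nav">Home</div>'
    '<div class="title_date">Show one</div><p>first clip</p>'
    '<div class="title_date">Show two</div><p>second clip</p>'
    '</body></html>'
)

marker = '<div class="title_date">'
items = cut_and_split(sample_html, marker, marker)
for index, item in enumerate(items):
    print(index, item)
```

Using the same string for both the cut point and the delimiter is what gets rid of the navigation junk above the first item and gives one chunk per piece of media.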
Let's check our results in the debugger.
As you can see, we have several numbers [0-7]. These are each of our media pieces. Open up each one, open the content, and let's see what we got. If it looks like above, you're in great shape. We are just about done with the first module; we just have one more thing. We need to fill in the 'to' part of 'Cut content from' (basically the last box) to keep our RSS from having some junk URLs at the bottom. I decided to use 'pager'. I got this from looking at the HTML on the site and finding a word that was below the media. This might not be the best word to use (for instance, if they have a video about a pager, the RSS might cut off early), but for now we'll run with it.
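To show why a short end marker like 'pager' can bite you, here is a small Python sketch of the cut. Again, this is only my approximation of what the module does: `cut_between` and both sample pages are made up, and the longer marker '<div class="pager">' is a hypothetical example, not something taken from the real site.

```python
def cut_between(html, cut_from, cut_to):
    """Roughly mimic 'Cut content from ... to ...': keep the text between two markers."""
    start = html.find(cut_from)
    if start == -1:
        return ""  # start marker not found
    start += len(cut_from)
    end = html.find(cut_to, start)
    if end == -1:
        return html[start:]  # no end marker, keep the rest of the page
    return html[start:end]


marker = '<div class="title_date">'

# Made-up page where 'pager' only shows up in the paging links at the bottom.
page = (
    '<div class="title_date">Show one</div><p>clip</p>'
    '<div class="pager"><a href="/page/2">next</a></div>'
)
# Made-up page where a show title happens to contain the word 'pager'.
risky_page = (
    '<div class="title_date">My old pager</div><p>clip</p>'
    '<div class="pager"><a href="/page/2">next</a></div>'
)

clean = cut_between(page, marker, 'pager')            # junk link is cut off
truncated = cut_between(risky_page, marker, 'pager')  # stops inside the title
safer = cut_between(risky_page, marker, '<div class="pager">')

print(repr(clean))
print(repr(truncated))
print(repr(safer))
```

The more specific the end marker, the less likely it is to match something inside your actual content, so if the feed ever cuts off early this is the first thing I'd check.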