Ok, so next we will add a "Rename" module to our pipe. This lets us take the content we have, which has all sorts of information in it (link, description, title), and copy it into separate fields that we can then filter down to just the info each one needs.
Click on Operators on the left side and drag the Rename module to our grid. Now let's connect our two modules. Simply click the ball on the bottom of our Fetch Page module and drag it to the top of the Rename module.
Now let's click the '+' next to Mappings on the Rename module 3 times, so you should have a total of 4 rows of boxes. In each of the left-side boxes, select 'item.content' from the list. Then change the first 3 rows from 'rename' to 'copy as'. We will just rename the last one.
Finally, in the four boxes on the right, we will put the separate elements we are going to need for our RSS feed: title, link, description, media:thumbnail.
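If it helps to see what those four mappings do outside of the Pipes editor, here is a rough Python sketch. It is not how Pipes works internally, just an illustration: the `apply_rename` helper and the sample item are made up for this example, and each item is assumed to be a simple dictionary with a `content` field.

```python
def apply_rename(item, mappings):
    """Apply Rename-style mappings to one item (a plain dict).

    Each mapping is (source_field, operation, target_field). "copy" keeps
    the source field around; "rename" removes it after copying.
    """
    result = dict(item)
    for source, operation, target in mappings:
        if source not in result:
            continue  # nothing left to copy or rename
        result[target] = result[source]
        if operation == "rename":
            del result[source]
    return result


# The four rows from the Rename module: three "copy as", one "rename".
MAPPINGS = [
    ("content", "copy", "title"),
    ("content", "copy", "link"),
    ("content", "copy", "description"),
    ("content", "rename", "media:thumbnail"),
]

# Hypothetical item; the real content is the HTML ripped by Fetch Page.
item = {"content": "<div>...raw html for one episode...</div>"}
renamed = apply_rename(item, MAPPINGS)
print(sorted(renamed))
```

Every field ends up holding the same raw HTML, which is why the Regex step that follows has to trim each one separately.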

Now it's time for the tricky part. Still under Operators, drag the Regex module to your grid and connect your Rename module to it.
If you click on the Regex module, it should refresh and we will be able to see the results from our Rename module in the debugger.

Notice that it has replaced the numbers with all the HTML we've ripped. Well, that simply won't do, so let's start narrowing down our title to just the title.
On the Regex module, let's select item.title from the first drop-down. Now let's search for the title in the HTML. I find several instances of it, but the easiest one to grab looks like the one that comes right after 'title="'. It's the easiest because, remember, you're trying to find something 'unique' for that area of the page we've ripped. In our replace box for title we're going to put in .*title="
Now let's break this down:
.* - means everything before
title=" - is the marker we found; we want to keep what comes after it.
OK, the 'with' box is just going to stay empty, because we don't actually want to replace the text we've selected with anything; we just want to remove it. Lastly, check the 4 checkboxes at the end of the rule. I'm not sure exactly what they do yet, but I know you can't replace stuff with blank space unless you have them checked. Let's refresh our debugger and take a look.
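Here is a small Python sketch of that first rule, in case you want to try the pattern outside of Pipes. The HTML snippet is made up, and I'm assuming the checkboxes behave like the usual regex flags (replace all matches, let the dot match newlines, multiline, ignore case); the tutorial doesn't confirm that, so treat it as a guess.

```python
import re

# Hypothetical markup standing in for one ripped item; the real page differs.
html = (
    '<div class="episode">\n'
    '<a href="/full-episode/1" title="Example Episode">\n'
    'watch</a></div>'
)

# Assumed equivalents of the four checkboxes. re.sub already replaces
# every match by default, so only three flags are needed here.
FLAGS = re.DOTALL | re.MULTILINE | re.IGNORECASE

# Replace everything up to and including title=" with nothing.
with_flags = re.sub(r'.*title="', "", html, flags=FLAGS)

# Same rule without the flags: the dot stops at each line break.
without_flags = re.sub(r'.*title="', "", html)

print(repr(with_flags))
print(repr(without_flags))
```

With the flags, the result starts at the title. Without them, `.*` cannot cross a line break, so the first line of the HTML is left in place. That may be why the rule only seems to work with the boxes checked.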

Excellent. We now have the title at the beginning of each of our pipe titles. Now let's clip off the rest of the page. Let's click the + next to Rules in Regex and select item.title again. For replace we will put ">.*
"> - again, what we found on our page
.* - used at the end, this means everything after the selection.
Again, we will replace it with blank space. Don't forget to check the 4 checkboxes, then let's refresh.
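To see the two title rules working together, here is a hedged Python sketch. The `clip_title` helper and the sample HTML are made up for illustration. I've also included a single-pattern version with a capture group, which is my own alternative and not something this tutorial sets up in Pipes.

```python
import re

FLAGS = re.DOTALL | re.IGNORECASE  # assumed stand-ins for the checkboxes


def clip_title(raw_html):
    """Apply the two title rules in order: strip the front, then the back."""
    after_prefix = re.sub(r'.*title="', "", raw_html, flags=FLAGS)
    return re.sub(r'">.*', "", after_prefix, flags=FLAGS)


def clip_title_one_pass(raw_html):
    """Alternative (not from the tutorial): keep only the captured group."""
    return re.sub(r'.*title="(.*?)">.*', r"\1", raw_html, flags=FLAGS)


# Hypothetical markup for one item.
sample = (
    '<div class="episode">\n'
    '<a href="/full-episode/1" title="Example Episode">\n'
    'watch</a></div>'
)

print(clip_title(sample))
print(clip_title_one_pass(sample))
```

The two-step version is easier to debug, since you can refresh after each rule and see what is left. The one-pass version is shorter, but it leaves the text untouched if the pattern doesn't match.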

Looking good! We now have a title for each of our feeds! So now it's on to shrinking the URL, the media, and the description down. This can take some practice and some trial and error. I'm going to just show you what I did to get the feeds correct.
| In | Replace | With |
| --- | --- | --- |
| item.description | .*<div class="description"> | |
| item.description | Play Full Episode.* | |
| item.link | .*href="http://www.spike.com/full-episode/ | http://www.spike.com/full-episode/ |
| item.link | " class=".* | |
| item.media:thumbnail | \?width=220".* | |
| item.media:thumbnail | .*img src=" | |
Only a few things to note here: the \? lets regex know that it's looking for an actual question mark and not using it as a special character. You would do the same thing if there was a * in your HTML by using \*
On the first item.link I actually put something in the 'with' column. That's because 'href="' was very common on the page and I wanted to grab only the link that went to the video, so I put some of the URL into the replace box and then added it back in the 'with' box.
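If you'd rather experiment with the rules before typing them into Pipes, this Python sketch runs all of them (the two title rules plus the six above) against a made-up snippet. The HTML, the `apply_rules` helper, and the example.com thumbnail URL are all assumptions for illustration; the real spike.com markup will differ, and the flags are again my guess at what the checkboxes do.

```python
import re

FLAGS = re.DOTALL | re.IGNORECASE  # assumed stand-ins for the checkboxes

# (field, replace, with), in the same order as the rules in the Regex module.
RULES = [
    ("title", r'.*title="', ""),
    ("title", r'">.*', ""),
    ("description", r'.*<div class="description">', ""),
    ("description", r'Play Full Episode.*', ""),
    ("link", r'.*href="http://www.spike.com/full-episode/',
     "http://www.spike.com/full-episode/"),
    ("link", r'" class=".*', ""),
    ("media:thumbnail", r'\?width=220".*', ""),
    ("media:thumbnail", r'.*img src="', ""),
]


def apply_rules(item, rules):
    """Run each rule against its field, one after another."""
    result = dict(item)
    for field, pattern, replacement in rules:
        result[field] = re.sub(pattern, replacement, result[field], flags=FLAGS)
    return result


# Hypothetical markup for one episode.
sample = (
    '<li>\n'
    '<a href="http://www.spike.com/full-episode/example-show/12345" '
    'class="thumb" title="Example Episode">\n'
    '<img src="http://example.com/images/thumb.jpg?width=220" alt="">\n'
    '</a>\n'
    '<a href="http://www.spike.com/show/example-show" class="more">Show page</a>\n'
    '<div class="description">A short summary. Play Full Episode</div>\n'
    '</li>'
)

# After the Rename step every field holds the same raw HTML.
item = {name: sample for name in ("title", "link", "description", "media:thumbnail")}
cleaned = apply_rules(item, RULES)
for name, value in cleaned.items():
    print(name, "->", repr(value))
```

Notice that the link rule skips the second anchor, because only the first href contains the full-episode part of the URL. That's the point of putting some of the URL into the replace box.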
If you have any questions or if you find a better way, hit me up in the Boxee forums.
Time to show you how to finish up the feed.

The next module is also under Operators and is called Create RSS. This is a pretty straightforward module: it just puts the content you ripped into the right sections. Connect your Regex module to it and let's get going.
Under title select item.title
Under description select item.description
Under link select item.link
Under media:thumbnail -> url select item.media:thumbnail
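For the curious, here is roughly what that mapping produces, sketched in Python with the standard library. This is not the actual output of Create RSS, just a minimal illustration. The `build_rss` helper and the sample item are made up, and a complete RSS channel would also need its own link and description.

```python
import xml.etree.ElementTree as ET

# Namespace used by Media RSS elements such as media:thumbnail.
MEDIA_NS = "http://search.yahoo.com/mrss/"
ET.register_namespace("media", MEDIA_NS)


def build_rss(channel_title, items):
    """Build a minimal RSS 2.0 document from a list of item dicts."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "description").text = item["description"]
        ET.SubElement(node, "link").text = item["link"]
        # media:thumbnail carries the image address in its url attribute.
        ET.SubElement(node, "{%s}thumbnail" % MEDIA_NS,
                      {"url": item["media:thumbnail"]})
    return ET.tostring(rss, encoding="unicode")


# Hypothetical item, shaped like the output of the Regex step.
items = [{
    "title": "Example Episode",
    "description": "A short summary.",
    "link": "http://www.spike.com/full-episode/example-show/12345",
    "media:thumbnail": "http://example.com/images/thumb.jpg",
}]
feed_xml = build_rss("My Example Feed", items)
print(feed_xml)
```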

Once you're done with that, it's time to connect our Create RSS module to the Pipe Output module and test our RSS.
Click save pipe, put in a name that you'll remember, and click 'Run Pipe' at the top.
You should get a page that is similar to this one:

I might have a few more links on my page, as I added a couple more pages to rip. (Just add more Fetch Page modules, and then connect them with a Union module before they hit the Rename.)
Okay, now we can just click 'Get as RSS' at the top to get the URL of our RSS feed and add it on the boxee.tv website under feeds to test it. Everything working great? Good! Then it's time to create your RSS app!
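Conceptually, the Union step just merges the item lists from each Fetch Page into one list before the Rename sees them. A tiny Python sketch of that idea, with a made-up `union` helper and placeholder items:

```python
from itertools import chain


def union(*sources):
    """Concatenate the items from every source, keeping their order."""
    return list(chain.from_iterable(sources))


# Hypothetical items from two separate Fetch Page modules.
page_one = [{"content": "<html for page 1, item 1>"},
            {"content": "<html for page 1, item 2>"}]
page_two = [{"content": "<html for page 2, item 1>"}]

all_items = union(page_one, page_two)
print(len(all_items))
```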