Scraping bot
Posted by Eumir Gaspar on July 8, 2008
Filed Under Ruby on Rails, Web Development
I recently discovered the joys of scraping. Yes, the “technique” is a bit underhanded, but it sure is fun. I guess it’s OK as long as you don’t profit from it and it doesn’t affect the site you’re scraping. If you scraped a site just to present it in a different design (say, scrape all the news off CNN, build a site called BNN, and charge users a smaller fee), then yeah, that’s just plain dirty.
I scrape for fun and that’s it. Anyway, there I was, scraping movie schedules off the only movie schedule site that I know of (clickthecity.com), when I suddenly had an idea. What if I made an RSS feed for a movie house that I liked? That would be cool, since clickthecity (or CTC) doesn’t have an RSS feed. So I started researching how to make an RSS feed, and while I was doing so, Topher brought up the idea of making a bot. He showed me jabberbot, and from there I went on and played around with it.
To make the scraper, I used the Rubyful-soup plugin for Ruby on Rails (along with Mechanize). Using it was easy; the main problem was actually how to traverse the atrocious table layout of clickthecity. Once I got over that hurdle, though, getting the info I wanted was easy. And what did I need? Just the date of the showtimes (so I’d know whether it was the latest update), the location of the cinemas (the mall where the cinemas are located), the cinema name (or number), the movies being shown in each movie house, and the show times of each. These were saved into a database I made. (I had a bit of trouble putting them in, especially since there were movie houses that had two different movies being shown on the same day.)
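Here is a minimal sketch of one way to hold the scraped rows so that a cinema showing two different movies on the same day does not cause trouble. It is plain Ruby, not the actual database schema from the project; the Showing struct, its field names, and the sample data are all assumptions made up for illustration.

```ruby
# Hypothetical record for one scraped row; the field names are assumptions.
Showing = Struct.new(:date, :location, :cinema, :movie, :times)

# Group showings by [date, location, cinema] so that a cinema screening two
# different movies on the same day keeps both entries side by side.
def group_by_cinema(showings)
  showings.group_by { |s| [s.date, s.location, s.cinema] }
end

# Made-up sample rows, standing in for what the scraper would produce.
rows = [
  Showing.new('2008-07-08', 'Mall A', 'Cinema 1', 'Movie X', ['12:30', '15:00']),
  Showing.new('2008-07-08', 'Mall A', 'Cinema 1', 'Movie Y', ['19:45']),
  Showing.new('2008-07-08', 'Mall A', 'Cinema 2', 'Movie Z', ['13:00'])
]

group_by_cinema(rows).each do |(date, location, cinema), showings|
  titles = showings.map(&:movie).join(', ')
  puts "#{date} #{location} #{cinema}: #{titles}"
end
```

The point is that the key is the cinema on a given day, and the value is a list of showings rather than a single movie.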
Next was the bot itself. Like I said, I used jabberbot, which was also a handy plugin for Ruby on Rails. The syntax was simple, and this was the easiest part. I made a new Google account just for the bot and it is now up. I have yet to make it work on Yahoo Messenger (Topher tried it and it didn’t work, but I haven’t tried it myself yet), so I’ll just have to settle for Gtalk.
The flow of the program is simple: scrape the data from clickthecity.com and then save it to the database (done using a rake task I made). I have yet to automate these tasks, along with the connection of the bot (I still have to start it manually using another rake task). Anyway, after manually calling the rake tasks for scraping the site and starting the bot, movieschedule-bot is ready to go.
So far, I have only added a select few movie houses that my friends and I usually go to. You can actually add movieschedules (yes, google mail) to your gtalk/gmail contact list and start using it. Don’t worry about down times; that just means I am currently updating or restarting it, so try again after five minutes if it doesn’t work. Don’t try to spam it, though, as right now it is running from my local machine. I have yet to upload it to my free Heroku account and make it run the bot forever.
So there you have it. A scraper bot. There are a LOT of other uses for this, like a thesaurus bot, wikipedia bot, imdb bot or something but that’s for another discussion. As you can see, bots can be useful too! Right now, movieschedules is looking for more friends. Be kind to it!
To start using movieschedules, just add it to your Google contact list, then type “help” and send it. It will return a list of the commands that are currently available. Have fun!
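For the curious, the “help” handling can be sketched as a small dispatcher in plain Ruby. This is not the bot’s actual code and does not use jabberbot’s API; the command names and reply texts below are hypothetical, since the real command list is only available from the bot itself.

```ruby
# Hypothetical command table; the real bot's commands are not listed in this post.
COMMANDS = {
  'help'    => 'list the available commands',
  'cinemas' => 'list the movie houses the bot knows about'
}

# Turn an incoming chat message into a reply string.
def reply_to(message)
  command = message.to_s.strip.downcase
  case command
  when 'help'
    COMMANDS.map { |name, description| "#{name} - #{description}" }.join("\n")
  when 'cinemas'
    'No cinemas loaded yet.' # placeholder; a real bot would query the database
  else
    'Unknown command. Type "help" for a list of commands.'
  end
end

puts reply_to('  Help ')
```

Normalizing the message with strip and downcase first means “Help”, “help ” and “HELP” are all treated the same way.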
Railsconf 2008
Posted by Topher Rigor on May 12, 2008
Filed Under Ruby on Rails
I will be at RailsConf 2008 in Portland. I still haven’t finalized the sessions I’m attending, but I’m sure it will be great.
Alphabar on Rails
Posted by Gerald Abrencillo on April 24, 2008
Filed Under Bugs and Fixes, Ruby on Rails
I am working on this little project right now using Ruby on Rails and I ran into some trouble with the alphabar plugin. First, alphabar uses the with_scope method which, since Rails 2.0 I believe, has been protected, giving me some errors since I froze my Rails version. I was able to overcome this obstacle by using the send method instead. Just change the last part of the find method of the plugin from:
model.with_scope({:find => {:conditions => conditions}}) { model.find :all }
to:
model.send(:with_scope, {:find => {:conditions => conditions}}) { model.find :all }
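Why does send help? In Ruby, send skips the visibility check, so it can call a protected method from the outside. The sketch below shows that with a made-up Catalog class and a made-up with_filter method standing in for a model and with_scope; neither name comes from Rails or from the alphabar plugin.

```ruby
# Catalog and with_filter are hypothetical stand-ins for a model and with_scope.
class Catalog
  class << self
    protected

    # A protected class-level method that yields to a block.
    def with_filter(letter)
      yield letter
    end
  end
end

# Calling the protected method directly from outside raises NoMethodError.
begin
  Catalog.with_filter('A') { |letter| letter }
rescue NoMethodError => e
  puts "direct call failed: #{e.class}"
end

# send bypasses the visibility check, and the block is passed along as usual.
puts Catalog.send(:with_filter, 'A') { |letter| "names starting with #{letter}" }
```

Keep in mind that this works around a restriction the framework added on purpose, so it is best treated as a stopgap for a plugin you cannot easily update.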
After I finally got that to work, the plugin was very useful, although the alphabar was limited to letters and a blank slot, which is fine if you don’t need numbers. To accommodate numbers, just add this to the alphabar helper:
('0'..'9').to_a.each do |i|
  slots << i
end
Theoretically it should work for any character but I haven’t tried it yet.
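To make the idea concrete, here is a standalone sketch of a slot list with digits added, plus a lookup that picks the slot for a given name. It only assumes that slots is an array of one-character strings with an empty string for the blank slot; the method names alphabar_slots and slot_for are made up and are not part of the plugin.

```ruby
# Assumed shape of the slot list: letters, a blank slot, and optionally digits.
def alphabar_slots(include_digits = true)
  slots = ('A'..'Z').to_a
  slots << '' # the blank slot
  slots.concat(('0'..'9').to_a) if include_digits
  slots
end

# Pick the slot for a name by its first character; fall back to the blank slot.
def slot_for(name, slots)
  first = name.to_s.strip[0, 1].upcase
  slots.include?(first) ? first : ''
end

puts slot_for('3M', alphabar_slots)        # digit slot is available
puts slot_for('3M', alphabar_slots(false)).inspect # falls back to the blank slot
```

Any other character can be supported the same way: add it to the slot list and the lookup picks it up.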
So there you have it. Hopefully somebody having trouble with this plugin will find this post useful.
PNOI Judge
Posted by Topher Rigor on March 13, 2008
Filed Under Uncategorized
The first Philippine National Olympiad in Informatics (PNOI 2008) will be held on March 15, 2008 at Ateneo de Manila University, Loyola Heights, Quezon City.
PNOI is an individual programming contest for high school students. I’m one of the five judges at this contest.
Click here for the details.
Run from Capistrano
Posted by Topher Rigor on March 8, 2008
Filed Under Ruby on Rails
You can run shell commands from Capistrano using the run method. For example, to create a symlink from the shared directory to the current directory, you’d write:
run "ln -nfs #{deploy_to}/shared/config/database.yml #{release_path}/config/database.yml"
run can also take a block, like this:
run "rsync /path/to/file host:/path/to/file" do |channel, stream, text|
  logger.info "[#{stream}] #{text}"
  output = case text
           when /\bpassword.*:/i
             "#{password}\n"
           when %r{\(yes/no\)}
             "yes\n"
           end
  channel.send_data(output) if output
end
When you run a command, it might ask for your password. To handle this in Capistrano, use send_data to send a response.
Check lib/capistrano/recipes/deploy/scm/subversion.rb in the Capistrano gem to see how Capistrano handles Subversion commands.
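The prompt matching in the block above can be pulled out into a small plain-Ruby method, which makes it easy to try against sample output without deploying anything. This is only a sketch; prompt_response is a made-up name and is not part of Capistrano.

```ruby
# Decide what to send back for a chunk of command output.
# Returns nil when the text is not a prompt we know how to answer.
def prompt_response(text, password)
  case text
  when /\bpassword.*:/i then "#{password}\n"
  when %r{\(yes/no\)}   then "yes\n"
  end
end

# Inside a run block this would be used as:
#   output = prompt_response(text, password)
#   channel.send_data(output) if output

p prompt_response("user@host's password: ", 'secret')
p prompt_response('sending incremental file list', 'secret')
```

A case with no matching branch evaluates to nil, which is why the send_data call is guarded with “if output”.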
The problem of capital
Posted by Bit Santos on February 8, 2008
Filed Under Business, Startup Life | 1 Comment
Every business startup encounters the same problem at some point: the problem of money or to be more precise, where to get it.
A startup by convention is defined as a business that aims to create and sell its own product/s, but such a business needs resources to just survive while everyone is still developing the product. These resources are what business and management types like to call “capital”. (Just a bit of sarcasm there, folks.)
Here in the Philippines, the general trend (from my own unscientific observations) for successful startups is that the founders are usually already rich to begin with or are backed by a rich family, and thus don’t have to deal with the problem of capital as much as others. The government has done quite a lot to try to help SMEs and micro-enterprises, such as providing tax breaks or even loans to those who want to enter particular industries. Unfortunately for us techies, the IT industry isn’t on the list.
Besides the lack of institutional support, we don’t have much of a VC culture to speak of. Local business investors, as far as I can see, are not willing to invest in high-risk ventures such as a software company despite the potentially high rewards. I read somewhere that the VCs in Silicon Valley have already understood how to handle this risk, and I believe their approach makes sense and should be emulated if we are to even hope to start something similar here. Although the success rate of Internet startups is very low, if just one of their investments hits it big, it usually hits big enough to more than make up for the ones that don’t make it.
How do we deal with it? By taking on client work. The problem here is the very basic concept that the more time you spend on something, the less time you have for something else. Time spent on clients is time not being spent on your own product. In an interview at Pinoy Web Startup, Luis of syndeo::media admitted that although their ultimate goal is to operate solely on their own products, they’re spending 95% of their time on client work just to get by. Although I’m sure that was just a rough estimate, if we were to calculate that in terms of a 40-hour work-week per programmer, that would work out to just 2 hours of work on their own products every week. (Of course we can probably safely assume that as in most startups, work-weeks aren’t exactly fixed at 40 hours every week.)
While I won’t question their approach (we here at Admoo Labs also seriously considered it), I must admit that it feels like a waste when the real work towards your ultimate aim accounts for only 5% of your time. But the truth is, there’s very little choice for a lot of us. We need the money.
Everything would be so much simpler if we had more funding sources in the region like Y-Combinator and its many clones…
RailsConf 2007 Keynote Videos
Posted by Topher Rigor on February 3, 2008
Filed Under Ruby on Rails
The keynote videos from RailsConf 2007 are now available online. Get them here.
DHH’s presentation is available here. DHH made a live demo showing REST and made a mistake. He said he wanted to show how easy it is to debug in Rails. I say the same thing when something goes wrong in my live demos.
nl2br in RoR
Posted by Gerald Abrencillo on January 31, 2008
Filed Under Bugs and Fixes, Ruby on Rails | 4 Comments
I recently found out that line breaks are not recognized when displaying a large block of text in the view (e.g. a post body). In PHP this is handled by the nl2br function, but I had no luck finding a similar function in RoR. So I did what any sane person would do: I created one myself. I placed this method in our model to format a particular field, but it seems a better choice to put it in a helper.
def nl2br(text)
  return text.gsub(/\n/, '<br/>')
end
And there you have it, your very own nl2br method. Either you just found something really useful or I just wasted 2 minutes of your time. If anyone has a better solution or if I missed a function for this let us know by dropping a comment.
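One thing to watch out for: if the text comes from users, any HTML in it will pass straight through to the page. Below is a sketch of a variant that escapes the text first and also handles Windows-style line endings. The name nl2br_escaped is made up for this example, and it uses html_escape from Ruby’s bundled erb library. Rails also ships a simple_format view helper that turns line breaks into paragraph and br tags, which may already cover this case.

```ruby
require 'erb'

# Escape HTML special characters first, then turn line breaks into <br/> tags.
# Handles \r\n, \r and \n so text pasted from different systems looks the same.
def nl2br_escaped(text)
  ERB::Util.html_escape(text.to_s).gsub(/\r\n|\r|\n/, '<br/>')
end

puts nl2br_escaped("first line\nsecond line")
puts nl2br_escaped("<b>bold?</b>\r\nnot anymore")
```

Escaping before inserting the br tags matters; doing it the other way around would escape the tags you just added.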
Small But Terrible
Posted by Eumir Gaspar on January 31, 2008
Filed Under Bugs and Fixes, Web Development
I’m pretty sure most of us have been in a situation where we spent hours trying to fix a bug, only to finally see that it was just some tiny typo or maybe even a commented-out line of code. I’ve been through a lot of those experiences, especially when I was still learning to code, and most of the time I just breathe a huge sigh of relief or plainly laugh at myself.
Reminiscing about those ‘good old days’ can be fun too. Here are some of those ‘what the hell/oh come on/I knew it was just a simple mistake!’ scenarios that I have experienced or just know of:
- A missing/extra space, comma, or whatever character in your code. When this happens, it usually takes a while before you realize it because most of the time, you assume that what you typed was correct. Missing spaces are especially hard to spot. I had one experience coding in Flash where my fellow coder had 3 extra spaces after the instance name of a clip. No wonder we couldn’t access the variable!
- Unpopulated database. I think Bit experienced this too, when he was launching the AnimoAteneo.com site. I also experienced this recently: I KNEW my code was correct, but it wasn’t printing anything because it didn’t select anything. (I was trying to select the events for a given week, which were stored in the database… and the problem was that there were no events that week.)
- The semi-colon of emptiness. There are rare times in the life of a programmer (especially one who still uses the ancient arts of Java and other compiled languages) when s/he encounters the… wait for it… semi-colon of emptiness (SFX: thunder crash). This is one of the most frustrating things that can happen to you: accidentally putting a semi-colon right after the condition of an if statement. Although this is sometimes done deliberately for empty if statements, an accidental semi-colon there ends the statement early, so the block that follows always runs and the if becomes useless. And for that, we have to thank those other languages that don’t require a semi-colon.
- HTML entities. These guys are a nightmare, especially with a database involved. Tick marks (or whatever you call them) are often confused with apostrophes, tabs (nbsps) with spaces, long dashes with short dashes (sounds weird, but it does happen), etc. It is a real pain, especially when the program or database doesn’t complain about it and you only find out after they have wreaked havoc on whatever you were working on.
- Uncommented/commented lines of code. This happens especially when you are using the “try and try ’til it works” approach to coding. It also happens when you are debugging and you forget to uncomment/comment some lines that subtly affect whatever your code was doing, like, say, assigning a static/hard-coded value to a variable.
These are just some of the mistakes that we can run into while coding. Minimizing the chances of them happening can help us code a lot faster and leave more time for dealing with the bugs that matter.
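The extra-space problem from the first item is easy to reproduce. The original bug was in Flash, so this Ruby version is only an analogy, with a made-up clips hash and a made-up find_clip method. Printing a value with inspect shows the quotes around it, which makes stray spaces visible right away.

```ruby
# Look up a clip by name, ignoring stray whitespace around the name.
def find_clip(clips, name)
  clips[name.to_s.strip]
end

clips = { 'hero' => :hero_clip } # made-up data for illustration
typed_name = 'hero   '           # three stray trailing spaces

p clips[typed_name]              # the plain lookup misses
puts typed_name.inspect          # inspect makes the spaces visible
p find_clip(clips, typed_name)   # stripping the name fixes the lookup
```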
January 2008 PhRUG Meet-up
Posted by Topher Rigor on January 28, 2008
Filed Under Ruby on Rails
A PhRUG meet-up, sponsored by Admoo Labs, was held on Jan 16, 2008 at Ateneo de Manila University. Thirty people attended the event. I gave a presentation on ‘Website Development with Ruby on Rails’. Greg Moreno talked about RSpec. You can download a copy of my presentation here. Pictures are here.
