Content Tagged with User:alex + del.icio.us

Del.icio.us is not updating their feeds again

If you are using LiveMarks or the tag digest page, the reason it’s empty is that del.icio.us has stopped updating its RSS feeds.

This happens a lot and there’s not much I can do about it. I just have to wait until they turn the feeds back on, which usually takes no more than a couple of days.

Here’s a traffic graph of del.icio.us

User:alex: Alex Bosworth - The Races

Del.icio.us Tools

Even though I’ve moved off del.icio.us and onto my encrypted web bookmarks manager boz, I still have hundreds of bookmarks saved on del.icio.us, and I still use the site for other things, like a handful of projects that use the site’s API to do some interesting stuff.

I’m adding a new one to the list today, and updating an old one to start Delicious Tools.

Delicious Tools includes:
  • A dead link checker – lots of your urls are out of date or broken: check that your bookmarks are still valid.
  • Google sync – synchronize del.icio.us with your Google search results.

(thx osde-info for the screenshot and braving Flickr’s policies against them)

Other del.icio.us projects:
  • LiveMarks – streams social bookmarks from various services and does some frequency analysis as well to highlight new and interesting stuff.
  • Delimages – hotlinks images from del.icio.us image streams.
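The frequency analysis LiveMarks is described as doing can be illustrated with a minimal Python sketch. This is not the actual LiveMarks code: the function name, the (timestamp, url) tuple layout, and the one-hour window are all assumptions made for the example.

```python
from collections import Counter

def popular_urls(bookmarks, now, window_seconds=3600, top_n=3):
    """Return the most frequently bookmarked URLs within the recent window.

    `bookmarks` is an iterable of (timestamp, url) pairs. The layout and
    the one-hour default window are assumptions for illustration only.
    """
    cutoff = now - window_seconds
    counts = Counter(url for ts, url in bookmarks if ts >= cutoff)
    return counts.most_common(top_n)

# Hypothetical stream of bookmarks (timestamps in seconds).
stream = [
    (100, "http://example.com/a"),
    (4000, "http://example.com/b"),
    (4100, "http://example.com/a"),
    (4200, "http://example.com/b"),
    (4300, "http://example.com/c"),
]
print(popular_urls(stream, now=4500))
```

Counting only inside a sliding window is what keeps older favorites from crowding out new links.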

Notes

Fixing my database

Over my Thanksgiving vacation I stopped checking on my projects for a week or so, and that’s when everything decided to go to hell. The MySQL database dedicated to the del.icio.us projects ran out of space, and when I tried to fix it I wiped it :/ I’m pretty lax with these projects and I don’t do sensible things like use a VCS or make comprehensive backups, so it took a bit of work just to get things back to normal. There was also some code rewriting required, and I revisited a lot of queries in various services to bring them back online, such as LiveMarks, which now has a different popularity query, and Delimages, which now has a simpler tag query. Things should be back to normal though, so if you use any of these services, send me mail if you don’t like the changes or something is broken.

Making the dead link checker

The dead link checker turned out to be a little bit annoying, given that a lot of servers don’t respond properly to HTTP HEAD requests.

The checker works by making a bunch of Prototype-driven Ajax calls to a little JSON-outputting PHP script, which makes a socket connection to the server and sends a HEAD request for the URL.

This was fairly straightforward; however, there were some tricky bits, in that servers give a range of weird responses to HEAD requests:
  • Server just hangs on Connection: close
  • Server decides to put the entire page HTML in the HEAD response (I’m looking at you, IIS 6.0)
  • Server always returns a wrong response code for the page (Wikipedia and Google Video)

I didn’t do that much about special-casing completely wrong codes, so you’ll have to verify some yourself.
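The original checker is a PHP script that isn’t shown here, so the following is only a Python sketch of the same idea: open a socket, send a HEAD request, and read back the status line. The function names and the timeout value are assumptions, and the sketch handles plain http URLs that are already percent-encoded.

```python
import socket
from urllib.parse import urlsplit

def build_head_request(url):
    """Build a raw HTTP/1.0 HEAD request for `url` (plain http only)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    request = (
        f"HEAD {path} HTTP/1.0\r\n"
        f"Host: {parts.hostname}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return parts.hostname, parts.port or 80, request.encode("ascii")

def parse_status(raw):
    """Extract the numeric status code from a raw response, or None."""
    first_line = raw.split(b"\r\n", 1)[0].decode("latin-1")
    fields = first_line.split()
    if len(fields) >= 2 and fields[0].startswith("HTTP/") and fields[1].isdigit():
        return int(fields[1])
    return None

def check_url(url, timeout=5.0):
    """Return the status code for `url`, or None if the server misbehaves."""
    host, port, request = build_head_request(url)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request)
            # Read only the first chunk: the status line is all we need, and
            # some servers send a whole page body in reply to HEAD.
            raw = sock.recv(1024)
    except OSError:
        # Covers timeouts (servers that hang), refused connections, DNS errors.
        return None
    return parse_status(raw)
```

The timeout deals with servers that hang, and reading a single chunk sidesteps servers that append the page HTML. Servers that always return a wrong code still need checking by hand.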

Updates to Kibbutz

I already had a Google sync project called Kibbutz which drives the XML translation of del.icio.us bookmarks to Google Co-op search results.

One thing you might have noticed however if you used it – it didn’t index all of your posts. Del.icio.us has a purposely limited API that requires login credentials to access your full list of public bookmarks, and I didn’t feel like writing an importer. Well I finally wrote one, and now you should be able to get all of your bookmarks indexed.

Now that I’ve written the importer script, it should be straightforward to write some more tools for working with your bookmarks – these will go under delicious-tools, watch that space.

User:alex: Alex Bosworth's Weblog

Is del.icio.us broken?

I’ve noticed lately that del.icio.us isn’t servicing tag page requests properly. Take for example this page: http://del.icio.us/tag/ajax

Up until now – the latest posts were x minutes ago, or x hours ago, people constantly posting new links. Now things are delayed and only 1 day or older things are shown.

I’ve thought that del.icio.us had a lot of potential to change the way people find things on the net, but after being bought by Yahoo, I’ve seen no progress in this direction.

User:alex: Alex Bosworth - The Races

Cheat your way onto del.icio.us popular

It’s easy to get on del.icio.us popular and drive a bunch of traffic to your stuff: just set up various fake accounts and all bookmark the same thing at once.

Check this recent bookmark out for example: http://del.icio.us/url/b80e64803e8e83449aa519a6ef13080a

If you don’t notice why it’s suspicious, look at the first ten or more people who have bookmarked it: they have all joined in the past 2 days and have all bookmarked similar things.

I’ve started filtering some del.icio.us spam from LiveMarks, but this kind of thing is hard to stop automatically.
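The pattern described above (the earliest bookmarkers of a URL are nearly all brand-new accounts) can be expressed as a simple heuristic. This is a hedged Python sketch, not the filter LiveMarks uses; the function name, the data layout, and every threshold are assumptions.

```python
from datetime import datetime, timedelta

def looks_coordinated(bookmarkers, now, max_age_days=2, sample=10, threshold=0.8):
    """Flag a URL whose earliest bookmarkers are mostly brand-new accounts.

    `bookmarkers` is a list of (username, joined_at) pairs in the order
    they bookmarked the URL. All thresholds are illustrative guesses.
    """
    first = bookmarkers[:sample]
    if len(first) < sample:
        return False  # too few bookmarkers to judge
    cutoff = now - timedelta(days=max_age_days)
    new_accounts = sum(1 for _, joined_at in first if joined_at >= cutoff)
    return new_accounts / len(first) >= threshold
```

A check like this only catches the crude case. Spammers who age their accounts first would pass it, which is part of why this is hard to stop automatically.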

User:alex: Alex Bosworth - The Races

How To Provide A Web API

In a world where people are making interdependent webservices, API design and maintenance is pretty important. Unfortunately despite rising use and availability of APIs, there are significant problems with the way even big API vendors are deploying and maintaining their APIs.

What are a few simple rules for providing a web API?

  1. Keep it clean and simple
  2. Stick to standards
  3. Make it about data
  4. Keep it working
  5. Design for updates

Keeping it clean and simple is subjective and a matter of audience. For most developers a simple API is REST/HTTP based, with XML delivery of a known or simple schema, RSS being a good general choice. For JavaScript developers or plugin writers JSON feeds might be preferable. For enterprise development scenarios, SOAP over HTTP might be better, but generally it’s best to stick with just REST + XML/RSS.

Simple also means don’t be too abstract. Flickr for example chooses in its API to require the use of its internal ids for all API calls. This means for example that every call to find information about a user requires a call first to find the internal id of the user. Del.icio.us on the other hand just requires visible names, in fact internal ids are hidden everywhere.

Sticking to standards is a matter of developing APIs that can plug in to accepted norms. Not only does this make development easier, it makes tooling and other peripheral services work better, and generally standards are written for a good reason. Using REST? Don’t use GET requests to update state, such as in del.icio.us’s urls to delete or add urls like https://api.del.icio.us/v1/posts/add?. Using RSS? Try to stick to mainstream semantically appropriate elements rather than new namespaces, provide well formed XML, don’t stick data in XML attributes, etc.
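To make the point about GET concrete, here is a small Python sketch of a state-changing call built as a POST with the data in the request body. The endpoint and field names are hypothetical and are not the del.icio.us API; the request is only constructed, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_add_request(base_url, bookmark_url, description):
    """Build (but do not send) a POST request that adds a bookmark.

    The endpoint and field names are hypothetical, not the del.icio.us API.
    """
    body = urlencode({"url": bookmark_url, "description": description})
    return Request(
        base_url,
        data=body.encode("ascii"),
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )

req = build_add_request(
    "https://api.example.com/v1/posts", "http://example.com/", "Example"
)
print(req.get_method(), req.full_url)
```

Because the bookmark data travels in the body rather than the query string, the write is much less likely to be replayed or stored by a cache or prefetcher that assumes GET is safe.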

Make it about data. Leave application design to the application developer. Google’s new Ajax search results API is a good example of an API that isn’t about data, which makes it less flexible to build upon. Instead of providing JSON feeds for plugin developers, Google has chosen to build out their own little search results box, with controls and results that cannot be styled, instead of leaving the interface and logic up to the JavaScript developer or plugin writer. A better design would have been a simple JSON feed of Google service search results, and a reference object to build an embedded results box.

Keep it working. An application developer working with remote web services should design with the consideration that the remote service can malfunction or die, but that doesn’t mean that service providers shouldn’t keep reliable service high on their list of priorities. On SWiK and other development projects I’ve done, every one of the APIs we use (del.icio.us, sourceforge, google, etc) has gone down or had problems, and I learned the hard way not to depend on any of them.

Make a clean upgrade path. There are no permanent APIs: add a version number and keep developers informed. Flickr calls don’t have a version number but they should. In del.icio.us even browser bookmarklets have versions. Salesforce.com, whose bottom line depends on web service APIs, uses versioned WSDLs. Each new rev of Salesforce.com’s APIs is given a unique WSDL, and the backend is kept stable from the point the WSDL is issued. This has come into practice because, just like with native APIs, customers started to build code against buggy behavior, and when the server logic was updated to fix bugs, their code broke. Now if there’s broken code it stays in place, and developers migrate to new services at regular and scheduled intervals.
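One way to picture a versioned API is a dispatcher that keeps each released version frozen and adds new behavior under a new version number. The Python sketch below is an illustration only: the paths, handler names, and data shapes are assumptions and do not describe how del.icio.us or Salesforce.com implement versioning.

```python
def list_posts_v1(user):
    # Frozen behavior: v1 keeps returning bare URLs, quirks and all.
    return [post["url"] for post in user["posts"]]

def list_posts_v2(user):
    # v2 changes the response shape without breaking v1 callers.
    return [{"url": post["url"], "tags": post["tags"]} for post in user["posts"]]

HANDLERS = {
    ("v1", "posts"): list_posts_v1,
    ("v2", "posts"): list_posts_v2,
}

def dispatch(path, user):
    """Route a path such as '/v1/posts' to the handler for that version."""
    parts = tuple(path.strip("/").split("/"))
    if parts not in HANDLERS:
        raise LookupError(f"unknown API path: {path}")
    return HANDLERS[parts](user)

# Hypothetical account data for the example.
user = {"posts": [{"url": "http://example.com/", "tags": ["demo"]}]}
print(dispatch("/v1/posts", user))
print(dispatch("/v2/posts", user))
```

Code written against /v1 keeps working after /v2 ships, so developers can migrate on a schedule rather than in an emergency.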

Recently del.icio.us updated their post API to use a secure encrypted URL, so as not to betray passwords or bookmarks in cleartext if they were posted using the API. A good move, especially as developers are using GET requests to post bookmarks, which may be prompting some routers to cache sensitive user data. Yahoo was nice enough to provide clear documentation and plenty of warning about the change though. After a few months of warning, the old insecure URL was turned off, and legacy requests are redirected to the secure URL, all in all a very good update.

On the other hand, del.icio.us recently updated their rss feed of recent bookmarks: http://del.icio.us/rss to be a bit more ‘digg-esque’. Instead of showing the steady stream of users adding bookmarks to their accounts, it now aggregates the popular urls, showing you something that is currently being bookmarked quite a bit. Well guess what? LiveMarks uses those front page rss feeds to aggregate del.icio.us bookmarking activity. The RSS feed change, which was completely unannounced, significantly impacted that application, and I wish they had given some warning that the URL was going to change, instead of silently changing it. (If you are hosting a mirror for LiveMarks and I haven’t contacted you, please change the aggregation url to http://del.icio.us/rss/recent instead of http://del.icio.us/rss/).

As a sidenote, a personal annoyance is the reluctance of service providers to provide APIs against what they consider to be their most important property: public user data.

Even though many times the data is made public in various forms, such as through RSS feeds or HTML pages, data like my favorites on Flickr or my older bookmarks on del.icio.us require authorization to access, which means that as a developer my interface and code need to be more complicated to use these APIs. The new Google Ajax Search API, for example, requires a separate API key I must apply for; I can’t use the Google search API key I normally use.

If I want to build an application for del.icio.us, for example to offer a cool visualization of all your bookmarks, and of you in the context of other people who are close to your bookmarking activities, it’s essentially impossible without everyone volunteering their username and password in the clear to me. Data about your bookmarks in del.icio.us is behind a firewall unless you sit on the RSS feed and store the aggregation. It’s the same with last.fm, who don’t even offer an API to recover your listening data (which is why I built a last.fm proxy). It’s up to del.icio.us or the service provider as to what they want to offer, but from the developer perspective, a lot of gratuitous authorization and API keys are essentially just another barrier to building the application I am interested in building.

User:alex: Alex Bosworth's Weblog

Is this the picture of a dying service?

del.icio.us appears to be doing fine to me :/

I don’t know what Michael Arrington is smoking when he says “they’ve tanked completely”.

By my internal counters, del.icio.us has well over 300k users who actively post bookmarks. This is different from Digg or really any of the social sites out there besides the picture and myspace sites – these are people who are actively contributing to the del.icio.us resource with quality stuff.

Digg is pretty ephemeral by comparison, with Diggs occupying a perhaps similar volume of traffic thanks to the easy button, but the tagging and focus on new content means that Digg traffic lives and dies quickly, which may account for why Digg has eclipsed del.icio.us in traffic.

However, look at the contributor numbers: people have recently claimed that Digg only has a thousand or far fewer active bookmarkers, and my internal numbers from my feeds back that up. LiveMarks has scrolled 60k digg bookmarks, and 4 million unique del.icio.us links over the past year or so.

What really seems to have improved at del.icio.us since Yahoo bought them is something that Michael Arrington likely would have a hard time selling as copy: their performance has improved drastically from a near cataclysm of broken pages, timeouts and database failures that were striking del.icio.us shortly before they were sold.

The feature improvements have indeed been scarce and the design is still unfortunate, but it’s still the best public bookmarking service out there by far (of course the new boz is the first and best and only truly private bookmarking service :P)

Sometimes features and visible improvements take a long time to brew, such as before Flickr released their Gamma update. There is a ton of work that needs to be done on del.icio.us, but I wouldn’t count Yahoo as having killed the goose yet.

PS: The comscore high Arrington refers to is likely bumped artificially high by press regarding the Yahoo acquisition, not droves of users suddenly signing up and using the service.

User:alex: Alex Bosworth - The Races

del.icio.us this bookmarklet

Here’s a bookmarklet to let you see who has bookmarked a URL at del.icio.us.

User:alex: Bookmarklets

del.icio.us on Google search

or “Kibbitzing the Google Kibbutz”

Google search recently launched a product called “Google Coop”.

I’m one for the odd mashup, so this project sounded fairly interesting – you get to make little vertical searches on top of Google to which other people can subscribe, kind of a mix between RSS and Adwords.

However, the documentation for this project was clearly written by people who hate me and want to make my life miserable. The PhDs up in the GooglePlex clearly have no need to explain their genius to common folk like me, instead opting for really helpful headings like:

“For Contributors”, “Topics”, “Subscribed Links”
Definition of the project: “Google Co-op is a platform which enables you to use your expertise to help other users find information. This is a work in progress; over time, you can expect to see evolution in both Co-op’s structure and the means by which you can contribute your expertise to our goal of making information more discoverable for millions of people. ” (ok but what does it actually do?)

I scratched my head for a bit thinking about what these things might actually mean. Luckily, the excellent Google Blogoscoped quickly came out with a primer that explained what the heck Google Co-op actually did.

This weekend, looking for another Sunday project I revisited Google Co-Op – and built a little integration between personal del.icio.us bookmarks and Google. How it works is that you visit Google Co-Op and subscribe to my ‘subscribed links’ feed.

Here is a page that explains how to set your del.icio.us bookmarks up with Google Co-Op

If you just want to integrate your del.icio.us bookmarks with Google, you can stop reading here. I won’t hassle you with Google’s screwy terminology or the annoyances I’ve found working with Google Co-Op.

And there are a number of these annoyances, even going beyond the documentation’s sore need of a basic glossary.

Basically, what I used is the ‘subscribed links’ functionality offered by Google Co-op. This allows you to set up an XML file with preset matches to search results. If someone searches for “What is the air speed velocity of an African Swallow”, you can specify that they should get a link at the top of their search results to gorgeofnothingness.com.

The fun part in setting these up is that Google won’t let you have a sandbox to play around creating these subscribed links. In order to test anything, you have to upload your xml file to them and add it to your subscribed links. You then have to wait 5 minutes until they go and get your file. Then you have to study messages like “xml entity not recognized” (we don’t know how to deal with unicode). Then delete the subscription you added, re-add/re-upload and wait 5 minutes to test again. It’s a lot of fun. Did I mention sometimes if you delete a subscription and re-add it, Google retains its cache of the old subscription and won’t reload the new one?

Other than the hassle of getting started, the issues are fairly minor, mainly my opinion of their XML schema. I don’t like data in XML attributes; it’s a lot cleaner when it’s in nodes. And if you set up counting notations, stay consistent: either go 0,1,2,3 or 1,2,3, and don’t mix both please.
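As an illustration of keeping data in nodes rather than attributes, here is a short Python sketch using the standard library’s ElementTree. The element names are made up for the example and are not Google’s subscribed links schema.

```python
import xml.etree.ElementTree as ET

def bookmark_as_elements(url, title):
    """Serialize a bookmark with its data in child elements.

    The element names are hypothetical, not Google's schema.
    """
    item = ET.Element("bookmark")
    ET.SubElement(item, "url").text = url
    ET.SubElement(item, "title").text = title
    # The default ASCII output turns non-ASCII characters into numeric
    # character references, so nothing depends on named entities.
    return ET.tostring(item)

print(bookmark_as_elements("http://example.com/", "Café & bar"))
```

Emitting numeric character references is one way to avoid the kind of “xml entity not recognized” complaint mentioned earlier, though whether Google’s parser would have accepted them is not something the source confirms.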

Anyways, this was a quick project that has kept me up way too late, but if you want to see the simple source to the project you can visit the SWiK – Kibbutz wiki page.

If you want to see improvements to this project, shoot me a message, I will be looking at adding more or betterer information to the integration and I’d like feedback on if this is actually useful or what aspects of del.icio.us I could integrate with Google further.

Also if Joshua Schachter would expose an RSS feed for del.icio.us search as there is on most other things, or if there were more XML access to a user’s bookmarks, I could probably do a better job integrating the two services.

User:alex: Alex Bosworth's Weblog

Social Bookmarking flat this year?

Looking at the graphs for del.icio.us and its biggest followers, as I’ve just posted below, it’s interesting to note that growth for this year is fairly static.

I would have thought that Yahoo would be providing more support for del.icio.us. Wasn’t one of the big reasons to go to a big web company to help take social bookmarking to the mainstream?

Digg, however, is still growing, thanks possibly in part to Kevin Rose’s rush to get features out the door, like their much improved comment system, and possibly in part to growing to fill the unrealized potential size of the tech enthusiast market. Digg continues to be first with tech news over everyone else, more now than ever.

The next stage of the social bookmarking game has got to be making services that deliver more value to the mainstream web user and refining the anti-spam and social heuristics. (LiveMarks unfortunately saw a lot of spam today through del.icio.us user sexkitten, who posted seven and a half thousand spam bookmarks in the space of ten hours.)

I have to note however that Alexa has been most unkind to SWiK lately, completely unjustifiably, we’ve just had our best week ever according to our internal stats but Alexa shows us down, which is kind of irritating.

Our user base isn’t well tuned for Alexa, having a third of the users not running Windows and fewer than a third running IE, but it’s still annoying to see our graph go down when we are actually attracting quite a large audience.

User:alex: Alex Bosworth - The Races

James Governor's MonkChips: Things We've Learned: Josh Schachter, Quotes of the Day

nice summary of Joshua's philosophy when it comes to building del.icio.us

User:alex: My Bookmarks

Yahoo logo appears on del.icio.us

Maybe as a precursor to further integration with Yahoo, or simply a result of the datacenter move, del.icio.us now shows the Yahoo Page Not Found screen for 404s.

I’m still waiting to see that “Yahoo Company” logo appear on del.icio.us …

User:alex: Alex Bosworth - The Races

Social Bookmarking Vs Spam

Social bookmarking is currently in a high growth pattern. I notice it all the time running LiveMarks, the rate of social bookmarking is increasing every single day.

The more people flock to something, the bigger target it is for abuse. The more people turn to del.icio.us to find useful bookmarks, or digg.com to read the latest news, the more tempting targets those services become for spammers and vandals.

As Clay Shirky has noted, “Social software is stuff that gets spammed.” Perhaps as an addendum to that aphorism it should be noted that successful social services are those that can resist spam successfully.

The spamming foes of social software are just starting to take shape. Witness a new service called ‘TagExplosion’, which describes itself as follows:

“[TagExplosion] helps get your program, blog, advertisement or website on the several “Social Bookmarking” sites so you can get on 100,000’s of computers, worldwide. This will help drive more traffic to your website by utilizing the “word-of-mouth” aspect of Social Bookmarking.”

Services like these are bound to multiply and exploit every weakness in social bookmarking’s defenses against spam and general social filters such as Digg’s front-page promotion mechanism or del.icio.us’s popularity or aggregation filters.

At the moment, the defenses of the social bookmarking sites are fairly weak. Currently del.icio.us has certain filters that destroy obvious spammers, prevent excessive tagging from aggregating onto every tag page, and prune the front page’s new bookmarks list. But del.icio.us’ only solid defense against automated attack is their signup captcha. As we’ve seen with BlogSpot, a signup captcha is only worth what it costs a spammer to pay an Indian or Filipino company to fill in.

del.icio.us may soon have to contend with spammers that post automatically or through peer to peer schemes such as TagExplosion or hide their spam like steganographers, continuously studying filters to evade and manipulate them.

Digg has similar problems, and is currently using similar techniques with similar vulnerabilities. As Digg is a more socially oriented site, it also uses more socially oriented solutions to fight off the spam problem. Users are encouraged not to post duplicate stories by having to first wade through other users’ submissions, and once a story is posted, other users quickly respond with comments complaining about problematic links. Users of Digg also help fend off spam by reporting what is wrong to the administrators and to the other users watching the live queue.

As frequent Digg users are well aware, Digg has had a tough battle in recent times fighting vandalism. So far, however, only a tiny minority vandalize the site, and spammers are still thankfully somewhat in the dark as to how they might abuse the system.

This is all very reminiscent of the rise of search engines. Compare a search on Altavista for viagra to the same search on Google. Search engines that failed, failed in large part because spammers eventually reverse engineered the algorithms they used to determine relevance and simply rearranged their pages to suit those algorithms. The top result for viagra on Altavista, ‘t-e-x-a-s-poker.com’, is the end result of the eventual failure of the first generation algorithms.

It’s likely that to survive and remain useful, social services will have to follow in this mold and emulate Google by switching their filters to factors that are more expensive and more complicated to engineer, such as third party references, longevity, references from known trusted sources, and patterns of human created content. Or they may be forced to become less social, such as with Wikipedia’s recent policies. Even then, it’s certain to be an ongoing battle.

User:alex: Alex Bosworth's Weblog

Ari Paparo Dot Com: Getting it Right

why del.icio.us got things right

User:alex: My Bookmarks

Delimages

I started another project this weekend, called “Delimages”.

del.icio.us exposes a special tag called system:filetype:jpg, but it’s pretty boring to subscribe to these feeds: you can’t see any of the images, just the titles. I wanted a way to be able to see the feed as images, so I coded up Delimages, which can show del.icio.us tag or user image feeds as images.

Also, I found that image posting on del.icio.us can be fairly repetitive, so I cleaned up the repetitions.
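Cleaning up the repetitions can be as simple as remembering which URLs have already been seen. The Python sketch below is an assumption about one way to do it, not the Delimages code; the function name and the list of image extensions are made up for the example.

```python
from urllib.parse import urlsplit

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")

def unique_images(urls):
    """Keep image URLs only, dropping repeats while preserving order."""
    seen = set()
    result = []
    for url in urls:
        # Judge by the path so query strings do not hide the extension.
        path = urlsplit(url).path.lower()
        if not path.endswith(IMAGE_EXTENSIONS):
            continue
        if url in seen:
            continue
        seen.add(url)
        result.append(url)
    return result

# Hypothetical feed entries.
feed = [
    "http://example.com/x.jpg",
    "http://example.com/y.html",
    "http://example.com/x.jpg",
    "http://example.com/z.PNG",
]
print(unique_images(feed))
```

Exact-match deduplication misses the same image hosted at two different URLs; catching that would need a content hash, which is beyond this sketch.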

Check it out. (Be forewarned, this is the raw unmoderated image stream with all that implies.)

User:alex: Alex Bosworth's Weblog

Acquisitions

My favorite web service next to Google/GMail, del.icio.us, was recently bought by Yahoo! Inc.

It’s interesting to see the reactions in the comments by del.icio.us users to the news, after all, the del.icio.us user community is really a big part of the property that is being sold here.

It’s definitely not all positive, people are especially concerned about identity issues, single-sign-on is a dream not shared by all.

And I think some concern is justified, it will be interesting to see if Yahoo is able to keep building del.icio.us into something great or if it will choke under the weight of the corporate yoke.

Thinking back on other takeovers isn’t all reassuring:

Yahoo

  • Geocities rose to great popularity and was purchased in the dot com bubble, but over time became a synonym for a worthless site. No one appears to have worked on this site since 1999, and building a personal website is still way harder than it should be, especially with new JavaScript techniques that could make building a simple personal page a lot easier.
  • Overture was a great investment for Yahoo, and does a great, if not often noted, job at delivering targeted advertisements.
  • Flickr – aside from Yahoo identity merging annoyances, Flickr seems to continue to prosper. Innovation seems to be slowing a bit there and new features like interestingness often don’t have RSS feeds, but things seem to be ok. So far it seems kind of like China taking over Hong Kong.
  • Continued Buying Spree – Dialpad, popular VoIP service among college students, Konfabulator – silly/nifty widgets for Windows à la OSX Dashboard, share in Alibaba.com – eBay for Chinese factories, Upcoming.org – valley hyped events site – still unclear how these will work out.

Google

  • Deja – the usenet repository which Google co-opted into launching Google Groups. Many people have been extremely critical of the Google Groups product, and it hasn’t gained the traction that Yahoo Groups has seen.
  • Blogger – seems to have been the bastard child at Google: since purchase it has lagged in innovation, and suffered speed and spam problems (spam problems continue). Although Blogger is a huge site for Google, MSN Spaces grew out of nothing to become a strong competitor.
  • Picasa – a strong product, but so far no web presence means that Google is at the moment letting Yahoo win unchallenged in online photos with Flickr.
  • Keyhole – product was mapped into Ajax to become Google Maps, to overwhelming success and critical acclaim. Google seems to be a bit behind the curve however on continuing to innovate on Google Maps but lucked into creating a great community around Google Map mashup development.
  • Urchin – web analytics product turned free as Google Analytics, which despite lacking the trademark Google Beta badge suffered from scaling issues at launch.

Microsoft

Microsoft is actually expert at acquisition, typically choosing companies that are strategically important to own such as Fox, or products that fill small but critical gaps in their product line – such as Visio, MS Project, Virtual PC.

Generally it seems to me as if acquisitions are a toss-up. Sometimes an acquisition ends up like AOL’s purchase of ICQ or Nullsoft and ends up destroying the success of the company. Other times a small company like Keyhole needs a huge stage like Google to succeed. I think the key to acquisitions is that the corporate parent has to work hard to ensure that they don’t squander the potential of their new resource.

I’m not a hundred percent sure if del.icio.us will work under the Yahoo banner, but I’m definitely hoping it will. I’ve been wanting del.icio.us to move beyond ‘CSS-technique of the week’ to a much broader audience for a while now. In any case, purchasing del.icio.us was a very wise gamble by Yahoo, as they have seen with My Web 2.0, building it yourself often doesn’t work, especially when community is a core product.

Yahoo is definitely doing a good job on the rhetoric front and continues to purchase great small startups, but it is a matter of execution to see them beat Google.

User:alex: Alex Bosworth's Weblog

Wink

Fancy del.icio.us knock-off

User:alex: My Bookmarks

Kibbutz

Kibbutz is a simple project to integrate Google and del.icio.us through Google co-op.

To have your del.icio.us bookmarks appear next to your Google search results, follow the instructions outlined here.

Kibbutz may be expanded in the future; subscribe to the RSS feed for this page to be updated on future capabilities.

The code for Kibbutz is at the moment quite simple: it leverages the livemarks and magpierss projects to handle del.icio.us bookmarks.

delimages

Delimages is a project to make feeds from the images bookmarked by users on del.icio.us.

It can show you a user’s bookmarked images or images for a given tag. (Example: tag cat).

By default, the most recently bookmarked images are shown.

Warning: the images from del.icio.us are not filtered; the raw stream of bookmarked images is shown.

priv.at

priv.at is a project to allow users to save bookmarks anonymously on del.icio.us – the social bookmarking service.

Notice – If you have bookmarks saved with priv.at, please visit Boz – this is the successor to this project and your missing bookmarks can be imported here.

priv.at is now deprecated, although the code should still work if you wish to use your own private.bookmarks account.

Users can create a unique bookmarklet for themselves, which they then trigger to post the bookmark to their account.

This hack takes advantage of del.icio.us’ for: tag feature, which allows users to bookmark urls for their friends invisibly.

LiveMarks

LiveMarks is a project to show del.icio.us and other services’ bookmarks live.

On the left of LiveMarks you can see most recently popular bookmarks. On the right, bookmarks scroll by as people bookmark them.

Clicking on .oO links launches them in a new browser window.
