Fri, May 13, 2016

Attribyte's News Algorithm

Facebook’s algorithms have been in the news lately because it appears they are being influenced by humans. This is as good a time as any to clear the air: Attribyte’s news algorithm may also be biased by its controllers! In this post, I’ll try to describe how the algorithm works and where the bias is injected.

Attribyte’s applications cover specific topics, like tech news or the upcoming election, so each starts with a selection of sites relevant to the subject. These sources are culled manually; it is likely there are some great sites that won’t be considered. Attribyte finds articles by crawling RSS feeds. If a site doesn’t have a feed that can be discovered automatically, or found by a cursory look for an RSS icon, it won’t be included. The content of the feed also matters. If the feed’s posts don’t contain markup, outbound links can’t be extracted for use by the algorithm. The selection of sites may reflect some personal bias. Technical choices made by publishers may bias the algorithm as well.
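To make the point about markup concrete, here is a minimal sketch of pulling outbound links from a post body. It is not Attribyte's actual code: the names `LinkExtractor` and `outbound_links` are made up for illustration, and treating any link to a different host as "outbound" is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags in a post body (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def outbound_links(post_html, site_host):
    """Return absolute http(s) links whose host differs from site_host.

    Assumption: "outbound" simply means "different host"; relative links
    and non-web schemes are ignored.
    """
    parser = LinkExtractor()
    parser.feed(post_html)
    parser.close()
    result = []
    for href in parser.links:
        parts = urlparse(href)
        if parts.scheme in ("http", "https") and parts.hostname and parts.hostname != site_host:
            result.append(href)
    return result
```

A feed that delivers the same post as plain text gives this kind of extractor no anchor tags to find, so the post contributes no links.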

Short “quick link” posts that once appeared on personal blogs now mostly appear on Twitter. In addition to tweeting top stories every so often, @attribyte follows a few writers (and friends). For the tech site, their links are added to the mix, mostly as a test. (It isn’t clear if using Twitter as input makes the results better.)

Now that we have our human-biased input, let’s run the algorithm:

  1. Extract all the outbound links from posts made in the past day.
  2. Canonicalize them.
  3. Count the number of times each canonicalized link appears.
  4. Filter out those that don’t seem to be links to an article or tweet. (Bias!)
  5. Filter out links that appear to be “self” links within a blog network or related properties. (Bias?)
  6. Filter out those that appear fewer than N times. (Bias!)
  7. Match the top links to entries already in the system. If there’s no match, attempt to crawl the page to build an entry from metadata.
  8. Remove near-duplicate entries. (Bias?)
  9. Sort the remaining entries in chronological order to build the news page.
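Steps 2 through 6 can be sketched roughly as follows. This is an illustration under stated assumptions, not the production code: the canonicalization rules, the `looks_like_article` hook, and the "same host" test for self links are all stand-ins, and the self-link check is folded into the counting loop for brevity. Steps 7 through 9 depend on the entry store and are left out.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonicalize(url):
    """Assumed rules: lowercase scheme and host, drop the fragment and
    utm_* tracking parameters, and trim a trailing slash from the path."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


def top_links(posts, min_count, looks_like_article=lambda url: True):
    """posts: iterable of (source_host, [outbound links]) from the past day.

    Returns (canonical_url, count) pairs that appear at least min_count
    times, most-linked first.
    """
    counts = Counter()
    for source_host, links in posts:
        for link in links:
            canonical = canonicalize(link)
            # Simplified "self" link test: same host as the linking site.
            if urlsplit(canonical).hostname == source_host:
                continue
            # Caller-supplied guess at "is this an article or tweet?"
            if not looks_like_article(canonical):
                continue
            counts[canonical] += 1
    kept = [(url, n) for url, n in counts.items() if n >= min_count]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))
```

Every parameter here is a place where a human choice leaks in: what counts as the same link, what looks like an article, and how large N should be.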

Please notice that the output results don’t have to be part of the input entry set. For example, the two entries shown in the image above are not from sites that Attribyte follows for “politics.” This provides a level of serendipity and counteracts, I think, some of the human bias. It is also a good way for Attribyte to discover sites that should be included in the input set.

Thu, Nov 12, 2015

Attribyte Everywhere

Good news, everyone! Attribyte is now available on the desktop or as a mobile-friendly web application. If your device runs iOS, open this link in Safari and follow the directions to install Attribyte to your home screen as an app.

Trying Attribyte is both free and risk-free. It doesn’t matter if you are using our Android app or the web version. We don’t make you register, sign in, or connect your Facebook or Twitter account to use Attribyte. Favorites and settings are stored using your browser’s local storage capability, not on our servers. We hope you’ll give it a try.

Mon, Feb 2, 2015

Introducing Attribyte

I’m happy to announce that Attribyte’s Android app is now available for free in the Google Play store.

Attribyte was formed in 2008 without any particular product in mind, just the following abstract:

“As the number of publishers and publishing applications accelerates it becomes impossible to consume the full stream of content. Aggregation, filtering, and algorithmic recommendation are essential, especially for the general public who are not inclined to invest the time required to craft their own reading experience using low-level tools. Existing solutions control what you see based on the whim of an algorithm, the opinion of an editor, or the collective wisdom of friends. These schemes may succeed in throttling the flow to a reasonable level, but they don’t provide any real personalization. Is there really a relationship between what your friends like and what you are interested in right now? Algorithms tend to surface only popular stories, not the best ones, or the ones that matter to you. Curators introduce their own bias, even if unintended. The reader should control what they see on their own personalized digest.”

Six years of development happened between this idea and today’s launch. In subsequent posts I’ll write more about what we’ve learned along the way, the system architecture, and some of the open-source projects used and created to produce the app…which wouldn’t have happened at all without a lot of help.

Attribyte is a family business. My wife, Adrienne, is a reliable editor, keeps me fed and watered, and reminds me to go outside sometimes. Between jobs, my brother Jake managed the branding and interaction design, and found some excellent partners to help create the product. Wade worked with us to craft a brand identity and created all the graphics. Last but not least, Nigel at Atomirex developed the Android app and helped us greatly improve our original design.