Extra Credit: Audio Scrobbler

Due: Wednesday, December 7 (max 10 points)

In this assignment, you will investigate the AudioScrobbler system used to power the last.fm site and experiment with music data stored on the Duke Scrobbler site.

Part 1: Gathering Data

You should have already signed up for AudioScrobbler from Lab 3. In this part of the assignment you will:
  1. Follow the directions in the Duke Scrobbler documentation to log on to our version of AudioScrobbler.
  2. Sign up for last.fm and the CompSci 1 group. See the help page for more information.

    Writeup: Record your last.fm and facebook logins.

  3. Listen to at least 15 songs on your iPod or computer and upload the listening data to both last.fm and Duke Scrobbler. The songs that you listen to should reflect your current musical tastes.

    Check your profile page to see if your tracks are displayed. It sometimes takes a few days for the tracks to show up on last.fm, while tracks should show up immediately after being submitted with Duke Scrobbler. See the last.fm help pages or send mail to scobbler-admin@cs.duke.edu for more information.

  4. In a paragraph, characterize your music interests. How do the songs that you listened to reflect those interests?

Part 2: Recommender Systems

You can view everyone's top artist and track lists. last.fm has a number of services set up so that you can find people with similar musical interests, listen to your favorite music, and discover new music that you should like.

Most of these services are based around the concept of a neighbor. Your neighbors are supposed to be people whose musical taste is similar to yours. How are neighbors calculated? Here's what they have to say:

We have developed an especially perverted type of probabilistic latent semantic analysis. Profiles are decomposed using a custom algorithm based on relative popularity of items, then organised using latent class analysis.

The authors appear to be deliberately vague here, but there is a great deal of work on such systems. Latent semantic analysis is often used in collaborative filtering systems. Collaborative filtering systems make predictions about the interests of a user by generalizing from taste information collected from the whole user community. AudioScrobbler is a type of recommender system that collects data on user behavior and uses collaborative filtering to recommend other songs. The details of latent semantic analysis may be beyond the scope of this course, but the general ideas are still somewhat accessible.
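
To make the idea of collaborative filtering concrete, here is a minimal sketch in Python. It is not the algorithm last.fm uses (they describe a form of probabilistic latent semantic analysis); it swaps in a much simpler user-based approach that compares play-count profiles with cosine similarity. The user names, artist names, play counts, and the functions cosine_similarity and recommend are all made up for illustration.

```python
import math

# Hypothetical play counts (assumption, not real data): user -> {artist: plays}.
plays = {
    "alice": {"Artist A": 10, "Artist B": 4, "Artist C": 2},
    "bob":   {"Artist A": 8, "Artist B": 5, "Artist D": 6},
    "carol": {"Artist E": 9, "Artist F": 7},
}


def cosine_similarity(a, b):
    """Cosine of the angle between two play-count profiles (0.0 to 1.0)."""
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, plays, top_n=3):
    """Rank artists the user has not played, weighted by how similar
    the users who did play them are to this user."""
    own = plays[user]
    scores = {}
    for other, profile in plays.items():
        if other == user:
            continue
        sim = cosine_similarity(own, profile)
        if sim <= 0:
            continue  # users with no overlap contribute nothing
        for artist, count in profile.items():
            if artist not in own:
                scores[artist] = scores.get(artist, 0.0) + sim * count
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


if __name__ == "__main__":
    print(recommend("alice", plays))
```

In this toy data, alice and bob share two artists, so bob's remaining artist is suggested to alice, while carol shares nothing with anyone and receives no recommendations. A real system would also have to deal with very popular artists dominating every profile, which is roughly what the "relative popularity of items" in the quote above is hinting at.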

There are two good survey papers on Blackboard:

Using the information from the papers, the data from Duke Scrobbler and the last.fm profiles, and the help pages and forums on the last.fm site, answer as many of the following questions as you can, to the best of your ability.

    Duke Scrobbler

  1. Who would you rank as your closest neighbors on Duke Scrobbler? Why?

  2. Given the data that is available from Duke Scrobbler and the readings, describe an algorithm that computes a metric of how similar two users' musical tastes are.

    You should use the View TASTE file utility from Duke Scrobbler to see what the data would look like for your proposed algorithm. Taste is a system for making recommendations based on different collaborative filtering algorithms.

    Last.fm

  3. How similar are your musical tastes to those expressed by your top neighbor in last.fm? Are many of the songs listed in his or her profile in your collection as well? What percentage of the Top Tracks - Overall listed in his or her profile do you own or listen to frequently? What about Top Artist - Overall?
  4. If you have a neighbors page, go to it and note your neighbors. Based on the observed neighbor assignments and the readings on recommender systems, how do you think the match values are calculated? What information is used from the playlist? What information is used from the actual music files? I am not interested in the absolute match values, but rather the relative values. For example, in what kind of system would profile A be a closer match than profile B, and vice versa?
  5. Design an experiment to test your hypothesis from the previous question.
  6. Who would be your neighbors from the CompSci 1 group? How could you create a graph of the connections between users? What would the vertices and edges represent?

    Feedback

  7. What do you think of the Duke Scrobbler interface and documentation? How does it compare to last.fm? Would you recommend the use of Duke Scrobbler to a friend? What tasks did you find particularly difficult or confusing?

Submitting

Write up your results and submit as either a Word or PDF document via Blackboard.