Development Popularity algorithm?

Discussion in 'Software' started by ch424, 15 Apr 2009.

  1. ch424

    ch424 Design Warrior

    Joined:
    26 May 2004
    Posts:
    3,112
    Likes Received:
    41
    Hi everyone

    I'm developing a website (http://www.thealligatoronline.com) where I want a 'most popular' box. You can see it on the homepage by scrolling down about halfway.

    Every time someone loads a page, the server stores the page address ('page') and a timestamp ('datetime') in a database.

    The SQL query to work out the popularity is
    Code:
    SELECT page,
    COUNT(page) AS hits,
    SUM(POW(1.1,(datetime-NOW())/86400)) AS popularity
    FROM stats_table
    WHERE page LIKE 'article/%' 
    GROUP BY page
    ORDER BY hits DESC 
    LIMIT 0,10
    As you can see, the popularity index for each page is

    SUMviews( 1.1-{age of each view in days} )

    However, this doesn't work very well. It gives too much preference to recent hits. I could replace 1.1 with a lower number, and keep tweaking until it's right, or I could ask everyone here for a different idea altogether! :p

    So, what would you do to calculate popularity?

    Thanks!
    ch424
     
  2. capnPedro

    capnPedro Hacker. Maker. Engineer.

    Joined:
    11 Apr 2007
    Posts:
    4,381
    Likes Received:
    241
    Calculate the number of views per day it's been up, then work from that basis?
     
  3. Firehed

    Firehed Why not? I own a domain to match.

    Joined:
    15 Feb 2004
    Posts:
    12,574
    Likes Received:
    16
    Your query doesn't appear to actually have the views in the popularity algorithm, unless you're doing the math for it somewhere else.
     
  4. capnPedro

    capnPedro Hacker. Maker. Engineer.

    Joined:
    11 Apr 2007
    Posts:
    4,381
    Likes Received:
    241
    SELECT page,
    COUNT(page) AS hits,

    It's that. Every time a page is viewed another row is inserted. (Caveat: this table will get biiiiiiiig for a large site). You're probably better off using a different method of tracking.
     
  5. azrael-

    azrael- I'm special...

    Joined:
    18 May 2008
    Posts:
    3,852
    Likes Received:
    124
    You mean something like a variable for storing the number of hits? :)
     
  6. ch424

    ch424 Design Warrior

    Joined:
    26 May 2004
    Posts:
    3,112
    Likes Received:
    41
    It's not as simple as I first said though - the database also stores the referrer. It needs to store each hit as a row, along with user and referrer data because when the site gets bigger, I eventually want to do 'you may also be interested in' like last.fm does. I know the last.fm/audioscrobbler database is actually just text files, and they write their own analysing software, rather than using a SQL database. At the moment (a few thousand rows in the table) that query takes 0.002 seconds to execute, so there are no size worries. When execution time gets longer, I'll just cache the results and only calculate every 5 minutes or whatever.

    capnPedro, thanks for that idea. I've given it a go and will see how well it works! :)
     

Share This Page