PSI glitch?

Encounter an error, or something which isn't working correctly? Please, let us know
djross
Posts: 1239
Your TCI: na
Joined: Sun Apr 16, 2006 12:56 am

PSI glitch?

Post by djross »

Maybe calling this a bug is going a bit far, but it is something I've never quite understood. The formula for generating a PSI for a film seems quite straightforward on the surface, but it also seems that there is an extra step that means that the prediction is shifted a little so that it matches scores that the particular user actually uses. This at least is how it has always seemed. But very occasionally there is a film whose PSI hasn't followed this last step: for example, in my case the PSI for Pastoral: To Die in the Country is 81, but I only use scores of 80 and 82.

Am I understanding how PSIs are generated correctly, and if so, is it some kind of bug that results in a PSI of 81 in this case?

By the way, if it were up to me, my vote would be to ditch this last step and make the PSI as "mathematically precise" as possible.

mpowell
Posts: 4193
Your TCI: na
Joined: Fri Sep 09, 2005 10:22 am

Re: PSI glitch?

Post by mpowell »

Good question! We don't actually "bump" the PSI to match up with the rating scale of a user. We can check this with a straightforward test user that has 10 ratings: 10, 20, 30, and so on through 100.

The PSIs for that user run the gamut from 0-100. Like, if the average "percentile" is 13%, the given PSI for the film is 13. So, I think it's working as you would prefer!

djross
Posts: 1239
Your TCI: na
Joined: Sun Apr 16, 2006 12:56 am

Re: PSI glitch?

Post by djross »

I don't think that can be right. 99% of the PSIs fall on scores that I do use, even though I use far fewer than half the numbers. For example, there are about a hundred PSIs of 80, far more than a hundred PSIs of 78, and zero PSIs of 79. This matches the fact that I don't use 79 for any of my ratings. And it corresponds to my normal experience that PSIs are (almost) always a number that I do use. Something more is going on here.

porzano
Posts: 9
Your TCI: na
Joined: Tue May 05, 2015 7:00 pm

Re: PSI glitch?

Post by porzano »

Say what?

I only use even numbers 2-4-6-8 all the way up to 100
and I almost never get a prediction with any other numbers than those.
My thought has always been that those very uncommon predictions with an uneven number
only occur when the prediction lands in the dead centre between two of the numbers that I use.

So that would have meant that someone using 10-20-30 etc could theoretically
get a 55 but never a 53 or 57 in a prediction.
So I was totally wrong on that assumption it seems :)

Then why don't I get lots of predictions with odd numbers?
Are different things happening depending on how finely graded a scale you use?

And finally I like it the way it works for me now and would not prefer finer graded predictions in between the scale that I use.
If I were to use 10-20-30 then I would have no interest in predictions of 14 or 16,
14 is TEN and 16 is TWENTY Basta!! ;)

djross
Posts: 1239
Your TCI: na
Joined: Sun Apr 16, 2006 12:56 am

Re: PSI glitch?

Post by djross »

Continuing the point, for me:
  • There are 5 PSIs of 88;
  • There are 0 PSIs of 87 (I don't use 87);
  • There are 12 PSIs of 86;
  • There are 0 PSIs of 85 (I don't use 85);
  • There are 31 PSIs of 84;
  • There are 0 PSIs of 83 (I don't use 83);
  • There are 28 PSIs of 82.
At the other end of the scale:
  • There are 7 PSIs of 0;
  • There are 0 PSIs of 1 to 19 (of these, I use only 1, 5, 10 and 15);
  • There are hundreds of PSIs of 20;
  • There are 0 PSIs of 21 to 24 (I don't use 21, 22, 23 or 24);
  • There are hundreds of PSIs of 25;
  • There are 0 PSIs of 26 to 29 (I don't use 26, 27, 28 or 29);
  • There are hundreds of PSIs of 30.
Given that there is no deliberate plan to nudge PSIs towards numbers used by the user, and given that there is no good reason why a prediction couldn't fall on any percentile, it seems to me that the explanation for this phenomenon must have to do with the way in which the final step is carried out: the step that "translates" the percentile generated by the PSI algorithm into a predicted rating. Something about how a user distributes their ratings can somehow mean that certain possible scores have no percentiles assigned to them.

This could be something that doesn't show up for a user who straightforwardly uses 10, 20, 30, etc., where there is lots of "space" between the used numbers to which percentiles could correspond. But for a user like me, who uses many but not all of the numbers, it can happen that a range of possible scores all correspond to one particular percentile. For example, for me the 0th percentile is covered by all the scores between 0 and 15, the 99th percentile corresponds to scores between 93 and 100, and the 98th percentile corresponds to scores between 91 and 92. So perhaps some kind of rounding process means that only one score is used for each percentile.

For me, the 96th percentile corresponds to a score of 88, while the 95th percentile corresponds to a score of 86 and the 94th percentile corresponds to a score of 84. Somehow the "translation" from the generated percentile to the predicted rating "follows" this, and the system decides that no percentile corresponds to a predicted score of 87 or 85. But then, a score of 82 corresponds to the 92nd percentile, and a score of 80 corresponds to the 90th percentile, so to what score are the 93rd and 91st percentile assigned, and why are there zero PSIs of 83 and a handful of PSIs of 81? The only answer I can think of is that the 93rd percentile must end up closer to either a score of 84 or 82 rather than 83, whereas the 91st percentile ends up closer to a score of 81 rather than 82 or 80.

But even here, I don't quite understand. As mentioned, Criticker tells me that when I give a film a score of 82, that corresponds to the 92nd percentile. But in fact, when I click on a PSI of 82 (as opposed to a score of 82 that I've actually given), the explanation of how that number is assigned says that "the 93rd percentile translates to a rating of 82". And when I click on a PSI of 81, the explanation of how that number is assigned says that "the 92nd percentile translates to a rating of 81". So there seems to be a discrepancy between scores and PSIs in terms of the percentile/rating relationship: the 92nd percentile corresponds to different things depending on whether it is a score I've given (in which case it corresponds to 82) or a PSI (in which case it corresponds to 81), and "82" can correspond to a percentile of 92 (if it's a score) or 93 (if it's a PSI). One possible explanation for such a discrepancy might be that rounding does not take place in the "translation" from percentile to PSI. But in that case, I still have no idea why, as is clearly the case, the PSI algorithm overwhelmingly tends to use numbers that I do in fact use for ratings.

P.S. The scores I do use are: 0, 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100. That is, 36 out of the 101 possibilities.

mpowell
Posts: 4193
Your TCI: na
Joined: Fri Sep 09, 2005 10:22 am

Re: PSI glitch?

Post by mpowell »

Alright, that is odd! You both are right... and now I'm wondering why in my test case it did provide me with PSIs that weren't exactly the same as my other ratings.

This is going to take some looking into. I'll open a report, based on DJross's original observation. My test case might end up being related to that.

mpowell
Posts: 4193
Your TCI: na
Joined: Fri Sep 09, 2005 10:22 am

Re: PSI glitch?

Post by mpowell »

Okay, we've taken a deeper look into this, and better understand the behavior. Basically, what we're doing is looking at the "Percentile" of the prediction, and then placing that into your list of ratings at the appropriate spot.

As an example, if you've got 10000 ratings, and we guess that the percentile of FilmX is 47.245, then it would come in at spot #4724.5 in your list of ratings.

So, we look at the score you've given #4725 and #4724, and calculate FilmX's score to be between those.

Usually, what happens for users who have rated many films (and this is what explains the behavior you're reporting) is that the ratings for #4725 and #4724 will be the same: they're both titles you've rated 60, for example. So the average of "60" and "60" is "60". Only on the rare occasions when something like the following happens will you get PSIs for scores you don't use:

PSI Percentile: 38.234%
Rated Title #3824 = 42
Rated Title #3823 = 40

The PSI of a title in spot #3823.4 would be 41 (technically 40.8).
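
For anyone who wants to play with the numbers, here is a minimal Python sketch of the placement-and-interpolation step as described above. It is not Criticker's actual code: the function name `psi_from_percentile`, the 1-based spot numbering, the clamping at the ends of the list, and the final rounding are all assumptions made for illustration.

```python
import math

def psi_from_percentile(sorted_ratings, percentile):
    """Place a predicted percentile into a user's ratings (sorted ascending)
    and interpolate between the two neighbouring ratings.

    Hypothetical helper: spots are counted from 1, as in "#3823" above.
    """
    n = len(sorted_ratings)
    spot = percentile / 100 * n              # e.g. 38.234% of 10000 -> 3823.4
    lower = min(max(math.floor(spot), 1), n)  # clamp to a valid spot (assumption)
    upper = min(max(math.ceil(spot), 1), n)
    frac = spot - math.floor(spot)            # how far between the two spots
    low_score = sorted_ratings[lower - 1]
    high_score = sorted_ratings[upper - 1]
    return low_score + frac * (high_score - low_score)

# Made-up list of 10000 ratings in which title #3823 is a 40 and #3824 is a 42.
ratings = [40] * 3823 + [42] * 6177
raw = psi_from_percentile(ratings, 38.234)
print(raw, round(raw))   # about 40.8, which rounds to 41

# When both neighbours carry the same score, the result is just that score.
print(psi_from_percentile([60] * 10000, 47.245))
```

The only way to get a value the user never uses is for the spot to fall exactly on a boundary between two different scores, which is what the 40/42 case above illustrates.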

Hopefully, this explains why it's more common to see PSIs that land right on numbers you tend to use, but that it's possible to see those which don't!
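
To get a rough feel for how rare the in-between PSIs should be, the sketch below sweeps evenly spaced percentiles through the same interpolation and counts how often the rounded result is a number the user never uses. Everything here is illustrative: the helper names, the rounding, and especially the made-up assumption of 100 ratings per score for the heavy user (the score list itself is the one djross posted earlier in the thread). The second user is the ten-rating test user (10, 20, ..., 100).

```python
import math

def psi_from_percentile(sorted_ratings, percentile):
    """Interpolate between the two ratings around the predicted spot
    (hypothetical helper, 1-based spots, clamped at both ends)."""
    n = len(sorted_ratings)
    spot = percentile / 100 * n
    lower = min(max(math.floor(spot), 1), n)
    upper = min(max(math.ceil(spot), 1), n)
    frac = spot - math.floor(spot)
    low_score = sorted_ratings[lower - 1]
    high_score = sorted_ratings[upper - 1]
    return low_score + frac * (high_score - low_score)

def share_off_scale(sorted_ratings, steps=10000):
    """Fraction of evenly spaced percentiles whose rounded PSI is a
    number that never appears among the user's own ratings."""
    used = set(sorted_ratings)
    off = 0
    for i in range(steps + 1):
        percentile = 100 * i / steps
        if round(psi_from_percentile(sorted_ratings, percentile)) not in used:
            off += 1
    return off / (steps + 1)

# Scores djross reports using (36 of the 101 possible values).
DJROSS_SCALE = [0, 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70,
                72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 91, 92, 93, 94, 95,
                96, 97, 98, 99, 100]

# Assumption for illustration only: every score has been given 100 times.
heavy_user = sorted(score for score in DJROSS_SCALE for _ in range(100))
# The ten-rating test user from earlier in the thread.
test_user = list(range(10, 101, 10))

print("heavy user, share of off-scale PSIs:", share_off_scale(heavy_user))
print("test user, share of off-scale PSIs:", share_off_scale(test_user))
```

Under these assumptions, the heavy user has only 35 boundaries between different scores among thousands of neighbouring pairs, so off-scale PSIs are rare, while for the ten-rating user almost every neighbouring pair is a boundary, which would be consistent with the test account seeing PSIs across the whole 0-100 range.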
