An Evaluation of PSI Accuracy

PrestoBix
Posts: 83
2381 Ratings
Joined: Thu Aug 20, 2015 6:48 am

Post by PrestoBix »

I have completed an analysis of PSI accuracy, so we can see to what degree the metric gives good predictions.

For reference, I use a 0 to 100 rating system.

Below are the summary and tabulation of the absolute difference between my score and the given PSI. Look at the tabulation, because that is where it tells you what percent of the time the PSI gets within 1 point, 2 points, 5 points, etc.

I had 2,021 ratings at the time, only 1,947 of which had PSIs, so that is the number of observations.

Summary:

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        diff |      1,947    9.667694    10.76866          0         78

Tabulation:

       diff |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        103        5.29        5.29
          1 |        165        8.47       13.76
          2 |        163        8.37       22.14
          3 |        170        8.73       30.87
          4 |        150        7.70       38.57
          5 |        147        7.55       46.12
          6 |        122        6.27       52.39
          7 |        108        5.55       57.94
          8 |        101        5.19       63.12
          9 |         74        3.80       66.92
         10 |         65        3.34       70.26
         11 |         56        2.88       73.14
         12 |         57        2.93       76.07
         13 |         42        2.16       78.22
         14 |         43        2.21       80.43
         15 |         36        1.85       82.28
         16 |         26        1.34       83.62
         17 |         24        1.23       84.85
         18 |         24        1.23       86.08
         19 |         23        1.18       87.26
         20 |         17        0.87       88.14
         21 |         16        0.82       88.96
         22 |         11        0.56       89.52
         23 |         13        0.67       90.19
         24 |          8        0.41       90.60
         25 |         12        0.62       91.22
         26 |         11        0.56       91.78
         27 |         15        0.77       92.55
         28 |          7        0.36       92.91
         29 |         15        0.77       93.68
         30 |          4        0.21       93.89
         31 |          4        0.21       94.09
         32 |          7        0.36       94.45
         33 |          5        0.26       94.71
         34 |         15        0.77       95.48
         35 |          3        0.15       95.63
         36 |          5        0.26       95.89
         37 |          8        0.41       96.30
         38 |          4        0.21       96.51
         39 |          9        0.46       96.97
         40 |          2        0.10       97.07
         41 |          1        0.05       97.12
         42 |          8        0.41       97.53
         43 |          3        0.15       97.69
         44 |          5        0.26       97.95
         45 |          1        0.05       98.00
         46 |          7        0.36       98.36
         48 |          3        0.15       98.51
         49 |          4        0.21       98.72
         50 |          4        0.21       98.92
         51 |          3        0.15       99.08
         52 |          1        0.05       99.13
         53 |          1        0.05       99.18
         54 |          3        0.15       99.33
         55 |          2        0.10       99.44
         56 |          1        0.05       99.49
         57 |          1        0.05       99.54
         60 |          1        0.05       99.59
         61 |          1        0.05       99.64
         62 |          2        0.10       99.74
         63 |          2        0.10       99.85
         64 |          1        0.05       99.90
         67 |          1        0.05       99.95
         78 |          1        0.05      100.00
------------+-----------------------------------
      Total |      1,947      100.00

So, you can see that on average the PSI is off by 9.668 points. I should note that on average the PSI is 5 points higher than my score, so on the whole it tends to overshoot more than undershoot. The vast majority of the major misses are also overshoots. This may be something particular to me: I sometimes disagree with the consensus that a film is good, but rarely disagree with a consensus that a film is bad. If I remove all outliers, the average improves to 3.43.

The median difference is 6, and without outliers is still 6 (no surprise there).
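
For anyone who wants to re-derive the "within k points" figures, here is a minimal Python sketch. It only re-adds the first eleven rows of the tabulation above (differences 0 through 10) and divides by the 1,947 observations. The names `freq`, `pct_within`, and `median_diff` are made up for illustration and are not part of the original analysis.

```python
# Frequencies of |score - PSI| for differences 0..10, copied from the tabulation.
freq = {0: 103, 1: 165, 2: 163, 3: 170, 4: 150, 5: 147,
        6: 122, 7: 108, 8: 101, 9: 74, 10: 65}
TOTAL = 1947  # number of ratings that had a PSI


def pct_within(k):
    """Percent of ratings where the PSI missed by at most k points (k <= 10 here)."""
    hits = sum(count for diff, count in freq.items() if diff <= k)
    return round(100 * hits / TOTAL, 2)


def median_diff():
    """Smallest difference whose cumulative count reaches the middle observation."""
    middle = (TOTAL + 1) / 2  # TOTAL is odd, so this is the 974th observation
    running = 0
    for diff in sorted(freq):
        running += freq[diff]
        if running >= middle:
            return diff
    raise ValueError("median lies beyond the rows copied into freq")


for k in (1, 2, 5, 6, 10):
    print(f"within {k:>2} points: {pct_within(k)}%")
print("median difference:", median_diff())
```

The printed percentages should line up with the Cum. column of the tabulation, and the median falls at 6 because the cumulative count first passes the halfway mark in that row.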

The largest miss was Detroit (2017) at 78 over, which is also my lowest rated film.

The largest low miss was Piranha II, at 29 below, which is understandable because I rated it well as a so-bad-it's-good experience.

Additionally, my PSI and eventual score have a correlation coefficient of 0.7, which sits right at a commonly used lower bound for "strong correlation." Interestingly, my PSI has a correlation coefficient of 0.86 with average IMDB ratings, which is quite strong.

In a regression where my score is the dependent variable and PSI is the only explanatory variable, this is the outcome:

      Source |       SS           df       MS      Number of obs   =     1,947
-------------+----------------------------------   F(1, 1945)      =   1845.19
       Model |  329784.882         1  329784.882   Prob > F        =    0.0000
    Residual |  347624.229     1,945   178.72711   R-squared       =    0.4868
-------------+----------------------------------   Adj R-squared   =    0.4866
       Total |   677409.11     1,946  348.103346   Root MSE        =    13.369

------------------------------------------------------------------------------
       score |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         psi |   1.229146   .0286143    42.96   0.000     1.173028    1.285264
       _cons |  -22.50157   2.207148   -10.19   0.000     -26.8302   -18.17295

Almost 49% of the variance in my scores is explained by PSI. This compares very well to other variables.
  • PSI: 49%
  • IMDB Ratings: 35%
  • Rotten Tomatoes scores: 35%
  • Metacritic scores: 32%
  • Budget: 27%
  • Date Rated: 10% (I rated harder over time)
  • Oscar Nominations: 9%
  • Oscar Wins: 5%
  • Year: 3%
  • Runtime: 1%
  • Box Office: 0%
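
As a sanity check on the PSI regression output above, the headline statistics can be rebuilt from the sums of squares and coefficients it reports. This is only a sketch: the variable names and the `predicted_score` helper are mine, and the numbers are typed in from the output rather than recomputed from the raw ratings.

```python
import math

# Values copied from the regression output (score regressed on psi).
SS_MODEL, SS_TOTAL = 329784.882, 677409.11
MS_MODEL, MS_RESID = 329784.882, 178.72711
COEF_PSI, CONS = 1.229146, -22.50157

r_squared = SS_MODEL / SS_TOTAL   # share of variance explained by PSI
f_stat = MS_MODEL / MS_RESID      # F(1, 1945)
root_mse = math.sqrt(MS_RESID)    # typical size of a residual, in rating points
# With a single predictor and a positive slope, the correlation is sqrt(R-squared).
r = math.sqrt(r_squared)


def predicted_score(psi):
    """Score implied by the fitted line for a given PSI."""
    return CONS + COEF_PSI * psi


# PSI value at which the fitted line predicts a score equal to the PSI itself.
break_even_psi = -CONS / (COEF_PSI - 1)

print(f"R-squared = {r_squared:.4f}, r = {r:.2f}")
print(f"F = {f_stat:.2f}, Root MSE = {root_mse:.3f}")
print(f"PSI 80 -> predicted score {predicted_score(80):.2f}")
print(f"fitted line crosses score = PSI near PSI {break_even_psi:.1f}")
```

The fitted line only catches up with the PSI near the very top of the scale, which fits the observation that the PSI tends to overshoot my scores.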

If I try to create a predictor of my own to compete with PSI, even by using as much data as I can stuff into it (11 numerical variables and 168 boolean/dummy variables), I can only get the adjusted R-squared to 0.55, meaning 55% of the variance in my score is explained by the data. If I add PSI to that model, it improves a bit, to 0.58. While this is actually quite high for a measurement of human behavior, it really goes to show how excellent the PSI is as a predictor: it is miles more elegant than the system I concocted and produces nearly the same quality of results.
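
To show why the adjusted figure is the fair one to quote for a model with that many variables, here is a small sketch of the standard adjusted R-squared formula, 1 - (1 - R²)(n - 1)/(n - p - 1). The function names are hypothetical. The last calculation assumes all 179 variables (11 + 168) and all 1,947 observations were used in the big model, which the post does not actually state, so treat that number as illustrative only.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared: penalizes R-squared for the number of predictors."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


def r2_from_adjusted(adj_r2, n_obs, n_predictors):
    """Invert the formula to recover the unadjusted R-squared."""
    return 1 - (1 - adj_r2) * (n_obs - n_predictors - 1) / (n_obs - 1)


# Check against the PSI-only regression: 1,947 observations, 1 predictor.
psi_r2 = 329784.882 / 677409.11
print(f"PSI-only adjusted R-squared: {adjusted_r2(psi_r2, 1947, 1):.4f}")

# ASSUMPTION: 179 predictors and 1,947 observations in the big model.
implied_raw = r2_from_adjusted(0.55, 1947, 179)
print(f"raw R-squared implied by adjusted 0.55: {implied_raw:.3f}")
```

With one predictor the adjustment barely moves the number, while with 179 predictors it takes a few points off, so comparing adjusted values keeps the big model from looking better just for being big.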

So, how does PSI do? It passes, and with flying colors. It astounds me that 50% of the time it is able to get within 6 points.

Brantank
Posts: 1
227 Ratings
Joined: Thu Sep 02, 2021 9:49 pm

Re: An Evaluation of PSI Accuracy

Post by Brantank »

Seeing as how it's accurate for you, I was wondering if you could let me know how you rate things. My recommendations are usually not accurate.

I personally grade TV shows and movies on the American school grading scale:

A+ (100-97)
A (96-93)
A- (92-90)
B+ (89-87)
B (86-83)
B- (82-80)
C's are in the 70s, D's are in the 60s, and F's are anything below 60.
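
In case it helps to compare the two systems, the scale above can be written as a lookup from a 0 to 100 score to a letter. This is just a sketch with a made-up function name, and it assumes C and D have no plus or minus steps, since none are listed.

```python
# Lowest score for each grade, from the scale above (highest grade first).
# ASSUMPTION: C and D are single bands with no plus/minus variants.
CUTOFFS = [(97, "A+"), (93, "A"), (90, "A-"),
           (87, "B+"), (83, "B"), (80, "B-"),
           (70, "C"), (60, "D")]


def letter_grade(score):
    """Map a 0-100 score to its letter grade; anything below 60 is an F."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for floor, letter in CUTOFFS:
        if score >= floor:
            return letter
    return "F"


for s in (100, 91, 85, 74, 59):
    print(s, letter_grade(s))
```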

Do you have a better method for rating?
