
TCI should be a square deviation

Posted: Fri Oct 20, 2017 12:24 pm
by andr
TCI should be based on the squared differences of film ratings (a root-mean-square deviation) instead of just the mean difference of ratings.
Let's assume there are two users, User1 and User2, and we need to calculate the TCI between them. Assume they have just two films in common, x and y. User1 ranked the films in tiers x1, y1 and User2 ranked them in tiers x2, y2.
Let's compare two scenarios:
1. |x1 - x2| = 1; |y1 - y2| = 1
2. |x1 - x2| = 0; |y1 - y2| = 2

With the current implementation the TCI is the same in both scenarios:
1. TCI = (1 + 1)/2 = 1
2. TCI = (0 + 2)/2 = 1

But I believe a difference of 2 tiers should weigh more than two differences of 1 tier. Instead of the average tier difference, the TCI should be calculated as the square root of the average squared tier difference. This is similar to the distance between two points, which is the square root of the sum of the squared differences of their coordinates:

Code:

d = sqrt((x1-x2)^2 + (y1-y2)^2)


But here it would be an average square instead of a sum of squares:

Code:

TCI = sqrt(((x1-x2)^2 + (y1-y2)^2 + (z1-z2)^2 + ...)/n)

where
x1, y1, z1, ... are the rating tiers of User1 for films x, y, z, ...
x2, y2, z2, ... are the rating tiers of User2 for the same films,
n is the number of films the two users have in common.

So with the suggested implementation the TCI would differ between scenarios 1 and 2 above:
1. TCI = sqrt((1^2 + 1^2)/2) = 1
2. TCI = sqrt((0^2 + 2^2)/2) ≈ 1.41
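
For anyone who wants to check the numbers, here is a minimal Python sketch of the two scenarios above. It assumes the current TCI is the plain mean of the absolute tier differences, as described in this post. The function names mean_abs_tci and rms_tci and the sample tier values are made up for illustration; this is not Criticker's actual code.

```python
import math

def mean_abs_tci(tiers_a, tiers_b):
    """Mean absolute tier difference over the films both users rated."""
    diffs = [abs(a - b) for a, b in zip(tiers_a, tiers_b)]
    return sum(diffs) / len(diffs)

def rms_tci(tiers_a, tiers_b):
    """Square root of the mean squared tier difference."""
    squares = [(a - b) ** 2 for a, b in zip(tiers_a, tiers_b)]
    return math.sqrt(sum(squares) / len(squares))

# Hypothetical tiers for films x and y (User1 first, User2 second).
scenario_1 = ([5, 5], [4, 6])  # tier differences of 1 and 1
scenario_2 = ([5, 5], [5, 7])  # tier differences of 0 and 2

for name, (user1, user2) in (("1", scenario_1), ("2", scenario_2)):
    print(name, mean_abs_tci(user1, user2), round(rms_tci(user1, user2), 2))
```

The mean-based value is 1.0 in both scenarios, while the root-mean-square value is 1.0 in scenario 1 and about 1.41 in scenario 2.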

Re: TCI should be a square deviation

Posted: Fri Oct 20, 2017 2:44 pm
by andr
It is possible to compare the current and the suggested TCI implementations and see which one gives more accurate predictions. To do so, one could sum up the differences between "Your Score" and "Your PSI was..." over all films already rated by every user. The average deviation (squared or not) between them would estimate the accuracy of each method.
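
A rough Python sketch of such a comparison, under the assumption that for each rated film we can get the actual score and the PSI each method would have predicted. All numbers below are invented purely for illustration, and the names mean_abs_error, rms_error, psi_current and psi_suggested are hypothetical; no real site data is used.

```python
import math

def mean_abs_error(actual, predicted):
    """Average absolute gap between actual scores and predicted PSIs."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rms_error(actual, predicted):
    """Root mean square gap between actual scores and predicted PSIs."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

# Invented example data: one user's scores for four films and the PSIs
# that two hypothetical TCI variants would have produced for them.
your_scores = [80, 65, 90, 40]
psi_current = [75, 70, 85, 50]
psi_suggested = [78, 60, 95, 45]

for label, psi in (("current", psi_current), ("suggested", psi_suggested)):
    print(label, mean_abs_error(your_scores, psi), rms_error(your_scores, psi))
```

Whichever method gives the lower error over a large set of real ratings would be the more accurate one. The toy numbers here say nothing about which method that is.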

Re: TCI should be a square deviation

Posted: Sun Oct 22, 2017 7:25 am
by djross
x

Re: TCI should be a square deviation

Posted: Tue Oct 24, 2017 11:52 am
by mpowell
Thanks for the suggestion. I have to say that it's unlikely we'll implement this as the "default" TCI calculation, as we like the simplicity of the current algorithm. But we do collect ideas like this, and appreciate the thought that goes into them.

Re: TCI should be a square deviation

Posted: Sat Oct 28, 2017 1:49 am
by lisa-
most importantly, the current method is easily visualizable by people with no mathematical background. the root mean square error would not be.

Re: TCI should be a square deviation

Posted: Sat Oct 28, 2017 9:32 am
by andr
Maybe the current method is sufficient for some, but I believe there is much to be improved. The following observations lead me to think so:

1. My "top user" is someone who rates all films as "100": https://www.criticker.com/rankings/elisaima/?p=1
I do not mind someone doing that, but why is this my closest user? I do not rate all films the same.

2. My PSIs for the same films often change a lot; they sometimes seem to jump back and forth from day to day. It seems the system cannot settle on a stable value for a PSI. I have not checked this, but maybe my top TCI list is not stable either and keeps changing.

3. There are films with high PSIs that I would not watch at all, and some films that are great from my viewpoint have a low PSI. I would expect the predictions to become more and more accurate as I rate more films, but that does not seem to be the case.

Re: TCI should be a square deviation

Posted: Sat Oct 28, 2017 9:36 am
by andr
lisa- wrote: most importantly, the current method is easily visualizable by people with no mathematical background. the root mean square error would not be.

Thank you for your comment. I believe an average user does not need or want to know anything about the method of TCI calculation. The user only needs accurate PSI predictions in my opinion.

Re: TCI should be a square deviation

Posted: Sat Oct 28, 2017 11:08 am
by BadCosmonaut
Even if the current method is sufficient, if the system can be better then that is something that should be investigated over time. Just because it's currently 'good enough' doesn't mean it should never be improved.

Re: TCI should be a square deviation

Posted: Thu Nov 09, 2017 8:43 am
by 5Z5qjRCfM2
I emailed the people who run the site about this shortly after I joined. If you want to prioritize accuracy over clarity, then there are public versions of the algorithm Netflix used to use
http://www.commendo.at/UserFiles/commen ... gChaos.pdf
which will no doubt be the most accurate option.

In the reply from the site operators, they said that they value that the algorithm they use can be easily understood. I think this is a virtue, too. It does almost as well as Netflix's algorithm. A machine learning solution is a black box; it just produces answers but usually doesn't give a comprehensible reason why.

I suspect they may also not want to go to the effort of implementing the bigchaos algorithm, which is understandable. I don't feel like my recommendations are inaccurate.

andr: you can ignore users. I don't know if that removes them from your PSI generation, but at least you won't see them on your TCI list.