Topic: Statistics update

anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Statistics update
« on: October 04, 2018, 01:50:58 PM »
Hi Sol (and all others interested),

I have reached out to three professors at the local college whose main research interests are in Bayesian statistics and computations and simulations (2 math and 1 engg) to see if I had made a logical error when I formulated the inference problem.

Note I am effectively cold-calling them, or in this case, cold-emailing. I will be around for another two-three weeks until I move on to my next FIRE destination, where I will have limited connections of any kind. But if they don't get back to me within a week they prob can't be bothered to help.

anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #1 on: October 05, 2018, 11:49:18 AM »
Update

The Engg professor said he's away and "other activities demand his attention at this time"

One math professor replied with some comments, let's take a look together.

First, what I sent to each of them:

Although we have never met, I hope you don't find this email too abrupt. I decided to contact you because you are an expert in Bayesian statistics and computation. I am wondering if I could trouble you for a minute to help me wrap my head around this "statistical inference puzzle" I made up, because I think I fell into a logic pitfall somewhere along the way.

Fundamentally this is an inference problem (I think). I will use a question I saw on mathsisfun as the template for the setup: https://www.mathsisfun.com/data/probability-false-negatives-positives.html

Suppose you suspect that you are allergic to something that 1% of the population is allergic to. You go to the clinic and take a test, and the test comes back positive.
The test has a false positive rate of 10% and a false negative rate of 20%. What's the chance that you are actually allergic?

The standard 2x2 table would look like this

                 Total (1% have it)    Test yes    Test no
Have allergy             10                8           2
Don't have              990               99         891
Total                  1000              107         893

So 107 people test positive but only 8 of them are actually allergic. Even with a positive test, your chance of actually being allergic is only about 7% (8/107), despite the test having only a 20% false negative rate.
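As a quick check on the arithmetic, here is a minimal Python sketch that rebuilds the 2x2 table from the three stated inputs (1% prevalence, 10% false positive rate, 20% false negative rate). The function name `two_by_two` and its parameter names are made up for illustration.

```python
def two_by_two(population, prevalence, fp_rate, fn_rate):
    """Split a population into the four cells of a 2x2 test table."""
    have = population * prevalence          # people who truly have the condition
    dont = population - have                # people who do not
    return {
        "true_pos": have * (1 - fn_rate),   # have it, test yes
        "false_neg": have * fn_rate,        # have it, test no
        "false_pos": dont * fp_rate,        # don't have it, test yes
        "true_neg": dont * (1 - fp_rate),   # don't have it, test no
    }

allergy = two_by_two(population=1000, prevalence=0.01, fp_rate=0.10, fn_rate=0.20)
test_yes = allergy["true_pos"] + allergy["false_pos"]
p_allergic_given_positive = allergy["true_pos"] / test_yes

print(f"test yes: {test_yes:.0f}")                                 # 107
print(f"P(allergic | test yes): {p_allergic_given_positive:.1%}")  # 7.5%
```

Changing only the prevalence argument shows how strongly the answer depends on the base rate.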

Now, I am going to apply the same method for "actually being a criminal". In this hypothetical case, the numbers are just an estimate with assumptions built in (let's assume they are appropriate), but that is not the main issue I am concerned about.
I am assuming 5% of the population are criminals, and 5% as the false allegation rate. Assuming only about 1/3 of crimes are reported, I am using 2/3 as the false negative rate.

The 2x2 table would look like this

                 Total (5% criminals)    Accused    Not accused
Criminals                50                 17           33
Not criminals           950                 48          902
Total                  1000                 65          935

So 65 people are accused but only 17 of them are actual criminals. That's a 26% chance (17/65) of a person being an actual criminal when accused of a crime, despite the false allegation rate being only 5%.
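The same number can be reproduced with Bayes' Theorem directly. This sketch follows the email's setup, which treats the 5% as P(accused | not a criminal) and 1 - 2/3 = 1/3 as P(accused | criminal). Whether those are the right inputs is a separate question from the arithmetic. The function name `posterior` is illustrative only.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' Theorem: P(hypothesis | evidence)."""
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1 - prior) * p_evidence_if_false)

p_criminal = 0.05              # assumed share of criminals in the population
p_accused_if_criminal = 1 / 3  # 1 minus the 2/3 "false negative" rate
p_accused_if_not = 0.05        # the 5% used here as a false positive rate

p = posterior(p_criminal, p_accused_if_criminal, p_accused_if_not)
print(f"P(criminal | accused): {p:.0%}")   # 26%
```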

Now my questions are (if you find this interesting enough to answer): Was my framing/formulation of the problem appropriate? What logic trap did I fall into that yielded this puzzling result? Under what circumstances can I or can't I set up the table like this?


--------------------------------------------------------------------

The reply from one of them (whom we will call Math Professor #1, in case the other math professor replies later). Bold added.

"I really haven't time for a long discussion about this, so I will just make a few comments.

The situations are really essentially the same in the two examples. The numbers are different, but both illustrate the same sort of application of Bayes' Theorem. The takeaway is that conditional probability calculations (though perfectly logical) are often counter-intuitive and often seem to go against (most people's) common sense.

As far as I could see (from a quick glance), your calculations are correct (up to rounding error). You framed it fine. There is no logic trap. I think you are just confusing / mixing up two different conditional probability statements.

The 26% and 5% are conditional probabilities for two different events, and either one may (in general) be larger or smaller than the other.

In the first example, the ordering was one way. In the second example, it was reversed.

If you want to read more, I suppose the main related ideas for you to look into are conditional probability and Bayes' Theorem.

I have not read it myself, but this book has many favourable reviews:

https://www.amazon.com/Theory-That-Would-Not-Die/dp/0300188226


So no, I was not wrong; my framing was correct. Let "The 26% and 5% are conditional probabilities for two different events, and either one may (in general) be larger or smaller than the other." sink in for a bit.

What it means: the false allegation rate is 5%; the chance the accused person is actually guilty is 26%. They are internally consistent and tell the SAME story.

These numbers are also consistent with observables such as the low rape conviction rate of 4-8%. For the longest time people have wondered how the conviction rate could be that low when the false allegation rate is only 5%; law enforcement and the justice system must all be rotten. To go from 95% to 4%! Unthinkable!

But no, that's not the case. Conviction rate = (convicted cases / ALL cases, whether charged or not), so at best the conviction rate should not exceed the 26% rate of actually being guilty. If we were omniscient every reported rape would result in a conviction, but sadly we are not.

Out of this 26%, not every case can be prosecuted (I am using roughly 1/3, i.e. 2/6), and of the cases that actually go to trial, only some result in conviction (I am using 50%).

What's 26/3/2? Lo and behold: ~4%.
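Spelled out as a chain of multiplications, the step above looks like the sketch below. The three inputs are the post's own assumed figures (26%, one in three prosecuted, half of trials ending in conviction), and the variable names are illustrative.

```python
p_guilty_given_accused = 0.26   # from the 2x2 crime table in the email
p_prosecuted = 1 / 3            # assumed share of those cases that can be prosecuted
p_convicted_at_trial = 1 / 2    # assumed share of trials ending in conviction

conviction_rate = p_guilty_given_accused * p_prosecuted * p_convicted_at_trial
print(f"{conviction_rate:.1%}")   # 4.3%
```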

Yes, to go from 26% to 4% is STILL BAD and SHOULD BE IMPROVED. But again, this tells the SAME story just like the 26% and 5%. What I presented to you are indeed counter intuitive, but it is internally consistent and perfectly logical.

Davnasty

  • Magnum Stache
  • ******
  • Posts: 2793
Re: Statistics update
« Reply #2 on: October 05, 2018, 12:16:51 PM »
1% of the entire population is allergic to x. 10% of the entire population receives a false positive when tested for allergies to x.

5% of the entire male population is a rapist. 5% of the subset of the male population accused of being a rapist is proven to have been falsely accused.

Based on the inputs you gave to Math Professor #1, their response was correct. The inputs were not. If we were only testing the population that has been accused* of being allergic to x, the results would be different.

For the record I'm not stating that any of these numbers are accurate, only following the hypothetical situation presented

*snark
« Last Edit: October 05, 2018, 12:24:00 PM by Dabnasty »

former player

  • Walrus Stache
  • *******
  • Posts: 8822
  • Location: Avalon
Re: Statistics update
« Reply #3 on: October 05, 2018, 12:19:02 PM »
Your 5% false accusation rate.  That's 5% of the accusations made are false?

So that's 5% of the accused population of 17, not the non-accused population of 950, right?  Because you can't have accusations within a population that you've just described as not accused, right?

sol

  • Walrus Stache
  • *******
  • Posts: 8433
  • Age: 47
  • Location: Pacific Northwest
Re: Statistics update
« Reply #4 on: October 05, 2018, 12:25:58 PM »
As we've previously tried to explain, your math is fine but you're solving the wrong problem.  We're not interested in the overlap of the two probabilities you have posed, we're only interested in which people who have already been accused are guilty, which is transparently given by fp in the problem statement without the necessity of any additional information, parallel probabilities, or other populations.

Chances are good that the people you contacted would tell you this too, if you just explained to them the real problem you are trying to solve about rapists.  They will understand what you do not, that the infection rate problem and the rapist problem are structurally dissimilar.

shenlong55

  • Pencil Stache
  • ****
  • Posts: 528
  • Age: 41
  • Location: Kentucky
Re: Statistics update
« Reply #5 on: October 05, 2018, 12:32:11 PM »
I know it's called a "false allegation rate", which sounds similar to "false positive rate", but you need to look at how the actual numbers you're using (2-10%) were derived, not just what the rate is named.  I'll admit, I haven't looked at the actual paper, but the description on Wikipedia sure makes it seem like it's derived differently than a false positive rate.

anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #6 on: October 05, 2018, 12:42:04 PM »
As we've previously tried to explain, your math is fine but you're solving the wrong problem.  We're not interested in the overlap of the two probabilities you have posed, we're only interested in which people who have already been accused are guilty, which is transparently given by fp in the problem statement without the necessity of any additional information, parallel probabilities, or other populations.

Chances are good that the people you contacted would tell you this too, if you just explained to them the real problem you are trying to solve about rapists.  They will understand what you do not, that the infection rate problem and the rapist problem are structurally dissimilar.

The Professor was explicit, my framing was correct, and the two questions are essentially the same, so no, my conclusion was correct. If you feel this is an error on his part, you can do what I did, reach out to current scholars whose main research areas are in Bayesian Statistics and Computations, see what they have to say. Look, you and I both have advanced degrees involving statistics, and we are clearly butting heads.

We can sit here and argue forever, why not find someone impartial to judge, like I did?

Also, I did not so much identify an "overlap"; rather, I identified what Y is given X. You have to understand, the 26% and 5% describe two different events, like the professor said.

5% describes the false allegation rate; 26% describes the odds of a person being an actual criminal when accused of a crime. They tell the same story.
« Last Edit: October 05, 2018, 12:44:52 PM by anisotropy »

Davnasty

  • Magnum Stache
  • ******
  • Posts: 2793
Re: Statistics update
« Reply #7 on: October 05, 2018, 12:57:00 PM »
If we did in fact have a false accusation rate for the entire population of the US, perhaps we could use it as the input.

In 2012 there were 87,000 reported rapes; 87,000 * .05 = 4350
The average male life expectancy is 75.5; 75.5 * 4350 = 328,425
The US population in 2012 was 312,800,000; 328,425 / 312,800,000 = ~.001

.1% is closer to the actual (proven) false accusation rate even though my method for finding it was pretty sloppy. I realize the first line here suggests something anisotropy is still not in agreement with, but I thought looking at the numbers from another angle might help.
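The three lines of arithmetic above can be replayed with a short Python sketch. It uses only the figures given in the post and, like the post, divides by the total US population; the variable names are illustrative.

```python
reported_rapes_2012 = 87_000
false_allegation_share = 0.05       # share of reports assumed false
male_life_expectancy = 75.5         # years
us_population_2012 = 312_800_000

false_reports_per_year = reported_rapes_2012 * false_allegation_share
false_reports_per_lifetime = false_reports_per_year * male_life_expectancy
lifetime_rate = false_reports_per_lifetime / us_population_2012

print(f"{false_reports_per_year:,.0f}")       # 4,350
print(f"{false_reports_per_lifetime:,.0f}")   # 328,425
print(f"{lifetime_rate:.2%}")                 # 0.10%
```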
« Last Edit: October 05, 2018, 01:08:00 PM by Dabnasty »

JLee

  • Walrus Stache
  • *******
  • Posts: 7512
Re: Statistics update
« Reply #8 on: October 05, 2018, 12:59:21 PM »
What it means: the false allegation rate is 5%; the chance the accused person is actually guilty is 26%. They are internally consistent and tell the SAME story.

You're arguing that the true allegation rate is 95% and an accused person is actually guilty 26% of the time.

19 out of 20 are telling the truth, but 3 out of 4 accused are innocent?  There's no fucking way that math works.   

sol

  • Walrus Stache
  • *******
  • Posts: 8433
  • Age: 47
  • Location: Pacific Northwest
Re: Statistics update
« Reply #9 on: October 05, 2018, 01:14:47 PM »
The Professor was explicit, my framing was correct,

Yes, the answer was clear and your math is fine.

Quote
and the two questions are essentially the same

No, not even close.  Very much not the same, which is the whole problem here. This would be revealed to you if you would just ask the authority.  Why did you choose to ask about the infection rate problem, and not the rape problem?  Go ahead, repose the same question and then tell them that you think 75% of people accused of rape are innocent because everyone gets randomly accused of rape and only a small number of them are rapists, so there must be lots of false allegations out there.  If they're polite, they won't laugh at you.

If you won't believe all of us, maybe you'll believe them.

Quote
If you feel this is an error on his part,

Not on his part, on your part.  You've misapplied the solution to a non-analogous problem.  You keep saying "Brett Kavanaugh is just some random dude to me" but he is not some random dude, he's a specific individual who has been credibly accused of sexual assault.  The math used to determine the infection rate in whole populations is irrelevant, because you don't go around randomly accusing people of sexual assault the same way you randomly go around testing people for infection.

It's the wrong problem.  Your stats prof told you that you have correctly solved one problem, and I agree, but then you have falsely applied that solution to a different and only tangentially related problem.

If you don't believe me, just pose your real question to the stats prof. 

Don't claim a correct answer to a different problem means you have this problem right.

Also, you're pissing off lots of people on the forum with your stubborn adherence to something only a gross dude would try to advocate.  It's okay to say that you love stats, but mistook these two problems as identical when they are not, and aren't actually accusing the vast majority of sexual assault victims of being liars.  Because right now, that is exactly what you are doing.




anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #10 on: October 05, 2018, 01:17:03 PM »
If we did in fact have a false accusation rate for the entire population of the US, perhaps we could use it as the input.

In 2012 there were 87,000 reported rapes; 87,000 * .05 = 4350
The average male life expectancy is 75.5; 75.5 * 4350 = 328,425
The US population in 2012 was 312,800,000; 328,425 / 312,800,000 = ~.001

.1% is closer to the actual (proven) false accusation rate even though my method for finding it was pretty sloppy. I realize the first line here suggests something anisotropy is still not in agreement with, but I thought looking at the numbers from another angle might help.

Hi Dabnasty,

I am not sure how to phrase this, so please bear with me. When we say the false allegation rate is 5%, it means there is a 5% chance the allegation is false when an allegation is made. And that's it. This tells us nothing about the general population.

But together with a fn rate, we can tackle it as an inference problem, as I have done here.

Now there is a good reason to doubt 0.1% being the actual false accusation rate.

In statistics, when we frame this sort of problem, we often have to deal with type I and type II errors, namely false positives and false negatives. They tend to offset each other: the smaller the type I error, the bigger the type II error, and vice versa. When your fp is 0.1%, generally your fn would be extremely high, perhaps as high as 99%. I find this quite unrealistic, don't you?

But who knows, maybe it is 0.1% for the entire population. But you have to remember, this 0.1% tells a different story. It's no longer when an allegation is made, 5% being false. The 0.1% would mean in the population, 0.1% of the time, you would be accused of rape (randomly). You see how the orders change? I hope this helps.
 
JLee,

please read the professor's comment. Most people do find it counterintuitive; if you still can't understand, I can't help you. Maybe read the book he recommended.

JLee

  • Walrus Stache
  • *******
  • Posts: 7512
Re: Statistics update
« Reply #11 on: October 05, 2018, 01:19:35 PM »
JLee,

please read the professor's comment. Most people do find it counterintuitive; if you still can't understand, I can't help you. Maybe read the book he recommended.

If you were right, society would be rife with falsely accused people.  It isn't.

What is it about not understanding..?

sol

  • Walrus Stache
  • *******
  • Posts: 8433
  • Age: 47
  • Location: Pacific Northwest
Re: Statistics update
« Reply #12 on: October 05, 2018, 01:27:41 PM »
When we say the false allegation rate is 5%, it means there is a 5% chance the allegation is false when an allegation is made. And that's it. This tells us nothing about the general population.

Brett Kavanaugh is not part of the general population, he is already accused.  You just agreed that there is a 5% chance that the allegation is false when an allegation is made.  An allegation has been made.  Do you still think there's a 5% chance he's innocent, or do you think it's a 75% chance he's innocent?

Quote
please read the professor's comment.

The prof's comments are totally irrelevant to the rape problem, because you didn't ask about that.  You asked about an infection rate problem which is not the same.

anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #13 on: October 05, 2018, 01:27:56 PM »
The Professor was explicit, my framing was correct,

Yes, the answer was clear and your math is fine.

Quote
and the two questions are essentially the same

No, not even close.  Very much not the same, which is the whole problem here. This would be revealed to you if you would just ask the authority.  Why did you choose to ask about the infection rate problem, and not the rape problem?  Go ahead, repose the same question and then tell them that you think 75% of people accused of rape are innocent because everyone gets randomly accused of rape and only a small number of them are rapists, so there must be lots of false allegations out there.  If they're polite, they won't laugh at you.

If you won't believe all of us, maybe you'll believe them.

Quote
If you feel this is an error on his part,

Not on his part, on your part.  You've misapplied the solution to a non-analogous problem.  You keep saying "Brett Kavanaugh is just some random dude to me" but he is not some random dude, he's a specific individual who has been credibly accused of sexual assault.  The math used to determine the infection rate in whole populations is irrelevant, because you don't go around randomly accusing people of sexual assault the same way you randomly go around testing people for infection.

It's the wrong problem.  Your stats prof told you that you have correctly solved one problem, and I agree, but then you have falsely applied that solution to a different and only tangentially related problem.

If you don't believe me, just pose your real question to the stats prof. 

Don't claim a correct answer to a different problem means you have this problem right.

Also, you're pissing off lots of people on the forum with your stubborn adherence to something only a gross dude would try to advocate.  It's okay to say that you love stats, but mistook these two problems as identical when they are not, and aren't actually accusing the vast majority of sexual assault victims of being liars.  Because right now, that is exactly what you are doing.

Real question as in replace crime with rape? Ya that would go well, hey we've never met but let's talk about rape.

Let me ask you this, what difference does the kind of crime make? Surely by your logic, the false accusation rate would be much higher when it comes to actual criminals and lower regarding the general population anyway no matter the crime??


And stop calling this gross or implying bad intent on my part; this is deeply offensive. I am telling you your view is wrong. If you don't believe me, just reach out to a Bayesian statistics expert/scholar at your local college.

I want to make it VERY CLEAR, as I have repeatedly done. All the 26% says is: "When a random person is accused of rape by a single alleger,there is a 26% chance  is actually innocent." This has nothing to do with BK specifically. Given any person, the statement stands. The statement is simply a logical statement given 5% fp and 66% fn.

Like I said to Debnasty, the 0.1% would tell a different story. It's no longer when an allegation is made, 5% being false. The 0.1% would mean in the population, 0.1% of the time, you would be accused of rape (randomly). You see how the orders change? I hope this helps.

It is clear to me now you do not seek to discover objective reality but rather are more interested in "removing ammunition" from the other side, I find this partisan behavior rubbish to say the least.

Quote
You asked about an infection rate problem which is not the same.

and omg, you really missed the table I sent to him about crime? I am done, seriously.
« Last Edit: October 05, 2018, 01:30:46 PM by anisotropy »

sol

  • Walrus Stache
  • *******
  • Posts: 8433
  • Age: 47
  • Location: Pacific Northwest
Re: Statistics update
« Reply #14 on: October 05, 2018, 01:29:49 PM »
I want to make it VERY CLEAR, as I have repeatedly done. All the 26% says is: "When a random person is accused of rape by a single alleger,there is a 26% chance  is actually innocent."

No one is randomly accused of rape.  People can be randomly tested for infection.  No one is randomly accused of rape.  All math based on the assumption that random people are accused of rape is irrelevant to the situation in which a person has already been accused.



anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #15 on: October 05, 2018, 01:32:30 PM »
I want to make it VERY CLEAR, as I have repeatedly done. All the 26% says is: "When a random person is accused of rape by a single alleger,there is a 26% chance  is actually innocent."

No one is randomly accused of rape.  People can be randomly tested for infection.  No one is randomly accused of rape.  All math based on the assumption that random people are accused of rape is irrelevant to the situation in which a person has already been accused.

Go talk to a Bayesian Statistics expert at your local college as I have repeatedly suggested. I am done with you here.

FrugalToque

  • Global Moderator
  • Pencil Stache
  • *****
  • Posts: 862
  • Location: Canada
Re: Statistics update
« Reply #16 on: October 05, 2018, 01:34:10 PM »

I want to make it VERY CLEAR, as I have repeatedly done. All the 26% says is: "When a random person is accused of rape by a single alleger,there is a 26% chance  is actually innocent." This has nothing to do with BK specifically. Given any person, the statement stands. The statement is simply a logical statement given 5% fp and 66% fn.


Why are you still going on about this?

One of the premises of your question is that the false accusation rate is 5%.  That means that the probability of a person being falsely accused of rape is 5%.

How can you ignore the premise of your question and play statistical games with this really, really basic fact?

Toque.

FrugalToque

  • Global Moderator
  • Pencil Stache
  • *****
  • Posts: 862
  • Location: Canada
Re: Statistics update
« Reply #17 on: October 05, 2018, 01:36:27 PM »

I want to make it VERY CLEAR, as I have repeatedly done. All the 26% says is: "When a random person is accused of rape by a single alleger,there is a 26% chance  is actually innocent." This has nothing to do with BK specifically. Given any person, the statement stands. The statement is simply a logical statement given 5% fp and 66% fn.


Why are you still going on about this?

One of the premises of your question is that the false accusation rate is 5%.  That means that the probability of a person being falsely accused of rape is 5%.

How can you ignore the premise of your question and play statistical games with this really, really basic fact?

Toque.

What are you going to do next?  Re-apply the 26% to your original filter?

Then come out with a 75% false accusation rate?  Do you understand the fundamental, mathematical absurdity of what you're doing?

Toque.

anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #18 on: October 05, 2018, 01:37:07 PM »

I want to make it VERY CLEAR, as I have repeatedly done. All the 26% says is: "When a random person is accused of rape by a single alleger,there is a 26% chance  is actually innocent." This has nothing to do with BK specifically. Given any person, the statement stands. The statement is simply a logical statement given 5% fp and 66% fn.


Why are you still going on about this?

One of the premises of your question is that the false accusation rate is 5%.  That means that the probability of a person being falsely accused of rape is 5%.

How can you ignore the premise of your question and play statistical games with this really, really basic fact?

Toque.

Toque, with respect. Please read the professor's comment: "the two numbers describe two different events."

One is false allegation, the other is guilty, they are NOT the same thing.

I seriously don't understand why it is so difficult for people to wrap their heads around this.

Also absurd? I have to say, you guys are the absurd ones. When one has two allegations against them, it goes up to 70% precisely because you would then replace the 5% prior with the 26%. This is just how the math works.

Which is why the legal system strongly leans on a PATTERN of behavior. Re: multiple allegation => much higher chance of being guilty.
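The 70% figure can be reproduced with a small sketch of a repeated Bayesian update. It assumes the same inputs as the earlier table (5% base rate, 1/3 chance a criminal is accused, 5% chance a non-criminal is accused) and, as an extra assumption not stated in the thread, that the two allegations are independent given guilt or innocence. The function name `update` is illustrative.

```python
def update(prior, p_accused_if_guilty=1 / 3, p_accused_if_innocent=0.05):
    """One Bayesian update on a single allegation (assumed inputs)."""
    numerator = prior * p_accused_if_guilty
    return numerator / (numerator + (1 - prior) * p_accused_if_innocent)

after_one = update(0.05)        # start from the assumed 5% base rate
after_two = update(after_one)   # feed the first posterior back in as the prior

print(f"after one allegation:  {after_one:.0%}")   # 26%
print(f"after two allegations: {after_two:.0%}")   # 70%
```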
« Last Edit: October 05, 2018, 01:39:16 PM by anisotropy »

JLee

  • Walrus Stache
  • *******
  • Posts: 7512
Re: Statistics update
« Reply #19 on: October 05, 2018, 01:38:36 PM »

I want to make it VERY CLEAR, as I have repeatedly done. All the 26% says is: "When a random person is accused of rape by a single alleger,there is a 26% chance  is actually innocent." This has nothing to do with BK specifically. Given any person, the statement stands. The statement is simply a logical statement given 5% fp and 66% fn.


Why are you still going on about this?

One of the premises of your question is that the false accusation rate is 5%.  That means that the probability of a person being falsely accused of rape is 5%.

How can you ignore the premise of your question and play statistical games with this really, really basic fact?

Toque.

Toque, with respect. Please read the professor's comment: "the two numbers describe two different events."

One is false allegation, the other is guilty, they are NOT the same thing.

I seriously don't understand why it is so difficult for people to wrap their heads around this.

My friend, we are all wondering the same thing about you right now.

sol

  • Walrus Stache
  • *******
  • Posts: 8433
  • Age: 47
  • Location: Pacific Northwest
Re: Statistics update
« Reply #20 on: October 05, 2018, 01:42:16 PM »
Go talk to a Bayesian Statistics expert at your local college as I have repeatedly suggested. I am done with you here.

Sounds like you are the one who needs to go ask your question.  I already know the answer.  For some reason, you are refusing to ask them the real question that you have, and are instead asking a different question.

Note that not once in your correspondences with the stats prof did you mention sexual assault or false allegations of sexual assault.  You only asked about infection rates, which is not the same problem.  For reasons I have tried to explain to you over and over again.

If you were to pose the actual problem you are claiming to have solved, they will set you straight.  Go ahead, I'll wait.

former player

  • Walrus Stache
  • *******
  • Posts: 8822
  • Location: Avalon
Re: Statistics update
« Reply #21 on: October 05, 2018, 01:43:20 PM »
This is your table -

The 2x2 table would look like this

                 Total    Accused    Not accused
Criminals          50        17           33
Not criminals     950        48          902
Total            1000        65          935


The problem with this table is that you've laid it out the wrong way around.  You can't start the problem with who is a criminal and who is not, because who is a criminal comes after the accusation.  So you start with who is accused, and then of the accused population you look at who is guilty and who is not.  So, using your figures -

                 Total    Guilty    Innocent
Accused            17       16          1
Not accused       983       33        950
Total            1000       49        951


And really, you say you are an ally.   I don't know what's behind your fanaticism in trying to prove an unrealistic false accusation rate, but whatever it is is blinding you to logic and to what is needed to be an ally of victims of sexual violence.




« Last Edit: October 05, 2018, 01:44:59 PM by former player »

anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #22 on: October 05, 2018, 01:46:38 PM »
I already know the answer. 

lol ok , then what's the harm in asking an expert that is impartial to judge it as I had done?

Reminds me of how BK didn't want to have a full FBI investigation.

By actual problem, you mean swapping the word crime for rape? You didn't answer my question: what difference does the kind of crime make? Surely by your logic, the false accusation rate would be much higher when it comes to actual criminals and lower regarding the general population anyway, no matter the crime??
« Last Edit: October 05, 2018, 01:48:16 PM by anisotropy »

FrugalToque

  • Global Moderator
  • Pencil Stache
  • *****
  • Posts: 862
  • Location: Canada
Re: Statistics update
« Reply #23 on: October 05, 2018, 01:47:59 PM »

I want to make it VERY CLEAR, as I have repeatedly done. All the 26% says is: "When a random person is accused of rape by a single alleger,there is a 26% chance  is actually innocent." This has nothing to do with BK specifically. Given any person, the statement stands. The statement is simply a logical statement given 5% fp and 66% fn.


Why are you still going on about this?

One of the premises of your question is that the false accusation rate is 5%.  That means that the probability of a person being falsely accused of rape is 5%.

How can you ignore the premise of your question and play statistical games with this really, really basic fact?

Toque.

Toque, with respect. Please read the professor's comment: "the two numbers describe two different events."

One is false allegation, the other is guilty, they are NOT the same thing.

I seriously don't understand why it is so difficult for people to wrap their heads around this.

Also absurd? I have to say, you guys are the absurd ones. When one has two allegations against them, it goes up to 70% precisely because you would then replace the 5% prior with the 26%. This is just how the math works.

Which is why the legal system strongly leans on a PATTERN of behavior. Re: multiple allegation => much higher chance of being guilty.

How can we get this clear to you?  You're mashing things together you shouldn't without any clear reason why.

Probability of a randomly chosen man being a rapist:  5%
Average rapes per rapist: 6

So in a population of 1000 men, there are 50 rapists who commit 300 rapes.

Of those rapes, only a fraction are reported.  Statistically, according to certain sources, it's about 30%, or 90 in our case. (This sounds high to me, but it's the one I pulled from Wikipedia).

So the police are going to get about 94 rape reports, 4 of which are false.  (There's your 5% false reporting rate)

Are you with us so far?

So in a population of 1000, 90 accusations will be real and an additional 4 will be made falsely.

Your odds of being falsely accused, as a random man, are about 4 out of 1000.
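The counting argument above can be sketched in a few lines. This is only a restatement of the post's own figures; the variable names are made up for illustration, and the 4 false reports is the post's number, not something derived here.

```python
# Inputs as stated in the post; names are illustrative, not from any source.
population = 1000
rapist_share = 0.05        # 5% of men
rapes_per_rapist = 6
reporting_rate = 0.30      # share of rapes that get reported
false_reports = 4          # the post's figure, roughly 5% of all reports

rapists = round(population * rapist_share)       # 50
rapes = rapists * rapes_per_rapist               # 300
true_reports = round(rapes * reporting_rate)     # 90
total_reports = true_reports + false_reports     # 94

# Share of reports that are false (a statement about reports).
false_share_of_reports = false_reports / total_reports

# Chance that a random man is falsely accused (a statement about men).
false_accusation_odds = false_reports / population

print(f"false share of reports: {false_share_of_reports:.1%}")
print(f"falsely accused per man: {false_accusation_odds:.1%}")
```

The two printed numbers have different denominators (reports versus men), which is the distinction the post is drawing.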

Toque.

Kris

  • Walrus Stache
  • *******
  • Posts: 7335
Re: Statistics update
« Reply #24 on: October 05, 2018, 01:48:46 PM »
I already know the answer. 

lol ok , then what's the harm in asking an expert that is impartial to judge it as I had done?

Reminds me of how BK didn't want to have a full FBI investigation.

By actual problem, do you mean swapping the word crime for rape? You didn't answer my question: what difference does the kind of crime make? Surely by your logic, the false accusation rate would be much higher when it comes to actual criminals and lower for the general population anyway, no matter the crime?

LOL oh, brother.

sol

  • Walrus Stache
  • *******
  • Posts: 8433
  • Age: 47
  • Location: Pacific Northwest
Re: Statistics update
« Reply #25 on: October 05, 2018, 01:53:38 PM »
What are you going to do next?  Re-apply the 26% to your original filter?

Then come out with a 75% false accusation rate?  Do you understand the fundamental, mathematical absurdity of what you're doing?

Toque,

Anisotropy's math is correct for infection rates, which is a different problem unrelated to sexual assault allegations.

The reason a 5% false accusation rate can be twisted into a 75% false accusation rate is that he's artificially specified that only a tiny fraction of the population is guilty, and that everyone is equally accused whether they are guilty or not.  This scenario results in millions of people who are not guilty being falsely accused.  Then he lumps all of those falsely accused people into the same pool as the correctly accused people, and the guilty ones now make up a minority of the total population of accused people. 

It's just math sleight of hand.  By assuming that everyone is accused, and that the real guilty rate is low, he's created an artificially high number of false allegations to dilute the pool and shrink the percentage of guilty among the accused. 

It has no bearing whatsoever on whether or not a person who is accused of sexual assault is guilty, his claims to the contrary, because he's fundamentally misconstructed the problem.  It's a confusing situation though, so I would be forgiving if he weren't then using that sleight of hand to accuse millions of sexual assault survivors of being liars.  That last part kind of pisses me off though.


anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #26 on: October 05, 2018, 01:53:57 PM »
Alright, I can see we are talking past each other completely. I will stop arguing now as it's pretty much pointless.

I am curious, how many people here took 1st year/2nd year stat courses in college?

former player

  • Walrus Stache
  • *******
  • Posts: 8822
  • Location: Avalon
Re: Statistics update
« Reply #27 on: October 05, 2018, 01:55:18 PM »
Alright, I can see we are talking past each other completely. I will stop arguing now as it's pretty much pointless.

I am curious, how many people here took 1st year/2nd year stat courses in college?

I am curious, why are you ignoring my posts on this thread?

sol

  • Walrus Stache
  • *******
  • Posts: 8433
  • Age: 47
  • Location: Pacific Northwest
Re: Statistics update
« Reply #28 on: October 05, 2018, 01:56:56 PM »
Alright, I can see we are talking past each other completely. I will stop arguing now as it's pretty much pointless.

I am curious, how many people here took 1st year/2nd year stat courses in college?

I am curious, why are you ignoring my posts on this thread?

I am curious, why are you refusing to ask the stats prof the real question you have?

These are all rhetorical questions, right?

Kris

  • Walrus Stache
  • *******
  • Posts: 7335
Re: Statistics update
« Reply #29 on: October 05, 2018, 01:57:02 PM »
Alright, I can see we are talking past each other completely. I will stop arguing now as it's pretty much pointless.

I am curious, how many people here took 1st year/2nd year stat courses in college?

You do realize -- because they have already told you -- that the people in this thread who are pointing out your problem have advanced knowledge of statistics, right?


anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #30 on: October 05, 2018, 02:02:42 PM »
Sol,

For the last time, what difference does changing the word rape to crime make? Regardless of the crime, if your logic holds, the guilty would be accused way more than the innocent. So what difference does it make? Even if Davnasty's 0.1% idea turns out to be right, you realize it means: "An innocent person has a 0.1% chance of being accused of rape" vs. "when a person is accused of rape, there is a 26% chance he is guilty."

The problem is we cannot possibly know if the label "innocent" is applicable at the time of accusation.

Formerplayer, sorry I focused on Sol too much. But it's essentially the same story over and over again. The Stats describe different events.

Kris, and where exactly did you see that mentioned other than Sol's brief rundown of his background? Provide a quote please?
« Last Edit: October 05, 2018, 02:05:37 PM by anisotropy »

former player

  • Walrus Stache
  • *******
  • Posts: 8822
  • Location: Avalon
Re: Statistics update
« Reply #31 on: October 05, 2018, 02:05:24 PM »
Sol,

For the last time, what difference does changing the word rape to crime make? Regardless of the crime, if your logic holds, the guilty would be accused way more than the innocent. So what difference does it make? Even if Davnasty's 0.1% idea turns out to be right, you realize it means: "An innocent person has a 0.1% chance of being accused of rape" vs. "when a person is accused of rape, there is a 26% chance he is guilty."

Formerplayer, sorry I focused on Sol too much. But it's essentially the same story over and over again. The Stats describe different events.

No, they describe the same events (same number accused, same number not accused), but mine come to a different result because they correctly apply the 5% to the accused number, not the unaccused number.


Really, this isn't a stats problem, it's a basic logic and comprehension problem.

anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #32 on: October 05, 2018, 02:07:17 PM »
former player,

I am going to quote the professor here
"The 26% and 5% are conditional probabilities for two different events, and either one may (in general) be larger or smaller than the other. "

sol

  • Walrus Stache
  • *******
  • Posts: 8433
  • Age: 47
  • Location: Pacific Northwest
Re: Statistics update
« Reply #33 on: October 05, 2018, 02:13:21 PM »
for the last time, what difference does changing the word rape to crime make?

It matters because you still didn't ask the correct question.  You asked about the chances of a random person "actually being a criminal", akin to a random person actually being infected.  This has nothing to do with whether or not an accused rapist is actually a rapist, because you're not asking about the chance a random person is actually a rapist, you're asking about the chance that an accused rapist is a rapist.

An accused rapist is not a random person.  You've moved it from a precondition to a conditional probability, and then tried to draw conclusions about that one person based on conclusions about everyone in the population.  But as we keep repeating for you, you have incorrectly assumed that everyone in the population is equally randomly accused, when in reality real rapists get accused of rape far more often than random people do. 

In fact, your own stats report that 95% of accused rapists are guilty and 5% are innocent, right?  You refused to answer this question above.  If Brett Kavanaugh is accused of sexual assault, do you think there is a 5% chance he is innocent or a 75% chance he is innocent? 

Your answer basically depends on whether or not you think everyone in the population is randomly accused of sexual assault.  If you think everyone is randomly accused, then the likelihood of an accused person being innocent is high.  If you accept your own preconditional fp that only 5% of allegations are false, then the likelihood of an accused person being innocent is low.  In order to get from one to the other, you have to seriously misunderstand how this problem is set up.

former player

  • Walrus Stache
  • *******
  • Posts: 8822
  • Location: Avalon
Re: Statistics update
« Reply #34 on: October 05, 2018, 02:13:41 PM »
former player,

I am going to quote the professor here
"The 26% and 5% are conditional probabilities for two different events, and either one may (in general) be larger or smaller than the other. "


Only one of those events relates to the real world problem you are proposing as respects accused rapists, though.  And it's not the one which ends up with 48 innocent men out of 1000 accused of rape.  I mean, 48 out of 1000 is significantly higher than rates of cancer diagnosis - I think we all would have noticed that.

PathtoFIRE

  • Pencil Stache
  • ****
  • Posts: 873
  • Age: 44
  • Location: San Diego
Re: Statistics update
« Reply #35 on: October 05, 2018, 02:27:42 PM »
Anisotropy, this seems simple to me that you are wrong. Now I'm not familiar with the actual statistics you are talking about, but sticking with your hypotheticals:

False positive rate (FP) = When testing a given population, what % of normal subjects test positive
Positive predictive value (PPV) = Given a positive result, what % are true positives (True positives / Total positives)

When someone tells me that the false rape allegation rate is somewhere around 5%, which of those two statistical values are they referring to? The false positive rate, or 1 - PPV. Given the way the stat is being stated, that 5% of rape allegations are false, or conversely that 95% of rape allegations are true, that sounds like 1 - PPV to me, not FP outright.

anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #36 on: October 05, 2018, 02:30:26 PM »
It matters because you still didn't ask the correct question.  You asked about the chances of a random person "actually being a criminal", akin to a random person actually being infected.  This has nothing to do with whether or not an accused rapist is actually a rapist, because you're not asking about the chance a random person is actually a rapist, you're asking about the chance that an accused rapist is a rapist.

An accused rapist is not a random person.  You've moved it from a precondition to a conditional probability, and then tried to draw conclusions about that one person based on conclusions about everyone in the population.  But as we keep repeating for you, you have incorrectly assumed that everyone in the population is equally randomly accused, when in reality real rapists get accused of rape far more often than random people do. 

In fact, your own stats report that 95% of accused rapists are guilty and 5% are innocent, right?  You refused to answer this question above.  If Brett Kavanaugh is accused of sexual assault, do you think there is a 5% chance he is innocent or a 75% chance he is innocent? 

Your answer basically depends on whether or not you think everyone in the population is randomly accused of sexual assault.  If you think everyone is randomly accused, then the likelihood of an accused person being innocent is high.  If you accept your own preconditional fp that only 5% of allegations are false, then the likelihood of an accused person being innocent is low.  In order to get from one to the other, you have to seriously misunderstand how this problem is set up.

I have said many times, based on two allegations, BK's likelihood of being guilty is 70%, I don't understand why you keep saying I am refusing to answer this. I have done so plenty of times.

Quote
you're not asking about the chance a random person is actually a rapist, you're asking about the chance that an accused rapist is a rapist.

This is right, this is how inference works, you take what is known: population composition, fp, fn, and work out the likelihood of "Given X what is Y". In my case submitted to the professor: "Odds of an accused criminal being actually a criminal". We are just running in circles at this point.

What I calculated is the odds of an accused person (of any crime, given appropriate fp and fn and comp), being actually guilty of the crime accused of. That is all.

The framing is sound; I committed no logical error. Once again, if you are 100% sure you are correct, what's the harm in seeking an impartial expert to judge?

pathtofire,
The key is to recognize 26% and 5% describe two different events. one is false allegation rate 5% given an allegation; the other is actually being guilty when accused by a single allegation 26%.



To all who disagree with me: since Sol doesn't want to seek an impartial expert to be the judge, how about one of you take it up and talk to an expert in Bayesian statistics and computation at your local college, like I did? This can be settled really easily.
« Last Edit: October 05, 2018, 02:32:12 PM by anisotropy »

Davnasty

  • Magnum Stache
  • ******
  • Posts: 2793
Re: Statistics update
« Reply #37 on: October 05, 2018, 02:30:33 PM »
Sol,

For the last time, what difference does changing the word rape to crime make? Regardless of the crime, if your logic holds, the guilty would be accused way more than the innocent. So what difference does it make? Even if Davnasty's 0.1% idea turns out to be right, you realize it means: "An innocent person has a 0.1% chance of being accused of rape" vs. "when a person is accused of rape, there is a 26% chance he is guilty."

The problem is we cannot possibly know if the label "innocent" is applicable at the time of accusation.

Formerplayer, sorry I focused on Sol too much. But it's essentially the same story over and over again. The Stats describe different events.

Kris, and where exactly did you see that mentioned other than Sol's brief rundown of his background? Provide a quote please?

Yes, but you got 26% using 5% as the input. If you used .1% as the input your result would be 94.4% chance he is guilty.

I should also point out some glaring mistakes in my math like using the whole US population rather than male only (accidentally) and including every year of the average male life when not all ages will be accused at the same rate (intentionally, cause picking an age felt weird)

And I'll also point out that 94.4% is not accurate because there is no guarantee that all accusations not proven false are in fact true. Most accusations are proven neither true nor false, which is something that's been completely glossed over throughout this discussion. (I hope I'm not the one misunderstanding this part)

anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #38 on: October 05, 2018, 02:35:08 PM »
Sol,

For the last time, what difference does changing the word rape to crime make? Regardless of the crime, if your logic holds, the guilty would be accused way more than the innocent. So what difference does it make? Even if Davnasty's 0.1% idea turns out to be right, you realize it means: "An innocent person has a 0.1% chance of being accused of rape" vs. "when a person is accused of rape, there is a 26% chance he is guilty."

The problem is we cannot possibly know if the label "innocent" is applicable at the time of accusation.

Formerplayer, sorry I focused on Sol too much. But it's essentially the same story over and over again. The Stats describe different events.

Kris, and where exactly did you see that mentioned other than Sol's brief rundown of his background? Provide a quote please?

Yes, but you got 26% using 5% as the input. If you used .1% as the input your result would be 94.4% chance he is guilty.

I should also point out some glaring mistakes in my math like using the whole US population rather than male only (accidentally) and including every year of the average male life when not all ages will be accused at the same rate (intentionally, cause picking an age felt weird)

And I'll also point out that 94.4% is not accurate because there is no guarantee that all accusations not proven false are in fact true. Most accusations are proven neither true nor false, which is something that's been completely glossed over throughout this discussion. (I hope I'm not the one misunderstanding this part)

When you use 0.1% as fp, what you are stating is that the false allegation rate is 0.1%. This is no longer saying 0.1% of the population will be randomly accused of rape in their lifetime. This is saying that when someone makes an accusation towards anyone, there is a 0.1% chance the accusation is false.

PathtoFIRE

  • Pencil Stache
  • ****
  • Posts: 873
  • Age: 44
  • Location: San Diego
Re: Statistics update
« Reply #39 on: October 05, 2018, 02:36:14 PM »
pathtofire,
The key is to recognize 26% and 5% describe two different events. one is false allegation rate 5% given an allegation; the other is actually being guilty when accused by a single allegation 26%.

Correct me if I'm wrong, since I was one of those ppl on the Kavanaugh thread trying to ignore the statistics sidebar, but didn't you derive the 26% figure? That was not an empirical finding of any sort reported in the media or literature, correct? That's why I used PPV, that is exactly the statistic we use in medicine when we want to know what percent of positive results are true positives (and subtracting PPV from 1 gives you the percent of false positives, which is what we're interested in for your hypothetical). To my reading, you are literally saying "the PPV is 5%" and then saying "the PPV is 26%". I think you think the 5% figure is a false positive rate, but it's not, it's the 1-PPV regarding all rape allegations in the studies that, again, I have not actually read.
« Last Edit: October 05, 2018, 02:39:41 PM by PathtoFIRE »

shenlong55

  • Pencil Stache
  • ****
  • Posts: 528
  • Age: 41
  • Location: Kentucky
Re: Statistics update
« Reply #40 on: October 05, 2018, 02:41:44 PM »
pathtofire,
The key is to recognize 26% and 5% describe two different events. one is false allegation rate 5% given an allegation; the other is actually being guilty when accused by a single allegation 26%.

Correct me if I'm wrong, since I was one of those ppl on the Kavanaugh thread trying to ignore the statistics sidebar, but didn't you derive the 26% figure? That was not an empirical finding of any sort reported in the media or literature, correct? That's why I used PPV, that is exactly the statistic we use in medicine when we want to know what percent of positive results are true positives (and subtracting PPV from 1 gives you the percent of false positives, which is what we're interested in for your hypothetical). To my reading, you are literally saying "the PPV is 5%" and then saying "the PPV is 26%". I think you think the 5% figure is a false positive rate, but it's not, it's the 1-PPV regarding all rape allegations in the studies that, again, I have not actually read.

+1

anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #41 on: October 05, 2018, 02:54:40 PM »
pathtofire,
The key is to recognize 26% and 5% describe two different events. one is false allegation rate 5% given an allegation; the other is actually being guilty when accused by a single allegation 26%.

Correct me if I'm wrong, since I was one of those ppl on the Kavanaugh thread trying to ignore the statistics sidebar, but didn't you derive the 26% figure? That was not an empirical finding of any sort reported in the media or literature, correct? That's why I used PPV, that is exactly the statistic we use in medicine when we want to know what percent of positive results are true positives (and subtracting PPV from 1 gives you the percent of false positives, which is what we're interested in for your hypothetical). To my reading, you are literally saying "the PPV is 5%" and then saying "the PPV is 26%". I think you think the 5% figure is a false positive rate, but it's not, it's the PPV regarding all rape allegations in the studies that, again, I have not actually read.

The 26% was derived using the 2-10% false allegation rate, which I treated as fp; 2/3 as fn, since an estimated 2/3 of crimes are unreported; and a population composition of 5% rapists. These inputs are empirical numbers based on the literature I cited.

In terms of PPV, PPV = True positive / predicted total positive. So the 26% would be PPV.

The 5% is the false positive rate. FPR = false positive / (false positive + true negatives), or FPR = false positive / (sum of condition negatives). As condition negatives include both false positive and true negatives.

1-PPV is actually the false discovery rate, i.e., FDR = false positives / (false positives + true positives), or FDR = false positives / (sum of predicted positives). This is not the fp.
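For anyone who wants to reproduce the 26% and 70% figures, here is a minimal sketch of the calculation under the framing used in this post. It assumes the 5% is a false positive rate, i.e. P(accused | innocent), which is exactly the assumption others in this thread dispute. The function name `posterior_guilty` is hypothetical.

```python
def posterior_guilty(prior, fp, fn):
    """Bayes update: P(guilty | accused).

    prior: P(guilty) before the allegation
    fp:    P(accused | innocent), the false positive rate (assumed reading)
    fn:    P(not accused | guilty), the false negative rate
    """
    true_pos = prior * (1 - fn)      # guilty and accused
    false_pos = (1 - prior) * fp     # innocent and accused
    return true_pos / (true_pos + false_pos)


# Inputs from the post: 5% rapists, 5% fp, 2/3 fn.
one_allegation = posterior_guilty(prior=0.05, fp=0.05, fn=2 / 3)

# Second allegation: reuse the posterior as the new prior.
# This treats the two allegations as independent, which is an assumption.
two_allegations = posterior_guilty(prior=one_allegation, fp=0.05, fn=2 / 3)

# In this framing PPV is the posterior, and 1 - PPV is the false discovery rate.
false_discovery_rate = 1 - one_allegation

print(f"one allegation:  {one_allegation:.0%}")    # about 26%
print(f"two allegations: {two_allegations:.0%}")   # about 70%
```

If the 5% is read instead as 1-PPV (the share of allegations that are false), none of this update applies and the answer is simply 95% guilty.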

gaja

  • Handlebar Stache
  • *****
  • Posts: 1681
Re: Statistics update
« Reply #42 on: October 05, 2018, 02:56:03 PM »
Alright, I can see we are talking past each other completely. I will stop arguing now as it's pretty much pointless.

I am curious, how many people here took 1st year/2nd year stat courses in college?

I did. And I have also taught math and statistics in high school and junior high. Your error is typical of students who misunderstand the text and therefore get the wrong end result, and then get really frustrated about a low score on the test, since their calculations are correct.

Maybe a simpler language will help?

"1000 men live in Oaktown. 5 % of these are accused of rape in 1983. Of these accused rapists, 5 % are innocent. 65 % of the rapists in Oaktown are never accused.
a) How many men in Oaktown are accused of rape? (answer: 50)
b) How many of the accused men are innocent? (answer: 2-3)
c) What is the probability that a man accused of rape in Oaktown is guilty? (answer: 95 %)
d) What is the probability that a random man in Oaktown is a rapist? (answer: 13.6 %)
e) What is the probability that a random man in Oaktown will be innocently accused of rape? (answer: 0.25 %)"
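The answers to the Oaktown exercise can be checked with a short sketch. It only uses the numbers given in the exercise; the variable names are illustrative.

```python
# Givens from the Oaktown exercise.
men = 1000
accused_share = 0.05               # 5% of men are accused
innocent_share_of_accused = 0.05   # 5% of the accused are innocent
unaccused_share_of_rapists = 0.65  # 65% of rapists are never accused

accused = men * accused_share                            # a) 50
innocent_accused = accused * innocent_share_of_accused   # b) 2.5
guilty_accused = accused - innocent_accused              # 47.5

# c) P(guilty | accused)
p_guilty_given_accused = guilty_accused / accused

# d) The guilty accused are the 35% of rapists who do get accused,
#    so scale up to get all rapists, then divide by the population.
rapists = guilty_accused / (1 - unaccused_share_of_rapists)
p_rapist = rapists / men

# e) P(innocent and accused) for a random man
p_innocent_and_accused = innocent_accused / men

print(f"c) {p_guilty_given_accused:.0%}")
print(f"d) {p_rapist:.1%}")
print(f"e) {p_innocent_and_accused:.2%}")
```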


anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #43 on: October 05, 2018, 03:03:38 PM »
I noticed there's an error in B, instead of b) How many of the accused men are innocent? (answer: 2-3).

It should read: how many of the accused rapist are innocent, because you explicitly stated Of these accused rapists, 5 % are innocent.

But notice, all of your statements are different from:

Given a man accused of rape, what are the odds of him being innocent.

This is different from yours:

What is the probability that a random man in Oaktown will be innocently accused of rape? (answer: 0.25 %). In your case, you are assuming him to be innocent. Rewording: Given a man in Oaktown is innocent, what are the odds of him being accused of rape.

In my statement, it's simply, Given a man (without being known innocent or guilty) accused of rape, what are the odds of him being innocent.

Given a man in oaktown is innocent, what are the odds of him being accused of rape.
vs
Given a man (without being known innocent or guilty) accused of rape, what are the odds of him being innocent.

Very different events.


« Last Edit: October 05, 2018, 03:19:10 PM by anisotropy »

PathtoFIRE

  • Pencil Stache
  • ****
  • Posts: 873
  • Age: 44
  • Location: San Diego
Re: Statistics update
« Reply #44 on: October 05, 2018, 03:11:38 PM »
To continue my thought, a false positive rate is dependent on the characteristics of the test itself and the cutoff values. PPV is influenced by the population.

So you have a screening test, and you set your cutoff values such that you get a false positive rate of 5%.

1000 healthy volunteers --> test applied = 950 negative results (true negatives), 50 positive results (false positives)
False positive rate = # of false positives / (# of false positives + # of true negatives) [restated: FP/(FP+TN)]
= 5%

Then you test 100 patients with the disease and get these results

100 known patients --> test applied = 90 positive results (true positives), 10 negative (false negatives)
False negative rate = false negatives / (false negatives + true positives) [restated: FN/(FN+TP)]
=10%

Disease prevalence is 5%.

Now you test two sets of populations, one a random cohort, and the other a mix of patients suspected to have disease.

Random 1000 people (950 should be healthy, 50 should have disease)
               Total    Positive    Negative
Healthy ppl    950      48 (FP)     902 (TN)
Disease        50       45 (TP)     5 (FN)

PPV = TP/(TP+FP) = 45 / (45+48) = 45 / 93 = 48%     [1 - PPV = 52%]
NPV = TN/(TN+FN) = 902 / (902+5) = 902 / 907 = 99.4%

Now instead test 100 people suspected of having disease based on symptoms, but only half do.
               Total    Positive    Negative
Disease        50       45 (TP)     5 (FN)
No disease     50       3 (FP)      47 (TN)

PPV = 45 / (45+3) = 45/48 = 94% (1 - PPV = 6%)
NPV = 47 / (47+5) = 47/52 = 90%


What everyone else is saying is that we are talking about the second situation. Given any one allegation within the cohort of all allegations made, what is the positive predictive value, and others are saying that the research says 95% (5% being 1-PPV). If you decided to take the methods that those studies used to determine a true allegation from a false one (testimony, police reports, convictions, etc), and then applied those while accusing a large random group of people, then yes, the PPV of those methods would go down, like in my first example, and I think that is what you are actually arguing. But everyone else is saying that these actual studies are not talking about a random 1000 strangers, but instead are looking at the actual cases of allegations, and from there they determined that only 5% could be called false.
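To make the two cohorts easy to compare, here is a rough sketch that recomputes PPV and NPV for each. It assumes the 5% false positive rate and 10% false negative rate from the example; the function name `predictive_values` is made up. It keeps fractional counts instead of rounding to whole people, so the results differ slightly from the rounded tables.

```python
def predictive_values(healthy, diseased, fp_rate=0.05, fn_rate=0.10):
    """Return (PPV, NPV) for a cohort, given fixed test error rates."""
    fp = healthy * fp_rate     # healthy people who test positive
    tn = healthy - fp          # healthy people who test negative
    fn = diseased * fn_rate    # diseased people who test negative
    tp = diseased - fn         # diseased people who test positive
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Random cohort: 1000 people at 5% prevalence.
random_ppv, random_npv = predictive_values(healthy=950, diseased=50)

# Suspected cohort: 100 people, half of whom have the disease.
suspected_ppv, suspected_npv = predictive_values(healthy=50, diseased=50)

print(f"random cohort:    PPV {random_ppv:.1%}, NPV {random_npv:.1%}")
print(f"suspected cohort: PPV {suspected_ppv:.1%}, NPV {suspected_npv:.1%}")
```

Same test, same error rates; only the mix of people being tested changes, and PPV moves from just under half to about 95%.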

anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #45 on: October 05, 2018, 03:22:07 PM »
pathtofire,

Your arguments are similar to gaja's. IF we knew which group we are dealing with given each random person, then yes, you folks would be right.

The problem is, we don't. Does this make sense? I believe this is where most of the confusion is coming from.

When you calculate the odds when accused, you have already assumed the individual is in one of the subsets.

I am saying, without knowing which subset the individual is in, what are the odds of him being guilty when accused.

Think about it, if we knew the person was in the rapist group, then ya, the odds would be super high he's actually guilty. 
If we knew the person was in the innocent group, then ya, the odds would be very low he's actually guilty. 0.01% in dabn's case.

But the problem is we don't know. What I calculated was this: given a person is accused (and we have no idea which group he's in), what are his odds of being guilty.

And EVERYONE's error here is assuming the person is in the rapist group to begin with.
« Last Edit: October 05, 2018, 03:32:11 PM by anisotropy »

Glenstache

  • Magnum Stache
  • ******
  • Posts: 3493
  • Age: 94
  • Location: Upper left corner
  • FI(lean) working on the "RE"
Re: Statistics update
« Reply #46 on: October 05, 2018, 03:27:30 PM »
This is some gold-star arguing about people being wrong on the internet.

This has a lot of great elements:
1. Appeal to authority
2. Arcane tangent
3. Highly emotional content
4. Repetition of arguments
5. Poorly formed analogy

There should be more effort in including past comments in replies to make it harder to read though. There is lots of room for improvement in that category.

And the whole thing boils down to whether or not the same stats apply to a general population and the sub-population of those accused of rape.  In all of this discussion, that is the only assumption that matters. Pretty much everyone else here knows how to multiply probabilities, so it isn't a question of understanding method.

PathtoFIRE

  • Pencil Stache
  • ****
  • Posts: 873
  • Age: 44
  • Location: San Diego
Re: Statistics update
« Reply #47 on: October 05, 2018, 03:31:17 PM »
pathtofire,

Your arguments are similar to gaja's. IF we knew which group we are dealing with given each random person, then yes, you folks would be right.

The problem is, we don't.

Exactly, the point is you only do the deep dive when you have a credible accusation, which is what kinda happened here (I'd argue the dive wasn't so deep). And we don't "test" large random populations for potential sexual assault, we just don't, we wait for some signs and symptoms to develop before instituting those tests, so I'd argue that Kavanaugh's situation is nothing like my first example and the picture you have been trying to paint, but much closer to the second, and therefore I trust the PPV and NPV of an actual investigation (again, not sure we got one here to be honest).

sol

  • Walrus Stache
  • *******
  • Posts: 8433
  • Age: 47
  • Location: Pacific Northwest
Re: Statistics update
« Reply #48 on: October 05, 2018, 03:34:24 PM »
I am saying, without knowing which subset the individual is in, what are the odds of him being guilty when accused.

Except that one of those subsets doesn't exist.

Innocent people are not routinely accused of rape the same way that healthy people are routinely screened for infection, so the math you've presented isn't relevant.  You're solving the wrong problem.

If you remove the nonexistent subset of random people who are accused of rape, your math would make a lot more sense.  Would be easier to follow, too.

For the third (or maybe fourth?) time now, do you believe there is a 5% chance or a 75% chance that a person with a single accusation is innocent?  Because you started out saying 75% innocent, then briefly started saying 5% innocent, and now you appear to be back to 75% innocent.  Keep in mind that if you say 75% innocent, you're also saying 75% of self-identified sexual assault survivors are liars.

anisotropy

  • Pencil Stache
  • ****
  • Posts: 681
Re: Statistics update
« Reply #49 on: October 05, 2018, 03:38:51 PM »
pathtofire,

Your arguments are similar to gaja's. IF we knew which group we are dealing with given each random person, then yes, you folks would be right.

The problem is, we don't.

Exactly, the point is you only do the deep dive when you have a credible accusation, which is what kinda happened here (I'd argue the dive wasn't so deep). And we don't "test" large random populations for potential sexual assault, we just don't, we wait for some signs and symptoms to develop before instituting those tests, so I'd argue that Kavanaugh's situation is nothing like my first example and the picture you have been trying to paint, but much closer to the second, and therefore I trust the PPV and NPV of an actual investigation (again, not sure we got one here to be honest).

Right, I agree with these points. Except my calculation didn't imply an exhaustive test was necessary. What it meant was: without first knowing if a person is guilty or innocent, what are the odds of him being guilty when accused.

You are saying, he wouldn't have been accused if he were innocent, because a guilty person is likely to be accused.

See my comparison again, these statements are consistent, and are both true:

Think about it, if we knew the person was in the rapist group, then ya, the odds would be super high he's actually guilty. 
If we knew the person was in the innocent group, then ya, the odds would be very low he's actually guilty. 0.01% in dabn's case.

But the problem is we don't know. What I calculated was this: given a person is accused (and we have no idea which group he's in), what are his odds of being guilty.