Lori does some math.

  • Lori does some math.

    My next math class doesn't start until next week, so I've been keeping myself busy by teaching myself a little statistics and probability. (I'm not sure why, since said math class is linear algebra.) Anywho, I ran into a bit of a pickle and I'm not sure what the problem is. So I figured the math nerds at Poly might be able to help.

    I know that, for example, if a particular area usually gets rain once every ten days, then the odds of it not raining at all for a week are .9^7 = 48%, and thus the odds it rains at least once are 52%. I also know that if the odds aren't constant per day, I can just multiply repeatedly. So if Monday-Friday has a 10% chance of rain, and Saturday-Sunday has a 20% chance, I can do .9^5 * .8^2.
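The arithmetic above can be checked with a minimal Python sketch. It assumes, as the post does, that each day is independent of the others; the function name `prob_no_rain` is made up for illustration.

```python
from math import prod


def prob_no_rain(daily_rain_probs):
    """Chance of a completely dry stretch, assuming the days are independent."""
    return prod(1.0 - p for p in daily_rain_probs)


# Constant 10% chance of rain for seven days: .9^7
constant_week = prob_no_rain([0.1] * 7)

# 10% on five weekdays, 20% on the two weekend days: .9^5 * .8^2
mixed_week = prob_no_rain([0.1] * 5 + [0.2] * 2)

print(f"dry week, constant 10%:  {constant_week:.3f}")
print(f"at least one rainy day:  {1 - constant_week:.3f}")
print(f"dry week, 10%/20% mix:   {mixed_week:.3f}")
```

The product rule is only valid because the days are treated as independent, which is the assumption that matters later in the thread.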

    Then I was thinking, what if the length of time between days of rain follows a normal distribution? So if the mean length is 5 days and the standard deviation is 1 day, then I can plug those numbers into the normal probability density function and come up with the odds that the length of time between rains is x days. The odds it rains on any particular day are small, but I know that you can add probabilities of the normal function. So the odds that it rains after 5 days are going to sum to 50%.

    But if I then take the individual probabilities for a given day (based on the normal function, mean, and standard deviation), subtract them from 1 and multiply them together as in the first section, I get a totally different answer. It comes out to a 43% chance that it doesn't rain after 5 days.
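A rough sketch of the two calculations being compared, using Python's standard `statistics.NormalDist`. The post does not say which days were plugged in, so evaluating the density at whole days 1 through 5 is an assumption here; it does land near the 43% figure quoted.

```python
from math import prod
from statistics import NormalDist

# Gap between rainy days: mean 5 days, standard deviation 1 day (from the post).
gap = NormalDist(mu=5, sigma=1)

# Approach 1: area under the curve up to day 5.
p_rain_by_day_5 = gap.cdf(5)

# Approach 2 (assumed reconstruction): read the density at each whole day
# as if it were an independent daily rain probability, then multiply the
# "no rain" chances together as in the constant-probability case.
densities = [gap.pdf(day) for day in range(1, 6)]
p_dry_product = prod(1 - d for d in densities)

print(f"approach 1, rain by day 5:      {p_rain_by_day_5:.3f}")
print(f"approach 2, no rain by day 5:   {p_dry_product:.3f}")
```

The first number is a cumulative probability; the second multiplies density values that are not probabilities of independent events, so there is no reason for the two to agree.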

    So, obviously, I'm approaching this problem from two different places and I wouldn't expect the math to come out the same, except that I feel like I'm asking the same question with both techniques. So why are the answers different?

    As an aside, this isn't the actual problem I was doing. The problem I was doing is significantly more depressing but also involves a lot more terms. I kind of thought that having a lot of terms would smooth it out because the density function is continuous, but it actually diverges significantly more than my sample problem here does. So, what's the deal?
    Click here if you're having trouble sleeping.
    "We confess our little faults to persuade people that we have no large ones." - François de La Rochefoucauld

  • #2
    If I understand correctly, you've added a variable to the chance of rain, based on a weather cycle model. So if you were to plot the chance of rain over the x and y, you'd have a sort of sinusoidal curve connecting dates (peak rain chances and dry spell troughs) running out to infinity? Sorry that this isn't helpful, I'm a little high and just trying to find my bearings. I suspect that you could have made an off by one error when you switched around, or something else. Without a full set of numbers it's hard for me to see what's happening exactly.
    John Brown did nothing wrong.

    • #3
      I like cats.
      DISCLAIMER: the author of the above written texts does not warrant or assume any legal liability or responsibility for any offence and insult; disrespect, arrogance and related forms of demeaning behaviour; discrimination based on race, gender, age, income class, body mass, living area, political voting-record, football fan-ship and musical preference; insensitivity towards material, emotional or spiritual distress; and attempted emotional or financial black-mailing, skirt-chasing or death-threats perceived by the reader of the said written texts.

      • #4
        Originally posted by Felch
        If I understand correctly, you've added a variable to the chance of rain, based on a weather cycle model. So if you were to plot the chance of rain over the x and y, you'd have a sort of sinusoidal curve connecting dates (peak rain chances and dry spell troughs) running out to infinity? Sorry that this isn't helpful, I'm a little high and just trying to find my bearings. I suspect that you could have made an off by one error when you switched around, or something else. Without a full set of numbers it's hard for me to see what's happening exactly.
        I don't think I'm making a dumb arithmetic error, because I get the same kind of error no matter the details of the problem. I think there's something fundamental that I'm missing about what questions I'm trying to ask/answer, but I don't know what that is.

        Originally posted by Colon™
        I like cats.
        So do I!
        Click here if you're having trouble sleeping.
        "We confess our little faults to persuade people that we have no large ones." - François de La Rochefoucauld

        • #5
          If you're just taking the (day 1) (day 2) (day 3) ... probabilities, you're missing something: the probability that it might have _already_ rained. In other words, your first statement (length between rains) only considers a single instance of rain; but say it's 25% chance to rain 3 days apart (ignoring chance of 1,2,4 for this example). OK, wonderful; but now at day 5, you have multiple probabilities - .75*(chance of rain apart being 5 days) + .25*(chance of rain apart being 2 days).

          Really of course you have a lot of probabilities - day 1 is simple (chance=1), but day 2 is (chance of 1)*(chance of 1) + (1-chance of 1)(chance of 2); day 3 is (chance of 1)*(chance of 2) + (1-chance of 1)*( (chance of 2*chance of 1)*(chance of 1) + (1-chance of 2)*(chance of 3)) ... etc. You can't just ignore that portion of things. I imagine there's an integral of some sort that would help you solve this, but I didn't get far enough in stat to learn that stuff (or at least, remember it).

          Your eventual problem was that you figured a .57 chance of it raining - which is wrong, because there was a .43 chance of it raining exactly once and a .07 chance of it raining twice. (Or more accurately, .46 once, .04 twice, .01 three times, etc.)
          <Reverend> IRC is just multiplayer notepad.
          I like your SNOOPY POSTER! - While you Wait quote.

          • #6
            That sounds ugly. I know that if the probabilities are constant over time, you can do that all with factorials: (number of days)! / ((days it rains)! (days it doesn't rain)!) * (p of raining)^(days it rains) * (p of not raining)^(days it doesn't rain), and then add up your terms for all the days you want. Which is messy enough (because Excel doesn't like doing factorials bigger than 150, and Excel is the only tool I have at work that can do this quickly). Hm.
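The factorial formula above is the binomial probability, and it can be sketched in Python without computing any large factorial directly. `math.comb` returns the exact integer coefficient; the helper names below are illustrative only, and the constant daily probability is the post's own assumption.

```python
from math import comb


def binomial_pmf(n_days, rainy_days, p_rain):
    """Chance of exactly `rainy_days` rainy days out of `n_days` independent days."""
    dry_days = n_days - rainy_days
    return comb(n_days, rainy_days) * p_rain**rainy_days * (1 - p_rain) ** dry_days


def prob_at_most(n_days, max_rainy_days, p_rain):
    """Add up the terms for 0, 1, ..., max_rainy_days rainy days."""
    return sum(binomial_pmf(n_days, k, p_rain) for k in range(max_rainy_days + 1))


# Zero rainy days in a week reduces to the simple .9^7 product.
print(f"{binomial_pmf(7, 0, 0.1):.4f}")

# A 200-day stretch, past the point where the spreadsheet factorials gave trouble.
print(f"{prob_at_most(200, 15, 0.1):.4f}")
```

For much longer stretches the integer coefficient can become too large to multiply by a float, so this sketch is meant for stretches of a few hundred days, not thousands.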
            Click here if you're having trouble sleeping.
            "We confess our little faults to persuade people that we have no large ones." - François de La Rochefoucauld

            • #7
              Why don't you have better tools?

              R is not hard to learn and very useful for this sort of thing.
              <Reverend> IRC is just multiplayer notepad.
              I like your SNOOPY POSTER! - While you Wait quote.

              • #8
                Because my job has nothing to do with rainfall probabilities...
                Click here if you're having trouble sleeping.
                "We confess our little faults to persuade people that we have no large ones." - François de La Rochefoucauld

                • #9
                  R at least is free
                  <Reverend> IRC is just multiplayer notepad.
                  I like your SNOOPY POSTER! - While you Wait quote.

                  • #10
                    Just dump it into excel and do a best fit line. That's 90% of the statistics you'll ever need.
                    “As a lifelong member of the Columbia Business School community, I adhere to the principles of truth, integrity, and respect. I will not lie, cheat, steal, or tolerate those who do.”
                    "Capitalism ho!"

                    • #11
                      Lori, there are a number of errors of understanding in your OP.

                      1) "you can add probabilities of a normal distribution": no idea what this is supposed to mean
                      2) the distribution function you have is nonsensical - there is a nonzero probability that the time between rainy days is negative
                      3) assuming you truncate the distribution at 0, the probability of rain on day x is dependent on the probability of rain on day 1, 2, 3,...,x-1 (specifically, the conditional probability of rain increases as the amount of time since the last rainy day increases). But the formula you applied (product of 1 minus the probabilities for each day) only applies to independent random variables
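Point (3) can be illustrated with a small Python sketch. It assumes the normal curve is truncated at 0 and split into whole days, and it replaces each day's raw probability with the conditional probability of rain given that it has stayed dry so far. The function names are illustrative only.

```python
from math import prod
from statistics import NormalDist

gap = NormalDist(mu=5, sigma=1)


def truncated_cdf(x):
    """P(gap <= x) after discarding the negative tail and renormalising."""
    below_zero = gap.cdf(0)
    return (gap.cdf(x) - below_zero) / (1 - below_zero)


def conditional_rain_prob(day):
    """P(rain arrives during `day`, given it was still dry at the end of day - 1)."""
    still_dry = 1 - truncated_cdf(day - 1)
    return (truncated_cdf(day) - truncated_cdf(day - 1)) / still_dry


hazards = [conditional_rain_prob(day) for day in range(1, 6)]
p_dry_through_day_5 = prod(1 - h for h in hazards)

for day, h in enumerate(hazards, start=1):
    print(f"day {day}: conditional chance of rain {h:.4f}")
print(f"no rain through day 5: {p_dry_through_day_5:.4f}")
```

With conditional probabilities the product telescopes to `1 - truncated_cdf(5)`, so the two approaches agree, and the printed values rise from day to day as described in (3).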
                      12-17-10 Mohamed Bouazizi NEVER FORGET
                      Stadtluft Macht Frei
                      Killing it is the new killing it
                      Ultima Ratio Regum

                      • #12
                        There's also the problem that you're taking a continuous distribution (normal) and applying a discrete formula (product of probabilities) to it. But you could fix that aspect pretty easily (the product formula generalizes to the exponential of an integral, or you can discretize the cumulative distribution function of the normal distribution). But the biggest mistake is (3) in my post above.
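A numerical sketch of the "exponential of an integral" version, in Python. It assumes the quantity being integrated is the conditional rain rate f(t) / (1 - F(t)), starts the integral at 0 to match the truncation, and uses a plain midpoint rule; the function names and step count are arbitrary choices for illustration.

```python
from math import exp
from statistics import NormalDist

gap = NormalDist(mu=5, sigma=1)


def hazard(t):
    """Instantaneous rain rate given no rain so far: f(t) / (1 - F(t))."""
    return gap.pdf(t) / (1 - gap.cdf(t))


def survival_by_integral(t, steps=10_000):
    """exp(-integral of hazard from 0 to t), approximated with the midpoint rule."""
    width = t / steps
    total = sum(hazard((i + 0.5) * width) for i in range(steps)) * width
    return exp(-total)


p_dry_continuous = survival_by_integral(5)
p_dry_from_cdf = (1 - gap.cdf(5)) / (1 - gap.cdf(0))

print(f"exp(-integral of hazard): {p_dry_continuous:.6f}")
print(f"directly from the CDF:    {p_dry_from_cdf:.6f}")
```

Both lines should print roughly 0.5, which is the continuous counterpart of multiplying the daily "no rain" chances.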
                        12-17-10 Mohamed Bouazizi NEVER FORGET
                        Stadtluft Macht Frei
                        Killing it is the new killing it
                        Ultima Ratio Regum

                        • #13
                          Originally posted by KrazyHorse
                          Lori, there are a number of errors of understanding in your OP.

                          1) "you can add probabilities of a normal distribution": no idea what this is supposed to mean
                          I don't have any of the language here because my knowledge of statistics/probability comes from googling things at random and playing boardgames and D&D. What I'm trying to say is that if a single event occurs, and there are a range of possible outcomes, you can add the probabilities of that range together. So, from my experiences, the odds of rolling 4+ on a d6 are .5 because 1/6+1/6+1/6 = .5. My understanding was this applies to the bell curve as well, which is part of where the three sigma rule comes from.
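The idea of adding probabilities over a range of outcomes of a single event can be sketched in Python for both cases mentioned: a d6 roll, and the area under the bell curve within three standard deviations. A standard normal curve is used here as an assumption, since the three sigma rule does not depend on the particular mean or spread.

```python
from fractions import Fraction
from statistics import NormalDist

# One d6 roll: the outcomes 4, 5 and 6 are mutually exclusive, so they add.
p_four_plus = sum(Fraction(1, 6) for face in (4, 5, 6))

# Continuous case: "adding" a range of outcomes means taking the area under
# the density, which is a difference of two CDF values.
curve = NormalDist(mu=0, sigma=1)


def prob_between(low, high):
    """Area under the standard normal curve between low and high."""
    return curve.cdf(high) - curve.cdf(low)


within_three_sigma = prob_between(-3, 3)

print(p_four_plus)
print(f"{within_three_sigma:.4f}")
```

For the continuous curve the height at a single point is a density, not a probability, which is why the sum becomes an area.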

                          2) the distribution function you have is nonsensical - there is a nonzero probability that the time between rainy days is negative
                          I'm not sure what this means. Are you pointing out that it's equally likely for the "next" day of rain to be in the past or the future? Hm.

                          3) assuming you truncate the distribution at 0, the probability of rain on day x is dependent on the probability of rain on day 1, 2, 3,...,x-1 (specifically, the conditional probability of rain increases as the amount of time since the last rainy day increases). But the formula you applied (product of 1 minus the probabilities for each day) only applies to independent random variables
                          Okay. So this is the fundamental problem. I'm trying to mix things that can't be mixed.

                          Originally posted by KrazyHorse
                          There's also the problem that you're taking a continuous distribution (normal) and applying a discrete formula (product of probabilities) to it. But you could fix that aspect pretty easily (the product formula generalizes to the exponential of an integral, or you can discretize the cumulative distribution function of the normal distribution). But the biggest mistake is (3) in my post above.
                          Thanks, KH. So I suppose my question is... how do you go about predicting the next time it's going to rain if you have some data about how often it rains?
                          Click here if you're having trouble sleeping.
                          "We confess our little faults to persuade people that we have no large ones." - François de La Rochefoucauld

                          • #14
                            Originally posted by Lorizael
                            I don't have any of the language here because my knowledge of statistics/probability comes from googling things at random and playing boardgames and D&D. What I'm trying to say is that if a single event occurs, and there are a range of possible outcomes, you can add the probabilities of that range together. So, from my experiences, the odds of rolling 4+ on a d6 are .5 because 1/6+1/6+1/6 = .5. My understanding was this applies to the bell curve as well, which is part of where the three sigma rule comes from.
                            Technically it's because you are rolling a single die and you are looking for 3/6 of the possible outcomes.

                            The chance of rolling a 6 if you roll 3 dice is 1/6 each time, but the probability that at least one of 3 dice will roll a 6 isn't 1/2; it's 1 - (5/6)^3 = 91/216, or about 42%.

                            Similarly, the chance of getting 4+ on both dice if you roll 2d6 is 9/36 (1/2 x 1/2 = 1/4), not 1/2 + 1/2.
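The multi-dice cases can be checked by brute force. This short Python sketch lists every equally likely roll and counts the ones that qualify, with no formula assumed in advance; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Three dice: count the rolls that contain at least one 6.
rolls_3d6 = list(product(faces, repeat=3))
p_at_least_one_six = Fraction(sum(1 for roll in rolls_3d6 if 6 in roll), len(rolls_3d6))

# Two dice: 4+ on both dice, and 4+ on at least one die.
rolls_2d6 = list(product(faces, repeat=2))
p_both_four_plus = Fraction(
    sum(1 for a, b in rolls_2d6 if a >= 4 and b >= 4), len(rolls_2d6)
)
p_either_four_plus = Fraction(
    sum(1 for a, b in rolls_2d6 if a >= 4 or b >= 4), len(rolls_2d6)
)

print(p_at_least_one_six, float(p_at_least_one_six))
print(p_both_four_plus)
print(p_either_four_plus)
```

Adding works for mutually exclusive outcomes of one roll; across several dice the counts come from multiplying, which is why none of these results equals the simple sum of the single-die chances.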
                            Jon Miller: MikeH speaks the truth
                            Jon Miller: MikeH is a shockingly revolting dolt and a masturbatory urine-reeking sideshow freak whose word is as valuable as an aging cow paddy.
                            We've got both kinds

                            • #15
                              Originally posted by Lorizael
                              Thanks, KH. So I suppose my question is... how do you go about predicting the next time it's going to rain if you have some data about how often it rains?
                              Well, given my experience with weather forecasters, you don't
                              Indifference is Bliss
