Impact of transforming (scaling and shifting) random variables

Linear transformations (addition and multiplication of a constant) and their impacts on center (mean) and spread (standard deviation) of a distribution.

Want to join the conversation?

  • boggle blue style avatar for user Bryan
    I get why adding k to all data points would shift the prob density curve, but can someone explain why multiplying the data by a constant would stretch and squash the graph?
    (4 votes)
    • blobby green style avatar for user John Smith
      Scaling a density function doesn't change the total probability (it is still 1), so the area under the function has to stay the same.

      If you multiply the random variable by 2, the distance between min(x) and max(x) will be multiplied by 2. Hence you have to scale the y-axis by 1/2.

      For instance, if you've got a rectangle with x = 6 and y = 4, the area will be x*y = 6*4 = 24. If you multiply your x by 2 and want to keep your area constant, then x*y = 12*y = 24 => y = 24/12 = 2. Scaling x by 2 means scaling y by 1/2.

      If you didn't scale down the y-axis, the total probability would be greater than 1, which is not possible.
      (6 votes)
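The area argument in this reply can be written as a change of variables. The following is a sketch, assuming X has a density f_X and Z = kX for a nonzero constant k (the symbols f_X, f_Z, and Z are notation introduced here, not taken from the comment):

```latex
f_Z(z) = \frac{1}{|k|}\, f_X\!\left(\frac{z}{k}\right),
\qquad
\int_{-\infty}^{\infty} f_Z(z)\,dz
  = \int_{-\infty}^{\infty} \frac{1}{|k|}\, f_X\!\left(\frac{z}{k}\right) dz
  = \int_{-\infty}^{\infty} f_X(x)\,dx
  = 1 .
```

With k = 2, the curve is twice as wide and half as tall, which matches the rectangle example.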
  • male robot hal style avatar for user JohN98ZaKaRiA
    Why does k shift the function to the right and not upwards?
    (2 votes)
    • leaf green style avatar for user 𝜏 Is Better Than 𝝅
      Because an upward shift would imply that the probability density has increased at every possible value of the random variable. That would increase the area under the probability density function, which violates the rule that the area under any probability density function must equal 1. The shift is instead rightward (or leftward if k is negative) because the new random variable simply has all of its possible values incremented by the constant k: 0 goes to 0+k, 1 goes to 1+k, 2 goes to 2+k, and so on. The associated probability density just slides over to a new position without changing in value.
      (5 votes)
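The same point can be stated as a formula. This is a sketch, assuming X has a density f_X and Y = X + k (the symbols f_X and f_Y are notation introduced here):

```latex
f_Y(y) = f_X(y - k),
\qquad
\int_{-\infty}^{\infty} f_Y(y)\,dy
  = \int_{-\infty}^{\infty} f_X(y - k)\,dy
  = \int_{-\infty}^{\infty} f_X(x)\,dx
  = 1 .
```

The height of the curve at each shifted point is unchanged, so the area stays at 1 and only the horizontal position moves.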
  • aqualine seed style avatar for user Bryandon
    In real-life situations, when do people add a constant to a random variable?
    (3 votes)
    • blobby green style avatar for user jessica b
      If you are a teacher grading a test, you will have a distribution of scores. You are considering giving some extra credit because there was an internet outage during the exam and no one could use a calculator for ten minutes. You want to see how much the scores will be affected by different amounts of extra credit, and want to determine what would benefit all the students the most. Extra credit is represented by a constant, k, and grades are represented by a random variable X.

      Now we have a new random variable, Y = X + k, with its own distribution.
      (2 votes)
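A minimal Python sketch of the extra-credit scenario follows. The score values and the helper name `add_extra_credit` are hypothetical and chosen only for illustration. It prints the mean and standard deviation for a few values of k.

```python
from statistics import mean, pstdev

# Hypothetical test scores (not from the original post).
scores = [62, 70, 75, 81, 88, 94]


def add_extra_credit(values, k):
    """Return a new list with the constant k added to every value (Y = X + k)."""
    return [x + k for x in values]


# Compare the center and spread for a few candidate amounts of extra credit.
for k in (0, 3, 5):
    curved = add_extra_credit(scores, k)
    print(f"k={k}: mean={mean(curved):.2f}, sd={pstdev(curved):.2f}")
```

The mean should rise by exactly k each time, while the standard deviation stays the same.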
  • aqualine ultimate style avatar for user rdeyke
    What if you scale a random variable by a negative value? If we multiply a standard deviation by a negative number, we would get a negative standard deviation, which makes no sense. I think you should multiply the standard deviation by the absolute value of the scaling factor instead.
    (3 votes)
    • blobby green style avatar for user r c
      @rdeyke Let's consider a random variable X with mean 2 and variance 1 (so the standard deviation is also 1).

      The mean gets multiplied by the constant k; let's say k is -2. Since the original mean was 2, the new mean would be -2*2 = -4.

      Next comes the variance. Variance is scaled by k squared, so it would be multiplied by (-2)^2, which is 4. The standard deviation is always the positive root of the variance, so the SD in this case would come out to be 2.
      (2 votes)
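The numbers in this reply can be checked with a small Python sketch. The two-point data set below is hypothetical, chosen only because its mean is 2 and its population variance is 1, matching the example.

```python
from statistics import mean, pvariance, pstdev

# Hypothetical data chosen so that mean = 2 and population variance = 1.
x = [1, 3]
k = -2

# Z = kX: multiply every value by the constant k.
z = [k * value for value in x]

mean_x, var_x, sd_x = mean(x), pvariance(x), pstdev(x)
mean_z, var_z, sd_z = mean(z), pvariance(z), pstdev(z)

print("X:", mean_x, var_x, sd_x)
print("Z:", mean_z, var_z, sd_z)
```

The mean picks up the sign of k, the variance is multiplied by k squared, and the standard deviation is multiplied by the absolute value of k, so it stays non-negative.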
  • ohnoes default style avatar for user Artur
    In the video, the graph of the variable Z is flattened because it was scaled up and must keep the same area. So how is it possible for Z to have a bigger mean than X?
    Sal says mean(Z) = k times mean(X).
    (0 votes)
  • duskpin tree style avatar for user xinyuan lin
    What do the horizontal and vertical axes in the graphs respectively represent?
    (2 votes)
  • leaf green style avatar for user Renan Ferreira
    Do the mean and standard deviation formulas for transformation apply to any probability density function or just the normal one?
    (2 votes)
    • blobby green style avatar for user daniella
      The formulas for the transformation of means and standard deviations apply generally, not just to the normal distribution. For any probability distribution with a finite mean and standard deviation, adding a constant to a random variable shifts its mean without affecting its standard deviation, while multiplying a random variable by a constant k scales its mean by k and its standard deviation by the absolute value of k.
      (1 vote)
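As a rough check that the rules do not depend on normality, here is a Python sketch using a skewed discrete distribution. The values, probabilities, constant k, and the helper `mean_and_sd` are all hypothetical and introduced only for this illustration.

```python
from math import sqrt


def mean_and_sd(values, probs):
    """Mean and standard deviation of a discrete random variable."""
    mu = sum(v * p for v, p in zip(values, probs))
    var = sum(p * (v - mu) ** 2 for v, p in zip(values, probs))
    return mu, sqrt(var)


# Hypothetical skewed (clearly non-normal) distribution.
values = [0, 1, 10]
probs = [0.6, 0.3, 0.1]
k = -3

mu_x, sd_x = mean_and_sd(values, probs)
# Shifting: add k to every possible value, probabilities unchanged.
mu_shift, sd_shift = mean_and_sd([v + k for v in values], probs)
# Scaling: multiply every possible value by k, probabilities unchanged.
mu_scale, sd_scale = mean_and_sd([k * v for v in values], probs)

print("X:     ", mu_x, sd_x)
print("X + k: ", mu_shift, sd_shift)
print("k * X: ", mu_scale, sd_scale)
```

Up to floating-point rounding, the shifted mean is the original mean plus k with the same standard deviation, and the scaled mean is k times the original with a standard deviation |k| times as large.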
  • orange juice squid orange style avatar for user Vachagan G
    What does it mean to add k to the random variable X? Does it mean that we add k to all values of X (i.e. the possible outcomes of X)?
    (0 votes)
    • female robot grace style avatar for user Hanaa Barakat
      I think that is a good question. Since Y = X + k and Sal says that Y is also a random variable, I think the equation is meant to show that Y is an incremented version of X. For example, if X represents the amount of rain (in mm) falling in April, Y would be the amount of rain falling in April plus some constant k, let's say 2.5. Thus the normal distribution for X is shifted 2.5 mm to the right. Hope this helps.
      (4 votes)
  • blobby green style avatar for user electrorazor911
    I thought we scale the variance by k, thus scaling the standard deviation by sqrt(k)
    (1 vote)
    • blobby green style avatar for user daniella
      This is a common confusion. When you scale a random variable X by a constant k, the variance is scaled by k^2 (since variance is measured in squared units), which means the standard deviation (the square root of the variance) is scaled by |k|, not sqrt(k).
      (1 vote)
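The k^2 factor follows directly from the definition of variance. A short derivation, writing \mu for the mean of X and \sigma_X for its standard deviation (notation introduced here):

```latex
\operatorname{Var}(kX)
  = E\!\left[(kX - k\mu)^2\right]
  = k^2\, E\!\left[(X - \mu)^2\right]
  = k^2 \operatorname{Var}(X),
\qquad
\sigma_{kX} = \sqrt{k^2\, \sigma_X^2} = |k|\, \sigma_X .
```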
  • leaf green style avatar for user makvik
    In the second half, when we are scaling the random variable, what happens to the Y value when you scale the variable by multiplying it by k?
    (1 vote)

Video transcript

- [Instructor] Let's say that we have a random variable x. Maybe it represents the height of a randomly selected person walking out of the mall or something like that. Right over here, we have its probability distribution, and I've drawn it as a bell curve, as a normal distribution. It could have many other distributions, but for visualization's sake, it's a normal one in this example. I've also drawn the mean of this distribution right over here, and I've also drawn one standard deviation above the mean and one standard deviation below the mean.

What we're going to do in this video is think about how this distribution, and in particular the mean and the standard deviation, get affected if we were to add to this random variable or if we were to scale this random variable.

So let's first think about what would happen if we have another random variable, let's call it y, which is equal to whatever the random variable x is, and we're going to add a constant. I'll use a lowercase k. This is not a random variable. This is a constant. It could be the number 10. So if these are random heights of people walking out of the mall, well, you're just gonna add 10 inches to their height for some reason. Maybe you wanna figure out the distribution of people's heights with helmets on or plumed hats or whatever it might be. How would the mean of y and the standard deviation of y relate to those of x?

So we could visualize what the distribution of y would look like. Instead of the center of the distribution, the mean, being right at this point, it's going to be shifted up by k. In fact, the entire distribution would be shifted to the right by k in this example. Maybe k is quite large. Maybe it looks something like that. This is my distribution for my random variable y here, and you can see that the distribution has just shifted to the right by k. We would have moved to the left if k was negative or if we were subtracting k, and so this clearly changes the mean. The mean is now going to be k larger. So we can write that down. We can say that the mean of our random variable y is equal to the mean of our random variable x plus k. You see that right over here.

But has the standard deviation changed? Well, remember, standard deviation is a way of measuring typical spread from the mean, and that won't change. So for our random variable x, this length right over here is one standard deviation. Well, that's going to be the same as one standard deviation here, for our random variable y. And so we can say the standard deviation of our random variable y is equal to the standard deviation of our random variable x. So if you just add to a random variable, it would change the mean but not the standard deviation. You see it visually here.

Now, what if you were to scale a random variable? So what if I have another random variable, I don't know, let's call it z, and let's say z is equal to some constant times x. And remember, the k is not a random variable. It's just gonna be a number. It could be, say, the number two. Well, let's think about what would happen. So let me redraw the distribution for our random variable x. If k were two, what would happen is that this distribution would be scaled out. It would be stretched out by a factor of two, and since the area always has to be one, it would actually be flattened down by a factor of two as well, so it still has the same area.

So I can do that with my little drawing tool here. First, I'll make it shorter by a factor of two, but more importantly, it is going to be stretched out by a factor of two. So let me align the axes here so that we can appreciate this. It's going to look something like this when you scale the random variable. This is what the distribution of our random variable z is going to look like. I'll do it in z's color so that it's clear.

And so you can see two things. One, the mean for sure shifted. The mean here got pushed out. It definitely got scaled up. But also, we see that the standard deviation got scaled. It actually turns out that the standard deviation of z has been scaled by a factor of k. So this is going to be equal to k times the standard deviation of our random variable x. And it turns out that the mean of our random variable z, so let me write that too, is also going to be scaled up. It's gonna be k times the mean of our random variable x.

So the big takeaways here: if you have one random variable that's constructed by adding a constant to another random variable, it's going to shift the mean by that constant, but it's not going to affect the standard deviation. If you multiply one random variable by some constant to get another one, then that's going to affect both the standard deviation, it's gonna scale that, and the mean.
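The two takeaways can be summarized in a short Python sketch. The function names `shift` and `scale` and the height figures (mean 68 inches, standard deviation 3 inches) are hypothetical, introduced only for illustration. The sketch uses the absolute value of k for the standard deviation so that a negative constant still gives a non-negative spread, which agrees with the video's rule whenever k is positive.

```python
def shift(mean, sd, k):
    """Mean and standard deviation of Y = X + k."""
    return mean + k, sd


def scale(mean, sd, k):
    """Mean and standard deviation of Z = k * X.

    The standard deviation uses |k| so it stays non-negative.
    """
    return k * mean, abs(k) * sd


# Hypothetical values: heights in inches with mean 68 and standard deviation 3.
mean_x, sd_x = 68.0, 3.0

print("Y = X + 10:", shift(mean_x, sd_x, 10))
print("Z = 2 * X: ", scale(mean_x, sd_x, 2))
```

Adding 10 moves only the mean, while multiplying by 2 doubles both the mean and the standard deviation.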