
How do I calculate standard deviation of two arrays in python?

Posted on 2014-10-09 15:15:17

I have two arrays: one with 30 years of observations, and one with 30 years of historical model runs. I want to calculate the standard deviation between observations and model results, to see how much the model deviates from observations. How do I go about doing this?

Edit

Here are the two arrays (each number represents one year, 1971-2000):

obs = [2790.90283203, 2871.02514648, 2641.31738281, 2721.64453125,
       2554.19384766, 2773.7746582, 2500.95825195, 3238.41186523,
       2571.62133789, 2421.93017578, 2615.80395508, 2271.70654297,
       2703.82275391, 3062.25366211, 2656.18359375, 2593.62231445,
       2547.87182617, 2846.01245117, 2530.37573242, 2535.79931641,
       2237.58032227, 2890.19067383, 2406.27587891, 2294.24975586,
       2510.43847656, 2395.32055664, 2378.36157227, 2361.31689453, 2410.75,
       2593.62915039]

model = [2976.01928711, 3353.92114258, 3000.92700195, 3116.5078125, 2935.31787109,
         2799.75805664, 3328.06225586, 3344.66333008, 3318.31689453,
         3348.85302734, 3578.70800781, 2791.78198242, 4187.99902344,
         3610.77124023, 2991.984375, 3112.97412109, 4223.96826172,
         3590.92724609, 3284.6015625, 3846.34936523, 3955.84350586,
         3034.26074219, 3574.46362305, 3674.80175781, 3047.98144531,
         3209.56616211, 2654.86547852, 2780.55053711, 3117.91699219,
         2737.67626953]
Asked by Stratix
Answered by Falko, 2014-10-10 00:02:23

You want to compare two signals, e.g. A and B in the following example:

import numpy as np

# Two random example signals of equal length
A = np.random.rand(5)
B = np.random.rand(5)

print("A:", A)
print("B:", B)

Output:

A: [ 0.66926369  0.63547359  0.5294013   0.65333154  0.63912645]
B: [ 0.17207719  0.26638423  0.55176735  0.05251388  0.90012135]

Analyzing individual signals

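The standard deviation of each signal on its own only describes the spread within that signal, not how far the two signals are from each other, so it is not what you need:

print("standard deviation of A:", np.std(A))
print("standard deviation of B:", np.std(B))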

Output:

standard deviation of A: 0.0494162021651
standard deviation of B: 0.304319034639
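
To illustrate why the per-signal standard deviation says nothing about the mismatch, here is a minimal sketch with made-up values (these numbers are an assumption for illustration, not the data from the question). B is simply A shifted by a constant offset, so both signals have the same standard deviation even though they differ by 10 at every point:

```python
import numpy as np

# Hypothetical illustration values, not taken from the question
A = np.array([1.0, 2.0, 3.0, 4.0])
B = A + 10.0  # same shape as A, shifted by a constant offset

# Both signals have identical spread ...
print("standard deviation of A:", np.std(A))
print("standard deviation of B:", np.std(B))

# ... yet B is 10 units away from A everywhere
print("mean difference:", np.mean(B - A))
```

A measure of model error therefore has to be computed from the difference between the two signals, not from each signal separately.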

Analyzing the difference

Instead, you might compute the element-wise difference and summarize it with a common measure such as the sum of absolute differences (SAD) or the sum of squared differences (SSD), or compute the correlation coefficient between the two signals:

print("difference:", A - B)
print("SAD:", np.sum(np.abs(A - B)))         # sum of absolute differences
print("SSD:", np.sum(np.square(A - B)))      # sum of squared differences
print("correlation:", np.corrcoef(A, B)[0, 1])  # off-diagonal entry of the 2x2 matrix

Output:

difference: [ 0.4971865   0.36908937 -0.02236605  0.60081766 -0.2609949 ]
SAD: 1.75045448355
SSD: 0.813021824351
correlation: -0.38247081
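
Applied to the question's data, the same idea might look like the sketch below. It assumes that the goal is a per-year error summary and uses the bias (mean error), the mean absolute error, and the root-mean-square error, which are normalized variants of SAD and SSD and not measures named in the answer above. Only the first five years of obs and model are used to keep the example short; the same code should work on the full 30-year arrays.

```python
import numpy as np

# First five years of the question's data, shortened for brevity
obs = np.array([2790.90283203, 2871.02514648, 2641.31738281,
                2721.64453125, 2554.19384766])
model = np.array([2976.01928711, 3353.92114258, 3000.92700195,
                  3116.5078125, 2935.31787109])

diff = model - obs  # per-year model error

bias = np.mean(diff)                # systematic over- or underestimate
mae = np.mean(np.abs(diff))         # SAD divided by the number of years
rmse = np.sqrt(np.mean(diff ** 2))  # square root of SSD / number of years
std_diff = np.std(diff)             # spread of the error around the bias

print("bias:", bias)
print("MAE:", mae)
print("RMSE:", rmse)
print("standard deviation of the error:", std_diff)
```

The standard deviation of the error is probably the closest match to a "standard deviation between observations and model results". It ignores the bias, though, whereas the RMSE combines both: with np.std's default (population) definition, RMSE squared equals bias squared plus the error's standard deviation squared.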