Note: this article is reproduced from stackoverflow.com.

Published on 2020-04-10 16:06:39

I have an `xgboost` model on two different servers: a test server and a production server. Each server has exactly the same data and exactly the same code, but when I apply the same model to the same data in each environment I get a slightly different result. We need the results to be identical.

I've found that the sparse matrix object that the following line returns is different on each server:

```
mm <- sparse.model.matrix(y ~ ., data = df.new)[,-1]
```

The `mm` on the test server has `@i` and `@x` of length 182, whereas the `mm` on the production server has `@i` and `@x` of length 184. Again, I've compared the `df.new` from both servers and they are identical.

I've tried downgrading the `Matrix` package on the production server so that the versions match, but it's still producing different results. The only idea I have left is to match the versions of every package.

Does anyone have any suggestions for what might be happening? Unfortunately I can't share the data, but if it helps, it's 227 variables of mixed types (775 when converted to sparse model matrix). A lot of the variables are mostly 0.

I don't know if it makes a difference or not, but the test server is Windows and the production server is Linux.

Questioner: user123965


You're getting bitten by the conjunction of two problems:

(1) floating-point computations are inherently sensitive to small differences (platform, compiler, compiler settings ...)
(2) ordered factors in R use *orthogonal polynomial* contrasts (see `?contr.poly`, or Venables and Ripley, *Modern Applied Statistics with S*), which involve floating-point computation.

```
> dd <- data.frame(x=ordered(0:2))
> Matrix::sparse.model.matrix(~x,dd)
3 x 3 sparse Matrix of class "dgCMatrix"
  (Intercept)           x.L        x.Q
1           1 -7.071068e-01  0.4082483
2           1 -7.850462e-17 -0.8164966
3           1  7.071068e-01  0.4082483
```

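You can see that one of the entries here is close to but not exactly equal to zero. So far I haven't actually been able to come up with an example that displays a difference between the two platforms I have handy (Ubuntu Linux and macOS), but this is almost surely the source of your problem: the nearly-zero entry is computed as *exactly* zero on one platform but not the other. A sparse matrix stores only its nonzero entries, so a value like -7.85e-17 occupies a slot in `@i` and `@x` while an exact zero does not, which would account for the lengths of 182 versus 184.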
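To check whether this is what is happening in your own matrix, you could list the stored entries that are suspiciously close to zero and see which columns they belong to. The following is only a sketch on the toy data above: the `1e-10` cutoff and the names `trip` and `tiny` are my own assumptions, not anything from your code.

```r
library(Matrix)

# Toy data from the example above; substitute your own `mm`
dd <- data.frame(x = ordered(0:2))
mm <- sparse.model.matrix(~ x, dd)

# summary() on a sparse matrix gives one row per stored entry (i, j, x)
trip <- summary(mm)

# Stored entries whose magnitude is below an arbitrary, assumed cutoff
tiny <- trip[abs(trip$x) < 1e-10, ]

# Columns that contain such near-zero stored entries
unique(colnames(mm)[tiny$j])
```

If the columns reported on the server with the longer `@x` all come from ordered factors, that would support the explanation above.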

There is probably no perfect solution to this problem, but `zapsmall()` would convert small entries to zero, and `drop0()` would convert them from explicit to implicit (structural) zero entries, so `drop0(zapsmall(mm))` might work ...
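As a rough illustration on the toy example (not your real data), the cleanup could look like the sketch below. The second variant uses the `tol` argument of `drop0()` with an absolute cutoff of `1e-10`, which is an assumed value you would need to choose for your own data. Keep in mind that `zapsmall()` rounds relative to the largest absolute value in the matrix, so if some columns hold very large numbers it may also zero out small values you wanted to keep.

```r
library(Matrix)

dd <- data.frame(x = ordered(0:2))
mm <- sparse.model.matrix(~ x, dd)

# Variant 1: round near-zero values to exact zero, then drop them from storage
mm_clean <- drop0(zapsmall(mm))

# Variant 2: drop stored entries whose magnitude is at most an absolute tolerance
# (1e-10 is an assumed cutoff, not a recommended value)
mm_clean2 <- drop0(mm, tol = 1e-10)

# Number of explicitly stored entries after each cleanup
length(mm_clean@x)
length(mm_clean2@x)
```

Applying the same cleanup on both servers before passing the matrix to `xgboost` should make the stored entries agree, provided the remaining values only differ by amounts well below the chosen tolerance.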