I have an xgboost model on two different servers: a test server and a production server. Each server has exactly the same data and exactly the same code, but when I apply the same model to the same data in each environment I get a slightly different result. We need the results to be identical.
I've found that the sparse matrix object that the following line returns is different on each server:
mm <- sparse.model.matrix(y ~ ., data = df.new)[,-1]
mm on the test server has @x of length 182, whereas the mm on the production server has @x of length 184. Again, I've compared the df.new from both servers and they are identical.
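To narrow down where the two extra stored entries come from, the two matrices could be saved on each server (for example with saveRDS()) and compared in a single session. The following is only a minimal sketch under assumptions: mm_test and mm_prod are toy stand-ins built by hand, not my real data, and the check for explicitly stored zeros is just one hypothesis for why @x could differ in length while the values agree.

```r
library(Matrix)

# Toy stand-ins (hypothetical) for the matrices produced on each server.
# mm_prod stores one extra entry whose value is exactly 0.
mm_test <- new("dgCMatrix",
               i = c(0L, 1L, 2L), p = c(0L, 1L, 2L, 3L),
               x = c(1, 2, 3), Dim = c(3L, 3L),
               Dimnames = list(NULL, c("a", "b", "c")))
mm_prod <- new("dgCMatrix",
               i = c(0L, 1L, 0L, 2L), p = c(0L, 1L, 2L, 4L),
               x = c(1, 2, 0, 3), Dim = c(3L, 3L),
               Dimnames = list(NULL, c("a", "b", "c")))

# Stored entries equal to zero increase length(@x) without changing any value.
n_explicit_zero_test <- sum(mm_test@x == 0)
n_explicit_zero_prod <- sum(mm_prod@x == 0)

# Compare the values after dropping explicitly stored zeros.
# as.matrix() makes a dense copy, so this assumes the matrices fit in memory.
same_after_drop0 <- identical(as.matrix(drop0(mm_test)),
                              as.matrix(drop0(mm_prod)))

# Columns that exist on one server only (e.g. from differing factor levels).
cols_only_test <- setdiff(colnames(mm_test), colnames(mm_prod))
cols_only_prod <- setdiff(colnames(mm_prod), colnames(mm_test))
```

If same_after_drop0 is TRUE on the real matrices, the difference would be in how zeros are stored rather than in the values. If it is FALSE, the column names and the differing cells would be the next thing to inspect.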
I've tried downgrading the Matrix package on the production server so that the versions match, but it's still producing different results. The only idea I have left is to match the versions of every package.
Does anyone have any suggestions for what might be happening? Unfortunately I can't share the data, but if it helps, it has 227 variables of mixed types (775 columns when converted to a sparse model matrix). A lot of the variables are mostly 0.
I don't know if it makes a difference or not, but the test server is Windows and the production server is Linux.
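Since the operating systems differ, it may also be worth recording a few settings that can influence how a model matrix is built, and comparing them between servers. This is a hedged sketch, not a complete list: env_snapshot is a name made up for illustration, and the choice of settings (R version, collation locale, contrasts option, Matrix version) is an assumption about what might matter here.

```r
# Snapshot of settings to run on each server and compare by eye or with identical().
env_snapshot <- list(
  r_version      = R.version.string,              # defaults such as stringsAsFactors changed in R 4.0.0
  collate        = Sys.getlocale("LC_COLLATE"),   # affects the sort order of factor levels
  contrasts      = getOption("contrasts"),        # how factors are expanded into columns
  matrix_version = as.character(utils::packageVersion("Matrix"))
)
str(env_snapshot)
```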