I have a file that can contain from 3 to 5 columns of numerical values separated by commas. Empty fields are kept as empty positions between the commas, except when they are at the end of the row, where they are omitted entirely:
1,2,3,4,5
1,2,3,,5
1,2,3
The following table was created in MySQL:
+-------+--------+------+-----+---------+-------+
| Field | Type   | Null | Key | Default | Extra |
+-------+--------+------+-----+---------+-------+
| one   | int(1) | YES  |     | NULL    |       |
| two   | int(1) | YES  |     | NULL    |       |
| three | int(1) | YES  |     | NULL    |       |
| four  | int(1) | YES  |     | NULL    |       |
| five  | int(1) | YES  |     | NULL    |       |
+-------+--------+------+-----+---------+-------+
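For anyone who wants to reproduce this, a table with that structure could be created roughly as follows. This is only a sketch reconstructed from the description above: the table name moo is the one used in the LOAD statement, and the storage engine and all other options are assumed to be the server defaults.

```sql
-- Five nullable integer columns, each defaulting to NULL,
-- matching the Field/Type/Null/Default values shown above.
CREATE TABLE moo (
  one   INT(1) NULL DEFAULT NULL,
  two   INT(1) NULL DEFAULT NULL,
  three INT(1) NULL DEFAULT NULL,
  four  INT(1) NULL DEFAULT NULL,
  five  INT(1) NULL DEFAULT NULL
);
```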
I am trying to load the data using MySQL's LOAD DATA command:
LOAD DATA INFILE '/tmp/testdata.txt' INTO TABLE moo FIELDS
TERMINATED BY "," LINES TERMINATED BY "\n";
The resulting table:
+------+------+-------+------+------+
| one  | two  | three | four | five |
+------+------+-------+------+------+
|    1 |    2 |     3 |    4 |    5 |
|    1 |    2 |     3 |    0 |    5 |
|    1 |    2 |     3 | NULL | NULL |
+------+------+-------+------+------+
The problem is that when a field is present in the raw data but empty, MySQL for some reason does not use the column's default value (which is NULL) and stores zero instead. NULL is used correctly when the field is missing altogether.
Unfortunately, I have to be able to distinguish between NULL and 0 at this stage, so any help would be appreciated.
Thanks S.
edit
The output of SHOW WARNINGS:
+---------+------+--------------------------------------------------------+
| Level   | Code | Message                                                |
+---------+------+--------------------------------------------------------+
| Warning | 1366 | Incorrect integer value: '' for column 'four' at row 2 |
| Warning | 1261 | Row 3 doesn't contain data for all columns             |
| Warning | 1261 | Row 3 doesn't contain data for all columns             |
+---------+------+--------------------------------------------------------+
This will do what you want. It reads the fourth field into a user variable, and then sets the actual column value to NULL if that variable ends up containing an empty string:
LOAD DATA INFILE '/tmp/testdata.txt'
INTO TABLE moo
FIELDS TERMINATED BY ","
LINES TERMINATED BY "\n"
(one, two, three, @vfour, five)
SET four = NULLIF(@vfour,'')
;
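To confirm that the empty field now arrives as NULL rather than 0, a check along these lines could be run after the load. This is a minimal sketch that assumes the table moo and the sample data from the question.

```sql
-- Rows whose fourth column was stored as NULL (not as 0).
SELECT one, two, three, four, five
FROM moo
WHERE four IS NULL;
```

Running SHOW WARNINGS immediately after the LOAD DATA statement is another way to see whether the "Incorrect integer value" warning has gone away.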
If they're all possibly empty, then you'd read them all into variables and list one assignment per column in the SET clause, like this:
LOAD DATA INFILE '/tmp/testdata.txt'
INTO TABLE moo
FIELDS TERMINATED BY ","
LINES TERMINATED BY "\n"
(@vone, @vtwo, @vthree, @vfour, @vfive)
SET
one = NULLIF(@vone,''),
two = NULLIF(@vtwo,''),
three = NULLIF(@vthree,''),
four = NULLIF(@vfour,''),
five = NULLIF(@vfive,'')
;
Theoretically, I suppose, but it's all in memory and only holds tiny amounts of data per row, so I would imagine it would be infinitesimal. You should test it if you think it might be a problem.
I really like this answer. Users can see empty strings '' when they download a CSV for Excel (using IFNULL(Col,'') in a SELECT INTO OUTFILE query), but uploads then accept them as NULL, instead of having to deal with \N in the CSV. Thanks!
For dates I used NULLIF(STR_TO_DATE(@date1, "%d/%m/%Y"), "0000-00-00").
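A possible variant of the date approach, as a sketch only: convert the empty string to NULL before parsing, so that STR_TO_DATE receives NULL and passes it through. The table events, its columns id and date1, and the file path are hypothetical names used purely for illustration.

```sql
LOAD DATA INFILE '/tmp/dates.txt'   -- hypothetical file
INTO TABLE events                   -- hypothetical table
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(id, @date1)
-- '' becomes NULL first; non-empty values are parsed as day/month/year.
SET date1 = STR_TO_DATE(NULLIF(@date1, ''), '%d/%m/%Y')
;
```

Whether this or the zero-date comparison is preferable may depend on the server's SQL mode, so it is worth testing against your own data.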
I have a CSV file that contains zeros (0) that should be converted to NULL (because a zero value is not possible for the data in question), and also empty strings. How do I make sure that both zeros and empty strings are converted to NULL?
If the zero values and empty strings are in separate columns, then just do the above for the empty strings, and something like this for the zeros: NULLIF(@vone, 0).
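If zeros and empty strings can turn up in the same column, one option is to nest the two NULLIF calls. The following is a sketch that reuses the table and file from the question and assumes only the first column needs this treatment. Note that the outer comparison is numeric, so any non-numeric text in that field would also compare equal to 0 and become NULL.

```sql
LOAD DATA INFILE '/tmp/testdata.txt'
INTO TABLE moo
FIELDS TERMINATED BY ","
LINES TERMINATED BY "\n"
(@vone, two, three, four, five)
-- Inner NULLIF maps '' to NULL; outer NULLIF maps a zero value to NULL.
SET one = NULLIF(NULLIF(@vone, ''), 0)
;
```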