How do I truncate a CSV log file that another process is writing to via stdout redirection, without triggering a "_csv.Error: line contains NULL byte" error?
I have one process running rtlamr > log/readings.txt, which writes radio signal data to readings.txt. I don't think it matters what is writing to the file; any long-running process with redirected output will do.
I have a file watcher using watchdog (a Python file watcher) on that file, which triggers a function when the file changes. The function reads the file and updates a database.
Then I try to truncate readings.txt so that it doesn't grow indefinitely (or back it up).
file = open(dir_path+'/log/readings.txt', "w")
file.truncate()
file.close()
This corrupts readings.txt and generates the error (the start of the file contains garbage characters).
I tried moving the file instead of truncating it, in the hope that rtlamr would recreate a fresh file, but that only had the effect of stopping the pipe.
EDIT
I noticed that the charset changes from us-ascii to binary, but attempting to truncate the file with file = open(dir_path+'/log/readings.log', "w", encoding="us-ascii") does not do anything.
If you truncate a file [1] while another process has it open in "w" mode, that process will continue to write at its own, unchanged offset, making the file sparse. The low offsets, which no longer hold any data, will thus be read back as NUL (zero) bytes.
As per "x11 - Concurrent writing to a log file from many processes" (Unix & Linux Stack Exchange) and "Can two Unix processes simultaneous write to different positions in a single file?", each process that has a file open has its own offset in it, and an ftruncate() by another process doesn't change that.
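The effect can be reproduced in a few lines. The following is a minimal sketch, not the asker's setup: the function name and the sample rows are made up for illustration, and both "processes" are simulated with two file handles in one script. The producer is opened in binary write mode so that byte counts are the same on every platform.

```python
import os
import tempfile


def demo_truncate_under_writer(path):
    """Truncate a file while a non-append writer still holds it open."""
    # Producer: plain write mode, so it keeps its own file offset.
    producer = open(path, "wb")
    producer.write(b"first,line\n")  # 11 bytes; producer offset is now 11
    producer.flush()

    # Consumer: opening in write mode truncates the file to zero length.
    with open(path, "wb"):
        pass

    # The producer's offset is still 11, so this write lands at offset 11
    # and the gap in front of it is filled with NUL bytes when read back.
    producer.write(b"second,line\n")
    producer.flush()
    producer.close()

    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo_truncate_under_writer(os.path.join(tmp, "readings.txt")))
```

The leading run of NUL bytes in the result is what the csv module rejects with "line contains NULL byte".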
If you want the other process to react to truncation, it needs to have the file open in "a" (append) mode.
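As a rough illustration of the difference, here is the same kind of experiment with the producer opened in append mode. Again this is only a sketch with made-up names and data, using two handles in one script to stand in for two processes. If the producer's output is set up by the shell, redirecting with >> instead of > should have the same effect, since >> opens the file for appending.

```python
import os
import tempfile


def demo_truncate_under_appender(path):
    """Truncate a file while an append-mode writer still holds it open."""
    # Producer: append mode, so every write goes to the current end of file.
    producer = open(path, "ab")
    producer.write(b"first,line\n")
    producer.flush()

    # Consumer: truncate the file to zero length.
    with open(path, "wb"):
        pass

    # The end of file is now offset 0, so this write starts there and
    # no gap of NUL bytes is created.
    producer.write(b"second,line\n")
    producer.flush()
    producer.close()

    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo_truncate_under_appender(os.path.join(tmp, "readings.txt")))
```

Note that this only removes the NUL bytes; it does nothing about data that arrives between reading the file and truncating it.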
Your approach has fundamental flaws, too. For example, it's not atomic: you may (read: eventually will) truncate the file after the producer has added data but before you have read it, so that data would be lost.
Consider using a dedicated data-buffering utility such as buffer or pv instead, as per "Add a big buffer to a pipe between two commands".
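With a pipe in place of the intermediate file, the consuming side could read rows directly from its standard input. The sketch below assumes the producer emits CSV lines; consume and handle_row are hypothetical names, with handle_row standing in for whatever updates the database.

```python
import csv
import io
import sys


def consume(stream, handle_row):
    """Pass each non-empty CSV row from a text stream to handle_row.

    Returns the number of rows handled.
    """
    count = 0
    for row in csv.reader(stream):
        if row:  # csv.reader yields [] for blank lines; skip those
            handle_row(row)
            count += 1
    return count


# Hypothetical usage at the end of a pipeline, e.g.
#   producer | buffering-utility | python consumer.py
# where consumer.py ends with:
#   consume(sys.stdin, print)
```

Since nothing is ever written to a file, there is nothing to truncate and no window in which unread data can be discarded.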
[1] Which is superfluous because open(mode='w') already does that. Either truncate or reopen; there is no need to do both.