I would like to trim the last XY characters of every 4th line. The cutoff should be the difference between the character counts of line 4 and line 2, of line 8 and line 6, and so on.
For example: line 4 (29 characters) - line 2 (20 characters) = 9. So the last 9 characters of line 4 should be removed.
Input:
@V300059044L3C001R0010004402
AAGTAGATATCATGGAGCCG
+
FFFGFGGFGFGFFGFFGFFGGGGGFFFGG
@V300059044L3C001R0010009240
AAAGGGAGGGAGAATAAT
+
GFFGFEGFGFGEFDFGGEFFGGEDEGEGF
Output:
@V300059044L3C001R0010004402
AAGTAGATATCATGGAGCCG
+
FFFGFGGFGFGFFGFFGFFG
@V300059044L3C001R0010009240
AAAGGGAGGGAGAATAAT
+
GFFGFEGFGFGEFDFGGE
Running
awk 'NR%4==0 {$0=substr($0,1,a)} NR%2==0 {a=length($0)} {print $0}' input.txt
on input.txt yields:
@V300059044L3C001R0010004402
AAGTAGATATCATGGAGCCG
+
FFFGFGGFGFGFFGFFGFFG
@V300059044L3C001R0010009240
AAAGGGAGGGAGAATAAT
+
GFFGFEGFGFGEFDFGGE
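For anyone who wants the one-liner spelled out, here is a commented sketch of the same logic. It is not a different method, just the same idea with the two roles separated. The file name sample.fq and the variable name seqlen are my own choices for illustration (the one-liner uses a), and the sketch assumes every record is exactly four lines, as in the question.

```shell
# Recreate the sample from the question (sample.fq is a hypothetical file name).
cat > sample.fq <<'EOF'
@V300059044L3C001R0010004402
AAGTAGATATCATGGAGCCG
+
FFFGFGGFGFGFFGFFGFFGGGGGFFFGG
@V300059044L3C001R0010009240
AAAGGGAGGGAGAATAAT
+
GFFGFEGFGFGEFDFGGEFFGGEDEGEGF
EOF

trimmed=$(awk '
  # Line 2 of each record (the sequence): remember its length.
  NR % 4 == 2 { seqlen = length($0) }
  # Line 4 of each record (the quality string): keep the first seqlen characters.
  NR % 4 == 0 { $0 = substr($0, 1, seqlen) }
  # Print every line, modified or not.
  { print }
' sample.fq)

printf '%s\n' "$trimmed"
```

Lines 4 and 8 of the result should have the same length as lines 2 and 6; all other lines pass through unchanged.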
Thanks a lot!!! Does this command remove the characters at the end of the line by default?
Characters are only removed on every 4th line.
Yeah, right, but why does this command remove them starting at the end of the 4th line? Couldn't it also remove the characters starting at the beginning of the line?
No, substr($0,1,a) takes the entire line $0 and preserves a characters starting from the 1st, so whatever gets dropped comes from the end of the line.
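To make the substr behaviour concrete, here is a small sketch on a toy string (ABCDEFGHI and a=4 are arbitrary values for illustration, not taken from the question). The first call keeps a characters from the start, which is what the command above does. The second keeps the last a characters instead, which is one way to remove characters from the beginning of the line if that were ever wanted.

```shell
# Toy string, not from the question; a=4 is an arbitrary choice.
# Keep the first a characters (what is left over is dropped from the end).
head_part=$(echo 'ABCDEFGHI' | awk -v a=4 '{ print substr($0, 1, a) }')

# Keep the last a characters (what is left over is dropped from the beginning).
# With no third argument, substr returns everything from the start position onward.
tail_part=$(echo 'ABCDEFGHI' | awk -v a=4 '{ print substr($0, length($0) - a + 1) }')

echo "$head_part"   # ABCD
echo "$tail_part"   # FGHI
```

awk positions are 1-based, which is why the start of the tail is length($0) - a + 1 rather than length($0) - a.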