Note: This article is reproduced from serverfault.com.

awk bash cut fastq

How to trim every nth line?

Posted on 2020-12-17 13:11:58

I would like to cut off the first 9 characters of every 4th line. I could use cut -c 10-, but I don't know how to select only every 4th line without losing the remaining lines.

Input:

@V300059044L3C001R0010004402
AAGTAGATATCATGGAGCCG
+
FFFGFGGFGFGFFGFFGFFGGGGGFFFGG
@V300059044L3C001R0010009240
AAAGGGAGGGAGAATAATGG
+
GFFGFEGFGFGEFDFGGEFFGGEDEGEGF

Output:

@V300059044L3C001R0010004402
AAGTAGATATCATGGAGCCG
+
FGFFGFFGFFGGGGGFFFGG
@V300059044L3C001R0010009240
AAAGGGAGGGAGAATAATGG
+
FGEFDFGGEFFGGEDEGEGF

Questioner

gnikixam

RavinderSingh13 2020-12-17 21:28:19

Could you please try the following, written and tested with the shown samples in GNU awk.

awk 'FNR%4==0{print substr($0,10);next} 1' Input_file

Or, as per @tripleee's suggestion (in the comments), try:

awk '!(FNR%4) { $0 = substr($0, 10) }1' Input_file

Explanation: a detailed breakdown of the first command.

awk '                   ## Start of the awk program.
FNR%4==0{               ## True when the line number is divisible by 4 (every 4th line).
  print substr($0,10)   ## Print the line from the 10th character onward.
  next                  ## Skip the remaining statements for this line.
}
1                       ## Always true, so every other line is printed unchanged.
' Input_file            ## Name of the input file.
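
To check the behaviour against the sample from the question, the shorter form can be wrapped in a small function and fed one record on standard input. This is only a sketch: the function name trim_every_4th and the here-document are illustrative and not part of the original answer.

```shell
# Hypothetical helper: drop the first 9 characters of every 4th line.
# With no file arguments, awk reads standard input.
trim_every_4th() {
  awk '!(FNR % 4) { $0 = substr($0, 10) } 1' "$@"
}

# First record from the question, supplied via a here-document.
trim_every_4th <<'EOF'
@V300059044L3C001R0010004402
AAGTAGATATCATGGAGCCG
+
FFFGFGGFGFGFFGFFGFFGGGGGFFFGG
EOF
```

The first three lines should pass through unchanged and the fourth should come out as FGFFGFFGFFGGGGGFFFGG, matching the expected output in the question.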

tripleee 2020-12-17 13:27:08

Maybe even refactor down to awk '!(FNR%4) { $0 = substr($0, 10) }1'

gnikixam 2020-12-17 13:28:36

Worked perfectly to resolve the first issue, thanks!!!

RavinderSingh13 2020-12-17 13:30:56

@gnikixam, IMHO this one should address both cutting off 9 characters on every 4th line and the performance issue.

gnikixam 2020-12-17 13:44:40

Yeah, that's right. But the second aim is to additionally remove XY characters from the end of this line. For example: the last 3 characters of line 4, the last 5 characters of line 8, and so on. This is very time-consuming.

RavinderSingh13 2020-12-17 13:51:35

@gnikixam, I really thought both were the same requirement :) Do the lines where you want to remove trailing characters follow any specific sequence or logic in their line numbers? Kindly let me know.
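
The thread never settles how many trailing characters to remove from each line. As a rough sketch only, assuming the count is the same for every 4th line (an assumption; the comment above suggests it may vary), both trims could be done in a single awk pass. The function name trim_both_ends and the awk variables lead and tail are hypothetical.

```shell
# Hypothetical helper: on every 4th line, drop `lead` characters from the
# start and `tail` characters from the end; other lines pass through unchanged.
# Usage: trim_both_ends LEAD TAIL [file...]
trim_both_ends() {
  lead=$1
  tail=$2
  shift 2
  awk -v lead="$lead" -v tail="$tail" '
    !(FNR % 4) {
      len = length($0) - lead - tail                  # characters that remain
      $0 = (len > 0) ? substr($0, lead + 1, len) : "" # empty if nothing is left
    }
    1
  ' "$@"
}

# Example call: remove 9 leading and 3 trailing characters.
# trim_both_ends 9 3 Input_file
```

If the trailing count really differs per record, the rule that determines it would have to be known first; this sketch does not attempt to guess it.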
