
matrix - divide each column by max value/last value

Posted on 2020-11-30 05:24:10

I have a matrix like this:

A   25  27  50
B   35  37  475
C   75  78  80
D   99  88  76
0   234 230 681

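The last row is the sum of all elements in the column, and is therefore also the maximum value.

What I want to get is a matrix in which each value is divided by the last value in its column (e.g., for the first number in column 2, I want to use "25/234="):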

A   0.106837606837607   0.117391304347826   0.073421439060206
B   0.14957264957265    0.160869565217391   0.697503671071953
C   0.320512820512821   0.339130434782609   0.117474302496329
D   0.423076923076923   0.382608695652174   0.11160058737151

An answer given in another thread produces acceptable results for a single column, but I could not get it to loop over all columns.

$ awk 'FNR==NR{max=($2+0>max)?$2:max;next} {print $1,$2/max}' file file

(That answer was given here: Normalize column data with the maximum value of that column)

I would appreciate any help!

Questioner
steff2j
RavinderSingh13 2020-11-30 14:04:09

1st solution: Try the following, written and tested in GNU awk against the samples shown. It prints exactly 15 decimal places, as in the OP's expected output:

awk -v lines="$(wc -l < Input_file)" '
FNR==NR{
  if(FNR==lines){
    for(i=2;i<=NF;i++){ arr[i]=$i }
  }
  next
}
FNR<lines{
  for(i=2;i<=NF;i++){ $i=(arr[i]?sprintf("%0.15f",$i/arr[i]):"NaN") }
  print
}
' Input_file  Input_file

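2nd solution: If you do not need a fixed number of decimal places, try the following instead. The results are then printed in awk's default number format (CONVFMT, "%.6g"), i.e. about six significant digits.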

awk -v lines="$(wc -l < Input_file)" '
FNR==NR && FNR==lines{
  for(i=2;i<=NF;i++){ arr[i]=$i }
  next
}
FNR<lines && FNR!=NR{
  for(i=2;i<=NF;i++){ $i=(arr[i]?$i/arr[i]:"NaN") }
  print
}
' Input_file Input_file

OR (with the FNR==lines check placed inside the FNR==NR block):

awk -v lines="$(wc -l < Input_file)" '
FNR==NR{
  if(FNR==lines){
    for(i=2;i<=NF;i++){ arr[i]=$i }
  }
  next
}
FNR<lines{
  for(i=2;i<=NF;i++){ $i=(arr[i]?$i/arr[i]:"NaN") }
  print
}
' Input_file  Input_file
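For comparison, the one-liner quoted in the question can be extended to every column by tracking one maximum per column instead of a single max variable. The following is only a sketch under assumptions: the function name normalize_by_column_max is hypothetical, the file is whitespace-separated with labels in column 1, and, unlike the solutions above, it does not drop the totals row (that row simply becomes all 1s when it holds the column maxima).

```shell
# Hypothetical helper: divide columns 2..NF by each column's maximum.
normalize_by_column_max() {
  awk '
  FNR == NR {                        # first pass: record each column maximum
    for (i = 2; i <= NF; i++)
      if (!(i in max) || $i + 0 > max[i]) max[i] = $i + 0
    next
  }
  {                                  # second pass: divide by that maximum
    for (i = 2; i <= NF; i++)
      $i = (max[i] != 0 ? $i / max[i] : "NaN")
    print
  }' "$1" "$1"                       # the same file is read twice
}
```

Usage would be normalize_by_column_max Input_file. This variant does not depend on the row order, so it may suit data where the totals row is not guaranteed to come last.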

Explanation: a detailed, commented version of the 1st solution.

awk -v lines="$(wc -l < Input_file)" '       ##Start the awk program; lines holds the total number of lines in Input_file.
FNR==NR{                                     ##TRUE only while Input_file is being read the first time.
  if(FNR==lines){                            ##On the last line of the file, do the following.
    for(i=2;i<=NF;i++){ arr[i]=$i }          ##Store each field from the 2nd onward in arr, indexed by field number.
  }
  next                                       ##Skip the remaining rules during the first pass.
}
FNR<lines{                                   ##Second pass over Input_file: every line except the last one.
  for(i=2;i<=NF;i++){ $i=(arr[i]?sprintf("%0.15f",$i/arr[i]):"NaN") } ##Replace each field with field/arr[i] to 15 decimal places, or NaN when the divisor is 0.
  print                                      ##Print the modified line.
}
' Input_file  Input_file                     ##Read Input_file twice.
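The two-pass approach explained above needs wc -l and reads Input_file twice. As an alternative sketch (not the answerer's method), the rows can be buffered in memory and divided in an END block, which also works when the data arrives on a pipe. The function name normalize_by_last_row and the cell/nf array names are hypothetical, and the whole input is assumed to fit in memory.

```shell
# Hypothetical helper: divide columns 2..NF of every row by the last row.
normalize_by_last_row() {
  awk '
  {
    nf[NR] = NF                                   # remember the field count per row
    for (i = 1; i <= NF; i++) cell[NR, i] = $i    # buffer the whole input
  }
  END {
    for (r = 1; r < NR; r++) {                    # skip the last (totals) row
      line = cell[r, 1]                           # keep the row label as-is
      for (i = 2; i <= nf[r]; i++) {
        d = cell[NR, i]                           # divisor taken from the last row
        line = line OFS (d + 0 != 0 ? cell[r, i] / d : "NaN")
      }
      print line
    }
  }' "$@"
}
```

It can be called as normalize_by_last_row Input_file, or with no argument to read standard input. The quotients are formatted with awk's default "%.6g"; wrapping the division in sprintf("%0.15f", ...) would give the fixed 15 decimal places of the 1st solution.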