Replacing a repeated number in a file with random numbers

Question

I want to use sed to replace every occurrence of a number in a file with a random number, line by line. For example, if every line of my file contains the number 892, I would like to replace each occurrence with a unique random number between 800 and 900.

Input file:

temp11;djaxfile11;892
temp12;djaxfile11;892
temp13;djaxfile11;892
temp14;djaxfile11;892
temp15;djaxfile11;892

Expected output file:

temp11;djaxfile11;805
temp12;djaxfile11;846
temp13;djaxfile11;833
temp14;djaxfile11;881
temp15;djaxfile11;810

I am trying the following:

sed -i -- "s/;892/;`echo $RANDOM % 100 + 800 | bc`/g" file.txt

But it replaces every occurrence of 892 with the same random number between 800 and 900.

Actual output file:

temp11;djaxfile11;821
temp12;djaxfile11;821
temp13;djaxfile11;821
temp14;djaxfile11;821
temp15;djaxfile11;821

Could you please help me correct my code? Thanks in advance.

Recommended answer

With GNU sed, you could do something like

sed '/;892$/ { h; s/.*/echo $((RANDOM % 100 + 800))/e; x; G; s/892\n// }' filename

...but it would be much saner to do it with awk:

awk -F \; 'BEGIN { OFS = FS } $NF == 892 { $NF = int(rand() * 100 + 800) } 1' filename

To make sure that the random numbers are unique, amend the awk code as follows:

awk -F \; 'BEGIN { OFS = FS } $NF == 892 { do { $NF = int(rand() * 100 + 800) } while(seen[$NF]++) } 1' filename

Doing that with sed would be too crazy for me. Be aware that this only works if at most 100 lines in the file have a last field of 892: only the 100 values from 800 to 899 are available, and once they are all used the loop never terminates.
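One way to avoid the endless loop is to count how many numbers have been handed out and stop with an error once the range is exhausted. The following is a minimal sketch, not part of the original answer: the function name replace_unique is hypothetical, and the seed passed via -v assumes a shell such as bash that provides $RANDOM.

```shell
#!/usr/bin/env bash
# Sketch only: replace_unique is a hypothetical helper name.
# Reads the files given as arguments, or stdin when there are none.
replace_unique() {
  awk -F';' -v seed="$RANDOM" '
    BEGIN { OFS = FS; srand(seed) }    # seed so that runs differ
    $NF == 892 {
      if (used >= 100) {               # all of 800..899 are taken
        print "error: more than 100 lines end in 892" > "/dev/stderr"
        exit 1
      }
      # draw again for as long as the number was already used
      do { n = int(rand() * 100 + 800) } while (n in seen)
      seen[n] = 1
      used++
      $NF = n
    }
    1                                  # print every line
  ' "$@"
}

# Demo on sample data from the question
printf 'temp11;djaxfile11;892\ntemp12;djaxfile11;892\ntemp13;djaxfile11;892\n' | replace_unique
```

Lines whose last field is not 892 pass through unchanged. When the limit is hit, awk exits with status 1 after printing the lines processed so far.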

The sed code reads

/;892$/ {                              # if a line ends with ;892
  h                                    # copy it to the hold buffer
  s/.*/echo $((RANDOM % 100 + 800))/e  # replace the pattern space with the
                                       # output of echo $((...))
                                       # Note: this is a GNU extension
  x                                    # swap pattern space and hold buffer
  G                                    # append the hold buffer to the PS
                                       # the PS now contains line\nrandom number
  s/892\n//                            # remove the old field and the newline
}
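One caveat worth checking on your system: GNU sed runs the command for the e flag through /bin/sh, and where that shell is not bash (for example dash on Debian or Ubuntu) $RANDOM may be undefined, so every line could end up with the same value. A possible workaround, sketched here under the assumption that GNU sed and bash are both installed, is to call bash explicitly. The function name randomize_with_sed is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU sed (for the e flag) and bash on the PATH.
# randomize_with_sed is a hypothetical helper name.
randomize_with_sed() {
  sed '/;892$/ {
    h
    # \\ becomes one backslash, so sh sees "echo \$((...))" and passes
    # the arithmetic expansion to bash unexpanded
    s/.*/bash -c "echo \\$((RANDOM % 100 + 800))"/e
    x
    G
    s/892\n//
  }' "$@"
}

# Demo on sample data from the question
printf 'temp11;djaxfile11;892\ntemp12;djaxfile11;892\n' | randomize_with_sed
```

This starts one bash process per matching line, which is another reason the awk version is the more practical choice for large files.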

The awk code is much more straightforward. With -F \;, we tell awk to split the lines at semicolons, then

BEGIN { OFS = FS }  # output field separator is input FS, so the output
                    # is also semicolon-separated
$NF == 892 {        # if the last field is 892
                    # replace it with a random number
  $NF = int(rand() * 100 + 800)
}
1                   # print.
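A detail the code above leaves open is seeding. In some awk implementations, gawk among them, rand() returns the same sequence on every run unless srand() is called first, and srand() without an argument seeds from the time of day, so two runs within the same second can still match. The sketch below passes a seed in from the shell instead. It assumes a shell that provides $RANDOM (such as bash), and the function name randomize_last_field is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: randomize_last_field is a hypothetical helper name.
# Reads the files given as arguments, or stdin when there are none.
randomize_last_field() {
  awk -F';' -v seed="$RANDOM" '
    BEGIN { OFS = FS; srand(seed) }               # seed from the shell
    $NF == 892 { $NF = int(rand() * 100 + 800) }  # 800..899, may repeat
    1                                             # print every line
  ' "$@"
}

# Demo on sample data from the question
printf 'temp11;djaxfile11;892\ntemp12;djaxfile11;892\n' | randomize_last_field
```

As in the original one-liner, the numbers are random per line but not guaranteed to be unique.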

The amended awk code replaces

$NF = int(rand() * 100 + 800)

with

do {
  $NF = int(rand() * 100 + 800)
} while(seen[$NF]++)

...in other words, it keeps a table of the random numbers it has already used and keeps drawing numbers until it gets one it hasn't seen before.
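A different technique that avoids the retry loop altogether is to shuffle the range once and hand the numbers out in order. This is a swapped-in alternative rather than the answer's method, and the sketch assumes GNU coreutils shuf is available. The function name replace_from_shuffle is hypothetical. It needs a real file rather than stdin, because the file is read twice.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU coreutils shuf is installed.
# replace_from_shuffle is a hypothetical helper name.
replace_from_shuffle() {
  local file=$1 count pool
  # count the lines whose last field is 892
  count=$(awk -F';' '$NF == 892 { c++ } END { print c + 0 }' "$file")
  # draw that many distinct numbers from 800..899, space-separated
  pool=$(shuf -i 800-899 -n "$count" | tr '\n' ' ')
  awk -F';' -v pool="$pool" '
    BEGIN { OFS = FS; split(pool, p, " ") }
    $NF == 892 { $NF = p[++i] }        # take the next shuffled number
    1                                  # print every line
  ' "$file"
}

# Demo on sample data from the question
sample=$(mktemp)
printf 'temp11;djaxfile11;892\ntemp12;djaxfile11;892\ntemp13;djaxfile11;892\n' > "$sample"
replace_from_shuffle "$sample"
rm -f "$sample"
```

With more than 100 matching lines, shuf can only supply 100 numbers, so the remaining lines would get an empty last field. A real script should check the count first.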
