Problem description
I have a file with two columns, X and Y, containing positive, non-gridded data points (> 10^5 points).
1 0.9
0.9 1.1
0.5 1.25
2.6 0.9
3.1 2.6
2.9 2.55
4.1 0.9
1.2 6
5.5 2.5
6 4
4 7.2
. .
. .
I would like to generate an X-Y grid (with cell size binsize) over a selected range of those points. In addition, I would like to add a third column giving the number of original data points contained in the square area (binsize x binsize) associated with each vertex of the grid.
If binsize=5:
2.5 2.5 7
2.5 7.5 2
7.5 2.5 2
. . .
. . .
I would like to pass the data range and the binsize to the AWK program as parameters.
Thank you very much for your help.
Edit:
The binsize determines the width of the intervals in which I have to count the XY data points. The range input selects which x and y values are counted. For example, if I select x in [0,5] and y in [0,5], then I will only consider the first eight xy points for binning. My real dataset is very big.
Recommended answer
I think a solution could look something like this:
awk -v binsize=0.5 -v xmin=0 -v xmax=3 -v ymin=2 -v ymax=4 '
BEGIN {
    # Calculate number of x-bins and number of y-bins
    nx = int((xmax-xmin)/binsize)
    ny = int((ymax-ymin)/binsize)
    # Pre-zero all bins, else empty bins will not show up in the output
    for (x=0; x<nx; x++) {
        for (y=0; y<ny; y++) {
            output[x,y] = 0
        }
    }
}
{
    # Pick up x and y
    x = $1; y = $2
    # Count this sample only if it lies within the x-range and y-range.
    # The upper bounds are exclusive so that a point exactly at xmax or
    # ymax does not land in a bin index outside the grid.
    if (x>=xmin && x<xmax && y>=ymin && y<ymax) {
        xindex = int((x-xmin)/binsize)
        yindex = int((y-ymin)/binsize)
        output[xindex,yindex]++
        printf("DEBUG: x=%f, y=%f (line %d)\n", x, y, NR)
        printf("DEBUG: Incrementing bin [%d][%d]\n", xindex, yindex)
    }
}
END {
    # Print results: one row per x-bin, one tab-separated column per y-bin
    for (x=0; x<nx; x++) {
        for (y=0; y<ny; y++) {
            printf("%d\t", output[x,y])
        }
        printf("\n")
    }
}' points.txt
With this as input:
0.4 2.1
0.39 2.02
0.1 2.4
1 0.9
0.9 1.1
0.5 1.25
2.6 0.9
3.1 2.6
2.9 2.55
you will get this output:
DEBUG: x=0.400000, y=2.100000 (line 1)
DEBUG: Incrementing bin [0][0]
DEBUG: x=0.390000, y=2.020000 (line 2)
DEBUG: Incrementing bin [0][0]
DEBUG: x=0.100000, y=2.400000 (line 3)
DEBUG: Incrementing bin [0][0]
DEBUG: x=2.900000, y=2.550000 (line 9)
DEBUG: Incrementing bin [5][1]
3 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 1 0 0
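The matrix above has one row per x-bin and one column per y-bin, whereas the question asks for three columns (x, y, count). A minimal sketch of that variant follows. It assumes the x and y values in the question's expected output are bin centers, that the ranges are half-open ([min, max)), and that the range is a whole multiple of binsize. The function name bin2d and its argument order are made up for illustration.

```shell
# bin2d BINSIZE XMIN XMAX YMIN YMAX FILE
# Hypothetical helper: prints "x_center y_center count" for every bin.
bin2d() {
  awk -v binsize="$1" -v xmin="$2" -v xmax="$3" -v ymin="$4" -v ymax="$5" '
  BEGIN {
    # Number of bins along each axis
    nx = int((xmax - xmin) / binsize)
    ny = int((ymax - ymin) / binsize)
  }
  # Count only samples inside the half-open range [min, max)
  $1 >= xmin && $1 < xmax && $2 >= ymin && $2 < ymax {
    count[int(($1 - xmin) / binsize), int(($2 - ymin) / binsize)]++
  }
  END {
    # Bins that were never incremented are unset and print as 0 with %d
    for (i = 0; i < nx; i++)
      for (j = 0; j < ny; j++)
        printf "%g %g %d\n", xmin + (i + 0.5) * binsize, ymin + (j + 0.5) * binsize, count[i, j]
  }' "$6"
}

# Example call (points.txt as in the question):
#   bin2d 5 0 10 0 10 points.txt
```

Because the counts live in an associative array keyed by bin index, memory grows with the number of bins rather than the number of input points, which should matter for a file with more than 10^5 lines.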