





 New connection: ( [session: e696835c]
    2016-04-29 21:13:59+0000 [SSHService ssh-userauth on HoneyPotTransport,3,] login attempt [user1/test123] failed
    2016-04-29 21:14:10+0000 [SSHService ssh-userauth on HoneyPotTransport,3,] login attempt [user1/test1234] failed
    2016-04-29 21:14:13+0000 [SSHService ssh-userauth on HoneyPotTransport,3,] login attempt [user1/test123] failed


I want to output to file a result like this:


The "Occurrences" value is the number of times a given combination of login details [username and password] has been recorded in the file. For example, user1/test123 is recorded twice from the same IP. How can I do this? At the moment I have two while loops, with a subroutine called inside the first one, like so:


sub counter(){

        $result = 0;
        #open(FILE2, $cowrie) or die "Can't open '$cowrie': $!";
        while(my $otherlines = <LOG2>){

                if($otherlines =~ /login attempt/){
                        ($user, $password) = (split /[\s:\[\]\/]+/, $otherlines)[-3,-2];
                        if($_[1] =~ /$user/ && $_[2] =~ /$password/){
                        }#if ip matches i think i have to do this with split
                }
        }

        #print "TEST\n";
        #print "Combo $_[0] and $_[1]\n";

        #print "$result";
        return $result;
}


sub cowrieExtractor(){

        open(FILE2, $cowrie) or die "Can't open '$cowrie': $!";

        open(LOG2, $path2) or die "Can't open '$path2': $!";

        $seperator = chr(42);
        #To output user and password of login attempt, set $ip variable to the contents of array at that x position of new
        #connection to match the ip of the login attempt
        print FILE2 "SourcePort"."$seperator".

        $ip = "";
        $port = "";
        $usr = "";
        $pass = "";
        $status = "";
        $frequency = 0;

        #Given this is a user/pass attempt honeypot logger, I will use a wide character to reduce the possibility of stopping
        #the WEKA CSV loader from functioning by using smileyface as seperators.

        while(my $lines = <LOG2>){

                if($lines =~ /New connection/){
                        ($ip, $port) = (split /[\[\]\s:()]+/, $lines)[7,8];
                }

                if($lines =~ /login attempt/){#and the ip of the new connection
                        if($lines =~ /$ip/){
                                ($usr, $pass, $status) = (split /[\s:\[\]\/]+/, $lines)[-3,-2,-1];

                                $frequency = counter($ip, $usr, $pass);

                                #print $frequency;
                                if($ip && $port && $usr && $pass && $status ne ""){
                                        print FILE2 join "$seperator",($port, $status, $frequency, $end);
                                        print FILE2 "\n";
                                }
                        }
                }
        }
}




Right now I am getting 0 under Occurrences in the output. When I tested, it appears to come from the value I initialize $result to in the subroutine (0), which means the if statement inside the subroutine is not working properly. Any help?



Here is a basic way to get expected output. Questions about the context (purpose) remain.

use warnings;
use strict;

my $file = 'logfile.txt';
open my $fh_in, '<', $file or die "Can't open '$file': $!";

# Assemble results for required output in data structure:
# %rept = { $port => { $usr => { $status => $freq } } };

my %rept;
my ($ip, $port);

while (my $line = <$fh_in>) {
    # A new connection sets the ip/port that the following login lines belong to
    if ($line =~ /New connection/) {
        ($ip, $port) = $line =~ /New connection:\s+([^:]+):(\d+)/;
        next;
    }

    # Capture the whole "user/password" string and the status word after it
    my ($usr, $status) =  $line =~ m/login\ attempt \s+ \[ ( [^\]]+ ) \] \s+ (\w+)/x;
    if ($usr and $status) {
        $rept{$port}{$usr}{$status}++;
    }
    else { warn "Line with an unexpected format:\n$line" }
}
close $fh_in;

# use Data::Dumper;
# print Dumper \%rept;

print "Port,Status,Occurences\n";
foreach my $port (sort keys %rept) {
    foreach my $usr (sort keys %{$rept{$port}}) {
        foreach my $stat ( sort keys %{$rept{$port}{$usr}} ) {
            print "$port,$stat,$rept{$port}{$usr}{$stat}\n";
        }
    }
}


With your input copied into a file logfile.txt this prints

    Port,Status,Occurences
    64400,failed,2
    64400,failed,1


I take the whole user1/test123 (etc) string to identify the user. This can be changed in the regex as needed. Note that this structure will not let you query or organize the data in very different ways; it mostly collects what is needed for the required output. Please let me know if explanations are needed.
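
If the user name and the password are wanted as separate fields, the regex can capture them individually. The following is only a sketch: the sample line is copied from the log excerpt in the question, and the assumption that the user name is everything before the first slash (and that neither field contains a closing bracket) is mine, not something the log format guarantees.

```perl
use strict;
use warnings;

# Sample line taken from the log excerpt in the question.
my $line = '2016-04-29 21:13:59+0000 [SSHService ssh-userauth on HoneyPotTransport,3,] '
         . 'login attempt [user1/test123] failed';

# Capture user, password and status separately.
# Assumption: the user name is everything up to the first slash.
my ($usr, $pass, $status) =
    $line =~ m{login\ attempt \s+ \[ ([^/\]]+) / ([^\]]*) \] \s+ (\w+)}x;

print "$usr $pass $status\n";
```

With three captures, the counting statement could use one more level of keys, for example $rept{$port}{$usr}{$pass}{$status}++, at the cost of one more nested loop when printing.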


First, I strongly recommend a good reading of some of the many materials available. A good start is surely the standard tutorial on Perl references (perlreftut), as well as the cookbook of sorts on Perl data structures (perldsc).


The hash used to collect data has keys which are port numbers, and each of them has for its value a hash reference (or, rather, an anonymous hash). Each of these hashes has keys which are users, which for their values have, again, hash references. The keys for these are the possible values of status, so there are two keys (failed and succeeded). Their values are frequencies. This kind of 'nesting' is a complex data structure.

There is another important thing. The first time the statement $rept{$port}{$usr}{$status}++ is seen, the whole hierarchy is created, so the key $port did not need to exist beforehand. Importantly, this autovivification happens even if a structure is merely queried for values (unless it actually exists already).
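
The autovivification behaviour described here can be seen in a small sketch. The ports 22 and 2222 and the user strings 'root/toor' and 'a/b' are made-up illustration values, not data from the log.

```perl
use strict;
use warnings;

my %rept;

# Incrementing a deeply nested element creates every missing level.
$rept{64400}{'user1/test123'}{failed}++;
my $count = $rept{64400}{'user1/test123'}{failed};

# Merely reading a nested value also creates the intermediate levels.
my $n = $rept{22}{'root/toor'}{failed};    # $n is undef, but $rept{22} now exists
my $created_by_read = exists $rept{22} ? 1 : 0;

# Checking each level with exists avoids that side effect.
my $m = (exists $rept{2222} && exists $rept{2222}{'a/b'})
      ? $rept{2222}{'a/b'}{failed}
      : undef;
my $created_by_guarded_read = exists $rept{2222} ? 1 : 0;

print "count after increment: $count\n";
print "22 created by plain read: $created_by_read\n";
print "2222 created by guarded read: $created_by_guarded_read\n";
```

This is why a lookup for a port that never appeared in the log should test with exists level by level, otherwise empty entries end up in the hash and show up when it is traversed later.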


Going through your example, after the first login attempt line is processed the hash is:

%rept = { '64400' => { 'user1/test123' => { 'failed' => 1 } } }


In the second iteration the same port is seen but a new user, so new data is added to the second-level anonymous hash. The key with the new user is created, with its value being a (new) anonymous hash, with status => count. The whole hash is:

%rept = {
    '64400' => {
        'user1/test123'  => { 'failed' => 1 },
        'user1/test1234' => { 'failed' => 1 },
    }
}


In the next iteration the same port is seen with one of the already existing users and, as it happens, with a status (failed) which also exists. Thus the count for that status is incremented.

The whole structure can handily be seen using, for example, the Data::Dumper package. The commented out lines in the code above would produce

$VAR1 = {
    '64400' => {
        'user1/test123' => {
                                'failed' => 2
                           },
        'user1/test1234' => {
                                'failed' => 1
                            }
    }
};


As we keep processing lines, new keys are added as needed (ports, users, status) with the full hierarchy down to the count (1 the first time) or, if an existing combination is encountered, its count is incremented. The generated data structure can be traversed and used as seen in the code, for example. Please also see the plentiful documentation for more on that.
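
As one more illustration of traversing the structure, the nested hash can be flattened into rows and sorted by count. This is a sketch only: the hash literal is copied from the Data::Dumper output shown earlier, and the row layout [port, user, status, count] is my own choice.

```perl
use strict;
use warnings;

# Structure copied from the Data::Dumper output shown earlier.
my %rept = (
    '64400' => {
        'user1/test123'  => { 'failed' => 2 },
        'user1/test1234' => { 'failed' => 1 },
    },
);

# Flatten the three levels into [port, user, status, count] rows.
my @rows;
for my $port (keys %rept) {
    for my $usr (keys %{ $rept{$port} }) {
        for my $status (keys %{ $rept{$port}{$usr} }) {
            push @rows, [ $port, $usr, $status, $rept{$port}{$usr}{$status} ];
        }
    }
}

# Most frequent first; ties are broken by user name for a stable order.
my @sorted = sort { $b->[3] <=> $a->[3] or $a->[1] cmp $b->[1] } @rows;

print join(',', @$_), "\n" for @sorted;
```

For the sample data this lists user1/test123 (count 2) before user1/test1234 (count 1).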


