
Problem description

To start: I know this system will have flaws!

NOTE: I'm adding a few other languages because I don't find this problem specific to PHP. A JavaScript or jQuery solution would work; I could change the language. It's the method I am after!

What: I am trying to parse a string to determine what the user desires.

The idea is that the string is generated from speech.

Example 1: Turn my kitchen lights on and my bedroom and living room lights off.

Example 2: Turn my kitchen lights on and my bedroom lights on and living room lights off.

Example 3: Turn my kitchen and my bedroom and living room lights off.

This is an overly simplified example, but please note that I want to scale beyond these three rooms and beyond just controlling the lights. Example: outside ceiling fan on...

How: I am currently using a few while loops to iterate through an array and checking whether certain strings are in the array.

More how: My idea was to first split the string on "and". I then check each piece for an "on" or "off". If a piece does not have an "on" or "off", I join it with the next piece.
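
For comparison, here is a minimal JavaScript sketch of that split-and-join idea. It is only an illustration, not the code from the question: `groupCommands` and the `leftover` field are hypothetical names, and it assumes every command ends with the whole word "on" or "off".

```javascript
// Hypothetical helper: groups "and"-separated pieces so that every group
// ends with the piece that carries the "on"/"off" keyword.
function groupCommands(input) {
  // Drop one trailing period, then split on the word "and"
  const pieces = input.trim().replace(/\.$/, '').split(/\s+and\s+/i);
  // Whole-word match, so "on" inside another word does not count
  const hasState = (piece) => /\b(on|off)\b/i.test(piece);
  const groups = [];
  let pending = [];
  for (const piece of pieces) {
    pending.push(piece);
    if (hasState(piece)) {
      groups.push(pending); // this piece closes the current group
      pending = [];
    }
  }
  // Pieces that never reached an on/off keyword are returned separately
  return { groups, leftover: pending };
}

const result = groupCommands('kitchen lights on and bed and living lights off');
console.log(result.groups);
// [ [ 'kitchen lights on' ], [ 'bed', 'living lights off' ] ]
```

Keeping the pieces of a group in an array, instead of concatenating them into one string, makes it easier to tell later which rooms share the same state.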

Help: I would love to clean this concept up as well as see someone else's ideas... I am up for anything.

Thanks,
JT

Code:

$input = 'kitchen lights on and bed and living lights off';
$output = preg_split( "/ (and) /", $input );
$num = (int)count($output);
$i=0;

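while($i<$num){
    // Does this piece already carry its own on/off keyword?
    $hasState = (strpos($output[$i],'on') !== false)||(strpos($output[$i],'off') !== false);
    // If not, fold it into the next piece, provided a next piece exists and has a keyword
    if (!$hasState && isset($output[$i+1])
        && ((strpos($output[$i+1],'on') !== false)||(strpos($output[$i+1],'off') !== false))) {
        $output[$i+1] .= ' + '.$output[$i];
        unset($output[$i]);
    }

    $i++;
}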
$output = array_values($output);
$i=0;
$num = (int)count($output);
echo '<br>';
while($i<$num){
    if ((strpos($output[$i],'lights') !== false)&&(strpos($output[$i],'on') !== false)&&(strpos($output[$i],'kitchen') !== false)){
        echo 'kitchen lights on<br>';
    }
    if ((strpos($output[$i],'lights') !== false)&&(strpos($output[$i],'off') !== false)&&(strpos($output[$i],'kitchen') !== false)){
        echo 'kitchen lights off<br>';
    }
    if ((strpos($output[$i],'lights') !== false)&&(strpos($output[$i],'on') !== false)&&(strpos($output[$i],'living') !== false)){
        echo 'living lights on<br>';
    }
    if ((strpos($output[$i],'lights') !== false)&&(strpos($output[$i],'off') !== false)&&(strpos($output[$i],'living') !== false)){
        echo 'living lights off<br>';
    }
    if ((strpos($output[$i],'lights') !== false)&&(strpos($output[$i],'on') !== false)&&(strpos($output[$i],'bed') !== false)){
        echo 'bed lights on<br>';
    }
    if ((strpos($output[$i],'lights') !== false)&&(strpos($output[$i],'off') !== false)&&(strpos($output[$i],'bed') !== false)){
        echo 'bed lights off<br>';
    }
    $i++;
}

Code trial 2: Note: This handles all the above examples!

<?php
//works list
$inp[]='turn the lights in the bedroom on';
$inp[]='Turn on the bedroom light';
$inp[]='turn on the lights in the bedroom';
$inp[]='Turn my kitchen and my bedroom and living room lights off.';
$inp[]='Turn the light in the kitchen on and the fan in the bedroom off';
$inp[]='Turn my kitchen lights on and my bedroom and living room lights off';
$inp[]='Turn my kitchen fan and my bedroom lights on and living room lights off.';
$inp[]='Turn my kitchen lights on and my bedroom lights on and living room lights off';
$inp[] = 'kitchen lights on and bath and living lights off';
$inp[] = 'flip on the lights in the living room';
$inp[] = 'turn on all lights';

//does not work list
//$inp[] = 'turn on all lights but living';

foreach ($inp as $input){

    // Strip surrounding whitespace and any trailing periods
    $input = rtrim(trim($input), ' .');

    $words = explode(" ", $input);

    // Vocabulary: every word not listed here is discarded
    $state = array('and','but','on','off','all','living','bed','bedroom','bath','kitchen','dining','light','lights','fan','tv');
    $result = array_intersect($words, $state);
    $result = implode(" ", $result);
    $result = trim($result);
    //$result = preg_split('/(and|but)/',$input,-1, PREG_SPLIT_DELIM_CAPTURE);
    $result = preg_split( "/ (and|but) /",  $result );
    //$result = explode("and", $result);

    // Break each "and"/"but"-separated piece into its words
    $sep=array();

    foreach($result as $string){
        $word = explode(" ", $string);
        $sep[]=$word;
    }

    $test=array();
    $num = (int)count($sep);

    $i=0;

    while($i<($num)){
        $result = (int)count(array_intersect($sep[$i], $state));
        $j=$i;

        // Fold pieces together until reaching one with three keywords
        // (for example room, device and state)
        while($result<=3)
        {
            $imp = implode(" ", $sep[$j]);
            if(isset($test[$i])){$test[$i]=$imp.' '.$test[$i];}
            else{$test[$i]=$imp;}

            if ($result>=3){$j++;break;}
            $result = (int)count(array_intersect($sep[++$j], $state));
        }
        $i=$j;
    }

    print_r($test);
    echo '<br>';
}


?>


Recommended answer

That's certainly not the most efficient solution, but here's one. You can definitely improve it, for example by caching the regular expressions, but you get the idea. The last item in every sub-array is the operation.

DEMO

var s = 'Turn my kitchen lights on and my bedroom lights on and living room lights off and my test and another test off',
    r = s.replace(/^Turn|\s*my/g, '').match(/.+? (on|off)/g).map(function(item) {
        var items = item.trim().replace(/^and\s*/, '').split(/\s*and\s*/),
            last = items.pop().split(' '),
            op = last.pop();
        return items.concat([last.join(' '), op]);
    });

console.log(r);



The logic is quite simple actually, perhaps too simple:

var s = 'Turn my kitchen lights on and my bedroom lights on and living room lights off and my test and another test off',
    r = s
        .replace(/^Turn|\s*my/g, '') //remove noisy words
        .match(/.+? (on|off)/g) //capture all groups of [some things][on|off]
        //for each of those groups, generate a new array from the returned results
        .map(function(item) {
            var items = item.trim()
                    .replace(/^and\s*/, '') //remove and[space] at the beginning of string
                    //split on and to get all things, for instance if we have
                    //test and another test off, we want ['test', 'another test off']
                    .split(/\s*and\s*/),
                //split the last item on spaces, with previous example we would get
                //['another', 'test', 'off']
                last = items.pop().split(' '),
                op = last.pop(); //on/off will always be the last item in the array, pop it
            //items now contains ['test'], concatenate with the array passed as argument
            return items.concat(
                [
                    //last is ['another', 'test'], rejoin it together to give 'another test'
                    last.join(' '),
                    op //this is the operation
                ]
            );
        });
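
Since the last item in every sub-array is the operation, the result can be flattened into one command per target. The following is just a sketch: `toCommands` is a hypothetical helper, and `parsed` is hand-written sample data in the same shape as the sub-arrays described above.

```javascript
// Sample data shaped like the sub-arrays above: targets first, operation last
const parsed = [
  ['kitchen lights', 'on'],
  ['test', 'another test', 'off'],
];

// Hypothetical helper: pairs every target in a group with that group's operation
function toCommands(groups) {
  return groups.flatMap((group) => {
    const op = group[group.length - 1];
    return group.slice(0, -1).map((target) => ({ target, op }));
  });
}

console.log(toCommands(parsed));
// [
//   { target: 'kitchen lights', op: 'on' },
//   { target: 'test', op: 'off' },
//   { target: 'another test', op: 'off' }
// ]
```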

EDIT: At the time I posted the answer, I hadn't realized how complex and flexible you needed this to be. The solution I provided only works for sentences structured as in my example, with identifiable noise words and a specific command order. For something more complex, you will have no choice but to create a parser, as @SpaceDog suggested. I will try to come up with something as soon as I have enough time.
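
To give a rough idea of what a parser-style approach could look like, here is a small keyword-driven sketch in JavaScript. It is not the parser @SpaceDog had in mind and not a finished solution. The vocabulary tables and the names `ROOMS`, `DEVICES`, `STATES` and `parseCommands` are assumptions made for illustration, and the heuristics only cover sentences like the ones in the question (nothing like "all lights but living").

```javascript
// Hypothetical vocabulary; extend these tables to add rooms or devices.
const ROOMS = new Set(['kitchen', 'bedroom', 'living', 'bath', 'dining']);
const DEVICES = new Map([['light', 'light'], ['lights', 'light'], ['fan', 'fan'], ['tv', 'tv']]);
const STATES = new Set(['on', 'off']);

function parseCommands(sentence) {
  const tokens = sentence.toLowerCase().match(/[a-z]+/g) || [];
  const commands = [];
  let pending = [];        // targets waiting for a state: { room, device }
  let looseDevice = null;  // device named before its room ("the fan in the bedroom")
  let leadingState = null; // state named before its targets ("turn on the ...")

  const flush = (state) => {
    // Targets without their own device inherit the last device named in the group
    const named = pending.filter((t) => t.device !== null);
    const fallback = named.length ? named[named.length - 1].device : null;
    for (const t of pending) {
      commands.push({ room: t.room, device: t.device || fallback, state });
    }
    pending = [];
    looseDevice = null;
    leadingState = null;
  };

  for (const token of tokens) {
    if (ROOMS.has(token)) {
      pending.push({ room: token, device: looseDevice });
      looseDevice = null;
    } else if (DEVICES.has(token)) {
      const device = DEVICES.get(token);
      const last = pending[pending.length - 1];
      if (last && last.device === null) last.device = device;
      else looseDevice = device;
    } else if (STATES.has(token)) {
      if (pending.length && leadingState === null) {
        flush(token); // "kitchen lights on": the state follows its targets
      } else {
        if (pending.length) flush(leadingState); // close the previous "on ..." group
        leadingState = token; // "turn on the ...": the state precedes its targets
      }
    }
    // Every other word ("turn", "my", "the", "and", "room", ...) is ignored
  }
  if (pending.length && leadingState !== null) flush(leadingState);
  return commands;
}

console.log(parseCommands('Turn my kitchen lights on and my bedroom and living room lights off.'));
// [
//   { room: 'kitchen', device: 'light', state: 'on' },
//   { room: 'bedroom', device: 'light', state: 'off' },
//   { room: 'living', device: 'light', state: 'off' }
// ]
```

Because words are classified through lookup tables rather than a chain of hard-coded checks, adding a room or a device only means adding an entry to a table.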

