问题描述
假设我想以简单的匹配方式找到区分两个类的一组特征/属性,我可以在 prolog 中使用 clpfd 来做到这一点吗?
c_s_mining(Features,Value):-特征 = [F1,F2,F3,F4],功能 ins 0..1,示例A = [A1,A2,A3,A4],示例 B =[B1,B2,B3,B4],示例 C =[C1,C2,C3,C4],A1#=0, A2#=1,A3#=0,A4#=1,B1#=0, B2#=1,B3#=0,B4#=1,C1#=1, C2#=0,C3#=0,C4#=1,示例 D =[D1,D2,D3,D4],例子E =[E1,E2,E3,E4],示例 Q =[Q1,Q2,Q3,Q4],D1#=1,D2#=0,D3#=1,D4#=0,E1#=1,E2#=0,E3#=1,E4#=0,Q1#=0,Q2#=1,Q3#=1,Q4#=0,正数 =[ExampleA,ExampleB,ExampleC],负数 = [ExampleD,ExampleE,ExampleQ],0..sup 中的 TP,0..sup 中的 FP,封面(功能,正面,TP),封面(功能,底片,FP),inf..sup 中的值,值#= TP-FP.封面(功能,示例,Number_covered):-findall(*,(member(E,Examples),E=Features),Covers), length(Covers,Number_covered).
每个例子由四个二元特征描述,有三个正例(A、B、C)和三个反例(D、E、Q).
一个示例被一组选定的特征覆盖,如果它们匹配.例如,如果 Features
与 [0,1,0,1]
统一,那么这将匹配两个正数和 0 个负数.
我将 Value
设置为等于 TP
(真阳性) - TN
(真阴性).我想最大化价值并找到相应的特征集.
我查询?-c_s_mining(Features,Value),labelling([max(Value)],[Value]).
我期望的答案是:Features =[0,1,0,1], Value =2
但我得到 Features =[_G1,_G2,_G3,G4],Value =0,G1在0..1,G2在0..1,G3在0..1,G4在0..1.
CLP(FD) 约束的具体化
要推理什么匹配,什么不匹配,请使用约束具体化:它允许您将约束的真值反映到表示布尔值的 CLP(FD) 变量中.>
您可以使用这些值进行算术运算来表示匹配示例的数量等.
例如,在你的情况下,你可以写:
:- use_module(library(clpfd)).c_s_mining(特征,价值):-示例A = [0,1,0,1],示例B = [0,1,0,1],示例C = [1,0,0,1],示例 D = [1,0,1,0],示例E = [1,0,1,0],ExampleQ = [0,1,1,0],same_length(特征,ExampleA),功能 ins 0..1,正数 = [ExampleA,ExampleB,ExampleC],负数 = [ExampleD,ExampleE,ExampleQ],封面数量(功能,正面,TP),封面数量(功能,底片,FP),值#= TP-FP.封面编号(功能,示例,编号):-地图列表(封面_(特征),例子,数字),总和(数字,#=,数字).覆盖_([F1,F2,F3,F4],[E1,E2,E3,E4],覆盖): -覆盖#(F1#=E1#/\F2#=E2#/\F3#=E3#/\F4#=E4).
然后使用labeling/2
的优化选项先得到最大值:
另请注意,我删除了一些多余的约束,例如 Value in inf..sup
,因为约束求解器可以自行解决.
CLP(B):布尔约束的声明性替代方案
对于此类布尔模式的情况,还可以查看 CLP(B):基于布尔 变量的约束逻辑编程,例如在 SICStus Prolog 和 SWI 中可用.使用 CLP(B) 需要您以稍微不同的方式制定搜索,因为它缺少 CLP(FD) 的强大标记选项.但是,与 CLP(FD) 相比,CLP(B) 是完整的,并且可以更早地检测到不一致和隐含的约束.
在下面的代码中,我使用 CLP(FD) 来指导搜索最佳值,然后使用 CLP(B) 来说明实际约束.labeling/1
的最后调用(注意这是来自 library(clpb)
,不要与 CLP(FD) 的 labeling/2
) 用于确保所有 CLP(B) 变量的接地值.在它出现的时候,它在某种意义上只是一种形式:我们已经知道在这一点上有一个解决方案,这要归功于 CLP(B) 的完整性.
:- use_module(library(clpb)).:- use_module(library(clpfd)).c_s_mining(特征,价值):-示例A = [0,1,0,1],示例B = [0,1,0,1],示例C = [1,0,0,1],示例 D = [1,0,1,0],示例E = [1,0,1,0],ExampleQ = [0,1,1,0],same_length(特征,ExampleA),正数 = [ExampleA,ExampleB,ExampleC],负数 = [ExampleD,ExampleE,ExampleQ],[TP,FP] ins 0..3, %(在这种情况下)值#= TP-FP,标签([max(Value)], [TP,FP]),封面数量(功能,正面,TP),封面数量(功能,底片,FP),标签(特征).封面编号(功能,示例,编号):-地图列表(封面_(特征),例子,数字),sat(card([Number], Numbers)).覆盖_([F1,F2,F3,F4],[E1,E2,E3,E4],覆盖): -sat(Covered =:= ((F1=:=E1)*(F2=:=E2)*(F3=:=E3)*(F4=:=E4))).
Say I want to find the set of features/attributes that differentiate two classes in a simple matching manner can I use clpfd in prolog to do this?
c_s_mining(Features,Value):-
Features = [F1,F2,F3,F4],
Features ins 0..1,
ExampleA = [A1,A2,A3,A4],
ExampleB =[B1,B2,B3,B4],
ExampleC =[C1,C2,C3,C4],
A1 #=0, A2#=1,A3#=0,A4#=1,
B1 #=0, B2#=1,B3#=0,B4#=1,
C1 #=1, C2#=0,C3#=0,C4#=1,
ExampleD =[D1,D2,D3,D4],
ExampleE =[E1,E2,E3,E4],
ExampleQ =[Q1,Q2,Q3,Q4],
D1#=1,D2#=0,D3#=1,D4#=0,
E1#=1,E2#=0,E3#=1,E4#=0,
Q1#=0,Q2#=1,Q3#=1,Q4#=0,
Positives =[ExampleA,ExampleB,ExampleC],
Negatives = [ExampleD,ExampleE,ExampleQ],
TP in 0..sup,
FP in 0..sup,
covers(Features,Positives,TP),
covers(Features,Negatives,FP),
Value in inf..sup,
Value #= TP-FP.
covers(Features,Examples,Number_covered):-
findall(*,(member(E,Examples),E=Features),Covers), length(Covers,Number_covered).
Each example is described by four binary features, and there are three positive examples (A,B,C) and three negative examples (D,E,Q).
An example is covered by a set of selected features if they match.So for example if Features
is unified with [0,1,0,1]
, then this will match two positives and 0 negatives.
I set Value
to be equal to TP
(true positives) - TN
(true negatives). I want to maximise Value and find the corresponding set of features.
I query ?-c_s_mining(Features,Value),labelling([max(Value)],[Value]).
The answer I expect is: Features =[0,1,0,1], Value =2
but I get Features =[_G1,_G2,_G3,G4],Value =0, G1 in 0..1, G2 in 0..1, G3 in 0..1, G4 in 0..1.
Reification of CLP(FD) constraints
To reason about what is matched and what is not, use constraint reification: It allows you to reflect the truth value of a constraint into a CLP(FD) variable denoting a Boolean value.
You can perform arithmetic with such values to denote the number of matched examples etc.
For example, in your case, you can write:
:- use_module(library(clpfd)).
c_s_mining(Features, Value) :-
ExampleA = [0,1,0,1],
ExampleB = [0,1,0,1],
ExampleC = [1,0,0,1],
ExampleD = [1,0,1,0],
ExampleE = [1,0,1,0],
ExampleQ = [0,1,1,0],
same_length(Features, ExampleA),
Features ins 0..1,
Positives = [ExampleA,ExampleB,ExampleC],
Negatives = [ExampleD,ExampleE,ExampleQ],
covers_number(Features, Positives, TP),
covers_number(Features, Negatives, FP),
Value #= TP-FP.
covers_number(Features, Examples, Number):-
maplist(covers_(Features), Examples, Numbers),
sum(Numbers, #=, Number).
covers_([F1,F2,F3,F4], [E1,E2,E3,E4], Covered) :-
Covered #<==> (F1#=E1 #/\ F2#=E2 #/\ F3#=E3 #/\ F4#=E4).
And then use the optimisation options of labeling/2
to get largest values first:
?- c_s_mining(Fs, Value), labeling([max(Value)], Fs). Fs = [0, 1, 0, 1], Value = 2 ; Fs = [1, 0, 0, 1], Value = 1 ; Fs = [0, 0, 0, 0], Value = 0 ; etc.
Notice also that I have removed some superfluous constraints, such as Value in inf..sup
, since the constraint solver can figure them out on its own.
CLP(B): A declarative alternative for Boolean constraints
For the case of such Boolean patterns, also check out CLP(B): Constraint Logic Programming over Boolean variables, available for example in SICStus Prolog and SWI. Using CLP(B) requires you formulate the search a bit differently, since it lacks the powerful labeling options of CLP(FD). However, in contrast to CLP(FD), CLP(B) is complete and may detect inconsistencies as well as entailed constraints much earlier.
In the following code, I am using CLP(FD) to guide the search for optimal values, and then use CLP(B) to state the actual constraints. A final call of labeling/1
(note that this is from library(clpb)
, not to be confused with CLP(FD)'s labeling/2
) is used to ensure ground values for all CLP(B) variables. At the point it appears, it is only a formality in some sense: We already know that there is a solution at this point, thanks to CLP(B)'s completeness.
:- use_module(library(clpb)).
:- use_module(library(clpfd)).
c_s_mining(Features, Value):-
ExampleA = [0,1,0,1],
ExampleB = [0,1,0,1],
ExampleC = [1,0,0,1],
ExampleD = [1,0,1,0],
ExampleE = [1,0,1,0],
ExampleQ = [0,1,1,0],
same_length(Features, ExampleA),
Positives = [ExampleA,ExampleB,ExampleC],
Negatives = [ExampleD,ExampleE,ExampleQ],
[TP,FP] ins 0..3, % (in this case)
Value #= TP-FP,
labeling([max(Value)], [TP,FP]),
covers_number(Features, Positives, TP),
covers_number(Features, Negatives, FP),
labeling(Features).
covers_number(Features, Examples, Number):-
maplist(covers_(Features), Examples, Numbers),
sat(card([Number], Numbers)).
covers_([F1,F2,F3,F4], [E1,E2,E3,E4], Covered) :-
sat(Covered =:= ((F1=:=E1)*(F2=:=E2)*(F3=:=E3)*(F4=:=E4))).
这篇关于你能用 clpfd 来实现一个覆盖算法吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!