How to separate words in a "sentence" with spaces?

Problem description

Looking to automate creating Domains in JasperServer. Domains are a "view" of data for creating ad hoc reports. The names of the columns must be presented to the user in a human readable fashion.

There are over 2,000 possible pieces of data from which the organization could theoretically want to include on a report. The data are sourced from non-human-friendly names such as:

payperiodmatchcode
labordistributioncodedesc
dependentrelationship
actionendoption
actionendoptiondesc
addresstype
addresstypedesc
historytype
psaddresstype
rolename
bankaccountstatus
bankaccountstatusdesc
bankaccounttype
bankaccounttypedesc
beneficiaryamount
beneficiaryclass
beneficiarypercent
benefitsubclass
beneficiaryclass
beneficiaryclassdesc
benefitactioncode
benefitactioncodedesc
benefitagecontrol
benefitagecontroldesc
ageconrolagelimit
ageconrolnoticeperiod

Question

How would you automatically change such names to:


  • pay period match code

  • labor distribution code desc

  • dependent relationship


  • Use Google's Did you mean engine, however I think it violates their TOS:

lynx -dump «url» | grep "Did you mean" | awk ...

Any language is fine, but text parsers such as Perl would probably be well-suited. (The column names are English-only.)

The goal is not 100% perfection in breaking words apart; the following outcome is acceptable:


  • enrollmenteffectivedate -> Enrollment Effective Date

  • enrollmentenddate -> Enroll Men Tend Date

  • enrollmentrequirementset -> Enrollment Requirement Set

No matter what, a human will need to double-check the results and correct many. Whittling a set of 2,000 results down to 600 edits would be a dramatic time savings. To fixate on some cases having multiple possibilities (e.g., therapistname) is to miss the point altogether.

Recommended answer

Sometimes, brute-forcing is acceptable:

#!/usr/bin/perl

use strict; use warnings;
use File::Slurp;

my $dict_file = '/usr/share/dict/words';

my @identifiers = qw(
    payperiodmatchcode labordistributioncodedesc dependentrelationship
    actionendoption actionendoptiondesc addresstype addresstypedesc
    historytype psaddresstype rolename bankaccountstatus
    bankaccountstatusdesc bankaccounttype bankaccounttypedesc
    beneficiaryamount beneficiaryclass beneficiarypercent benefitsubclass
    beneficiaryclass beneficiaryclassdesc benefitactioncode
    benefitactioncodedesc benefitagecontrol benefitagecontroldesc
    ageconrolagelimit ageconrolnoticeperiod
);

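# Extra abbreviations that are not in the system dictionary.
my @mydict = qw( desc );

# Build one alternation of all words longer than two characters,
# longest first so that longer words are tried before shorter ones.
my $pat = join('|',
    map quotemeta,
    sort { length $b <=> length $a || $a cmp $b }
    grep { 2 < length }
    (@mydict, map { chomp; $_ } read_file $dict_file)
);

my $re = qr/$pat/;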

for my $identifier ( @identifiers ) {
    my @stack;
    print "$identifier : ";
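    # Strip one dictionary word from the end of the identifier per pass.
    while ( $identifier =~ s/($re)\z// ) {
        unshift @stack, $1;
    }
    # mark suspicious cases: whatever is left matched no dictionary word
    unshift @stack, '*', $identifier if length $identifier;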
    print "@stack\n";
}

Output:

payperiodmatchcode : pay period match code
labordistributioncodedesc : labor distribution code desc
dependentrelationship : dependent relationship
actionendoption : action end option
actionendoptiondesc : action end option desc
addresstype : address type
addresstypedesc : address type desc
historytype : history type
psaddresstype : * ps address type
rolename : role name
bankaccountstatus : bank account status
bankaccountstatusdesc : bank account status desc
bankaccounttype : bank account type
bankaccounttypedesc : bank account type desc
beneficiaryamount : beneficiary amount
beneficiaryclass : beneficiary class
beneficiarypercent : beneficiary percent
benefitsubclass : benefit subclass
beneficiaryclass : beneficiary class
beneficiaryclassdesc : beneficiary class desc
benefitactioncode : benefit action code
benefitactioncodedesc : benefit action code desc
benefitagecontrol : benefit age control
benefitagecontroldesc : benefit age control desc
ageconrolagelimit : * ageconrol age limit
ageconrolnoticeperiod : * ageconrol notice period
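
For readers who do not use Perl, the same idea can be sketched in Python. This is only an illustration of the suffix-stripping approach above, not the answer's code: the function name `split_from_end` and the tiny `WORDS` set are assumptions made for the example, whereas the answer reads `/usr/share/dict/words`.

```python
# Hypothetical mini word list; the answer loads a system dictionary instead.
WORDS = {"pay", "period", "match", "code", "desc", "address", "type",
         "age", "limit", "notice"}


def split_from_end(identifier, words=WORDS):
    """Repeatedly strip the longest known word from the end of identifier.

    Returns (parts, leftover). A non-empty leftover marks a suspicious
    case, like the '*' entries in the output above.
    """
    parts = []
    rest = identifier
    while rest:
        # start == 0 is the whole remaining string, so longer suffixes
        # are tried before shorter ones.
        for start in range(len(rest)):
            if rest[start:] in words:
                parts.insert(0, rest[start:])
                rest = rest[:start]
                break
        else:
            break  # no known word ends here; leave the remainder as-is
    return parts, rest


for name in ("payperiodmatchcode", "psaddresstype", "ageconrolagelimit"):
    parts, leftover = split_from_end(name)
    flag = f"* {leftover} " if leftover else ""
    print(f"{name} : {flag}{' '.join(parts)}")
```

As in the Perl version, a misspelled prefix such as `ageconrol` is left unsplit and flagged for a human to fix.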

See also: A Spellchecker Used to Be a Major Feat of Software Engineering.
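
Stripping the longest suffix first can produce splits like "enroll men tend date", because "tend" is a longer suffix than "end". One possible refinement, which is not part of the answer above, is to choose the split with the fewest words. The sketch below uses dynamic programming; the function name `fewest_words` and the small word list are assumptions made for illustration only.

```python
def fewest_words(identifier, words):
    """Split identifier into the smallest number of known words, or None."""
    n = len(identifier)
    # best[i] holds the shortest word list covering identifier[:i], if any.
    best = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            piece = identifier[start:end]
            if best[start] is not None and piece in words:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[n]


# Hypothetical word list, chosen to reproduce the ambiguity in the question.
WORDS = {"enroll", "enrollment", "men", "end", "tend", "date"}
print(fewest_words("enrollmentenddate", WORDS))
```

With this word list the three-word split "enrollment end date" wins over the four-word "enroll men tend date". With a full dictionary the fewest-words rule will still guess wrong on some names, so a human review pass remains necessary.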