Efficiently finding elements in a Python association list

Question

I have a set of lists that look like this:

conditions = [
["condition1", ["sample1", "sample2", "sample3"]],
["condition2", ["sample4", "sample5", "sample6"]],
...]

How can I do the following things efficiently and elegantly in Python?

Find all the elements in a certain condition?

E.g. get all the samples in condition2. Right now I can do:

def get_samples(requested_cond, conditions):
  for cond in conditions:
    cond_name, samples = cond
    if cond_name == requested_cond:
      return samples

But that's clunky.

Find the ordered union of a list of conditions? E.g. ordered_union(["condition1", "condition2"], conditions) should return:

["sample1", "sample2", "sample3", "sample4", "sample5", "sample6"]

How can I do this efficiently in Python? Is there perhaps a clever one-liner?

Recommended answer

Ah well, if you're forced to keep that clunky data structure, you can't expect much. The one-liner equivalent of your first solution is going to be something like:

def samplesof(requested_cond, conditions):
    return next(s for c, s in conditions if c==requested_cond)

and for the second one, if you insist on one-liners, it's going to be something like:

def ordered_union(the_conds, conditions):
    return [s for c in the_conds for s in samplesof(c, conditions)]
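To see the two one-liners in action, here is a small sketch run against the sample data from the question. One caveat: samplesof raises StopIteration when no condition matches, because next() is called without a default. The "condition9" name and the empty-list fallback below are assumptions for illustration, not part of the original answer.

```python
def samplesof(requested_cond, conditions):
    # Scan the nested list and return the samples of the first match.
    return next(s for c, s in conditions if c == requested_cond)

def ordered_union(the_conds, conditions):
    # Concatenate the samples of each requested condition, in request order.
    return [s for c in the_conds for s in samplesof(c, conditions)]

conditions = [
    ["condition1", ["sample1", "sample2", "sample3"]],
    ["condition2", ["sample4", "sample5", "sample6"]],
]

second = samplesof("condition2", conditions)
union = ordered_union(["condition1", "condition2"], conditions)

# Assumed variant: next() takes an optional second argument that is
# returned instead of raising StopIteration when nothing matches.
fallback = next((s for c, s in conditions if c == "condition9"), [])
```

Note that every samplesof call rescans conditions from the start, so ordered_union does one linear scan per requested condition.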

There are faster ways to solve the second problem, but they're all multi-line, e.g.:

def ordered_union(the_conds, conditions):
    # Only keep the conditions that were actually requested.
    aux_set = set(the_conds)
    samples_by_cond = dict((c, s) for c, s in conditions if c in aux_set)
    return [s for c in the_conds for s in samples_by_cond[c]]

Note that this latter approach is faster because it uses the right data structures (a set and a dict). Unfortunately it has to build them itself, because the incoming conditions nested list is really the wrong data structure.
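As a rough illustration of the "right data structure" point: each inner item is a two-element [name, samples] list, so dict() can consume the nested list directly. This is only a sketch, and it assumes that condition names are unique and that you are free to build the dict once up front. The ordered_union_from_dict name is hypothetical.

```python
conditions = [
    ["condition1", ["sample1", "sample2", "sample3"]],
    ["condition2", ["sample4", "sample5", "sample6"]],
]

# Assumption: names are unique; with duplicates, the last entry would win.
samples_by_cond = dict(conditions)

def ordered_union_from_dict(the_conds, samples_by_cond):
    # Each lookup is a hash lookup rather than a scan of the nested list.
    return [s for c in the_conds for s in samples_by_cond[c]]

one = samples_by_cond["condition2"]
union = ordered_union_from_dict(["condition2", "condition1"], samples_by_cond)
```

The result follows the order of the requested names, not the order in which the conditions were originally listed.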

Couldn't you encapsulate conditions as a member variable of a class that builds the crucial (right, fast) auxiliary data structures just once? E.g.:

class Sensible(object):
  def __init__(self, conditions):
    self.seq = []
    self.dic = {}
    for c, s in conditions:
      self.seq.append(c)
      self.dic[c] = s
  def samplesof(self, requested_condition):
    return self.dic[requested_condition]
  def ordered_union(self, the_conds):
    return [s for c in the_conds for s in self.dic[c]]

Now that's fast and elegant!
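A short usage sketch of the class, assuming the sample data from the question. The class body is repeated only so the snippet runs on its own, and the variable names below the class are illustrative.

```python
class Sensible(object):
    def __init__(self, conditions):
        self.seq = []   # condition names, in their original order
        self.dic = {}   # condition name -> list of samples
        for c, s in conditions:
            self.seq.append(c)
            self.dic[c] = s

    def samplesof(self, requested_condition):
        return self.dic[requested_condition]

    def ordered_union(self, the_conds):
        return [s for c in the_conds for s in self.dic[c]]

conditions = [
    ["condition1", ["sample1", "sample2", "sample3"]],
    ["condition2", ["sample4", "sample5", "sample6"]],
]

# The auxiliary structures are built once, then reused for every query.
sensible = Sensible(conditions)
second = sensible.samplesof("condition2")
union = sensible.ordered_union(["condition1", "condition2"])
```

Unlike the one-liner, samplesof here raises KeyError for an unknown condition name.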

I'm assuming that you need self.seq (the sequence of conditions) for something else (it's certainly not needed for the two operations you mention!), and that there are no repetitions in that sequence or in the samples. Whatever your actual specs are, they won't be hard to accommodate, but blindly trying to guess them when you mention nothing about them would be very hard and pointless ;-).
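If repetitions do turn out to be possible, one way to accommodate them is to drop duplicates while keeping first-seen order. This is only a sketch of one guess at the spec: it assumes each sample should appear once in the union, and that the code runs on Python 3.7 or later, where dicts preserve insertion order. The ordered_union_unique name and the overlapping sample data are hypothetical.

```python
def ordered_union_unique(the_conds, samples_by_cond):
    # dict.fromkeys keeps the first occurrence of each sample and,
    # on Python 3.7+, preserves insertion order.
    all_samples = (s for c in the_conds for s in samples_by_cond[c])
    return list(dict.fromkeys(all_samples))

# Hypothetical data in which "sample2" belongs to both conditions.
samples_by_cond = {
    "condition1": ["sample1", "sample2"],
    "condition2": ["sample2", "sample3"],
}

result = ordered_union_unique(
    ["condition1", "condition2", "condition1"], samples_by_cond
)
```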
