Python - verifying if one list is a subset of the other


I need to verify if a list is a subset of another - a boolean return is all I seek.
Is testing equality on the smaller list after an intersection the fastest way to do this?
Performance is of utmost importance given the amount of datasets that need to be compared.
Adding further facts based on discussions:

  1. Will either of the lists be the same for many tests?
    It does as one of them is a static lookup table
  2. Does it need to be a list?
    It does not - the static lookup table can be anything that performs best.
    The dynamic one is a dict from which we extract the keys to perform a static lookup on.

What would be the optimum solution given the scenario?

12/11/2014 9:02:47 AM

Accepted Answer

The performant function Python provides for this is set.issubset. It does have a few restrictions that make it unclear if it's the answer to your question, however.

A list may contain items multiple times and has a specific order. A set does not. To achieve high performance sets work on hashable objects only.

Are you asking about subset or subsequence (which means you'll want a string search algorithm)? Will either of the lists be the same for many tests? What are the datatypes contained in the list? And for that matter, does it need to be a list?

Your other post intersect a dict and list made the types clearer and did get a recommendation to use dictionary key views for their set-like funcitonality. In that case it was known to work because dictionary keys behave like a set (so much so that before we had sets in Python we used dictionaries). One wonders how the issue got less specific in three hours.

5/23/2017 11:47:26 AM

>>> a = [1, 3, 5]
>>> b = [1, 3, 5, 8]
>>> c = [3, 5, 9]
>>> set(a) <= set(b)
>>> set(c) <= set(b)

>>> a = ['yes', 'no', 'hmm']
>>> b = ['yes', 'no', 'hmm', 'well']
>>> c = ['sorry', 'no', 'hmm']
>>> set(a) <= set(b)
>>> set(c) <= set(b)

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow