I have two lists in Python, like these:
temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two']
I need to create a third list with items from the first list which aren't present in the second one. From the example I have to get:
temp3 = ['Three', 'Four']
Is there a fast way to do this without explicit loops and membership checks?
In : list(set(temp1) - set(temp2))
Out: ['Four', 'Three']
Beware that
In : set([1, 2]) - set([2, 3])
Out: set([1])
where you might expect/want it to equal set([1, 3]). If you do want set([1, 3]) as your answer, you'll need to use set([1, 2]).symmetric_difference(set([2, 3])).
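To make the distinction concrete, here is a minimal sketch of both operations side by side. It uses Python 3 set literals rather than the `set([...])` spelling above, and the variable names are only illustrative.

```python
a = {1, 2}
b = {2, 3}

only_in_a = a - b        # plain difference: elements of a that are not in b
only_in_b = b - a        # the difference is not symmetric
in_exactly_one = a ^ b   # same result as a.symmetric_difference(b)

print(only_in_a)       # {1}
print(only_in_b)       # {3}
print(in_exactly_one)  # {1, 3}
```

For the original question (items of the first list that are missing from the second), the plain difference is the operation that matches.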
The existing solutions all offer one or the other of two properties: performance better than O(n*m), or preservation of the input list's order.
But so far no solution has both. If you want both, try this:
s = set(temp2)
temp3 = [x for x in temp1 if x not in s]
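The two lines above can be wrapped in a small helper. This is only a sketch: the function name `ordered_difference` is hypothetical, and it assumes every item in the second list is hashable.

```python
def ordered_difference(first, second):
    """Return the items of `first` that are not in `second`.

    Order and duplicates from `first` are kept.
    """
    excluded = set(second)  # built once; lookups are O(1) on average
    return [item for item in first if item not in excluded]


temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two']
temp3 = ordered_difference(temp1, temp2)
print(temp3)  # ['Three', 'Four']
```

Unlike `list(set(temp1) - set(temp2))`, this keeps repeated items from the first list and returns them in their original order.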
import timeit
init = 'temp1 = list(range(100)); temp2 = [i * 2 for i in range(50)]'
print(timeit.timeit('list(set(temp1) - set(temp2))', init, number=100000))
print(timeit.timeit('s = set(temp2);[x for x in temp1 if x not in s]', init, number=100000))
print(timeit.timeit('[item for item in temp1 if item not in temp2]', init, number=100000))
4.34620224079 # ars' answer
4.2770634955 # This answer
30.7715615392 # matt b's answer
As well as preserving order, the method I presented is also (slightly) faster than the set subtraction, because it doesn't construct an unnecessary set. The performance difference is more noticeable when the first list is considerably longer than the second and hashing is expensive. Here's a second test demonstrating this:
init = '''
temp1 = [str(i) for i in range(100000)]
temp2 = [str(i * 2) for i in range(50)]
'''
11.3836875916 # ars' answer
3.63890368748 # this answer (3 times faster!)
37.7445402279 # matt b's answer
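To rerun the comparison on your own machine, a minimal Python 3 harness might look like the following. The list sizes and repeat count here are assumptions, chosen smaller than in the tests above so the script finishes quickly. The labels are descriptive only, and absolute timings will differ from the figures quoted.

```python
import timeit

# Assumed, scaled-down inputs: a long first list of strings, a short second one.
setup = '''
temp1 = [str(i) for i in range(1000)]
temp2 = [str(i * 2) for i in range(50)]
'''

candidates = {
    'set subtraction': 'list(set(temp1) - set(temp2))',
    'set lookup, order kept': 's = set(temp2); [x for x in temp1 if x not in s]',
    'list lookup': '[x for x in temp1 if x not in temp2]',
}

timings = {}
for label, stmt in candidates.items():
    # Total seconds for 100 runs of each statement.
    timings[label] = timeit.timeit(stmt, setup, number=100)
    print('{:<24} {:.4f} s'.format(label, timings[label]))
```

Only the relative ordering of the three timings is meaningful, and it depends on the list sizes and on how expensive the items are to hash.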