Python: List vs Dict for look up table


Question

I have about 10million values that I need to put in some type of look up table, so I was wondering which would be more efficient a list or dict?

I know you can do something like this for both:

if something in dict_of_stuff:
    pass

and

if something in list_of_stuff:
    pass

My thought is the dict will be faster and more efficient.

Thanks for your help.

EDIT 1
Little more info on what I'm trying to do. Euler Problem 92. I'm making a look up table to see if a value calculated has all ready been calculated.

EDIT 2
Efficiency for look up.

EDIT 3
There are no values assosiated with the value...so would a set be better?

1
154
1/22/2015 2:36:26 AM

Accepted Answer

Speed

Lookups in lists are O(n), lookups in dictionaries are amortized O(1), with regard to the number of items in the data structure. If you don't need to associate values, use sets.

Memory

Both dictionaries and sets use hashing and they use much more memory than only for object storage. According to A.M. Kuchling in Beautiful Code, the implementation tries to keep the hash 2/3 full, so you might waste quite some memory.

If you do not add new entries on the fly (which you do, based on your updated question), it might be worthwhile to sort the list and use binary search. This is O(log n), and is likely to be slower for strings, impossible for objects which do not have a natural ordering.

202
2/4/2009 11:44:48 PM

A dict is a hash table, so it is really fast to find the keys. So between dict and list, dict would be faster. But if you don't have a value to associate, it is even better to use a set. It is a hash table, without the "table" part.


EDIT: for your new question, YES, a set would be better. Just create 2 sets, one for sequences ended in 1 and other for the sequences ended in 89. I have sucessfully solved this problem using sets.


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Icon