Python JSON parser: allow duplicate keys
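I need to parse a JSON file which, unfortunately for me, does not follow the prototype. I have two issues with the data; for one of them I've already found a workaround, so I'll mention it at the end, and maybe you can help there as well.
So I need to parse entries like this: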
"test":{ "entry":{ "type":"something" }, "entry":{ "type":"something_else" } }, ...
The default JSON parser updates the dictionary and therefore keeps only the last entry. I have to somehow store the other one as well, and I have no idea how to do this. I also have to store the keys of the several dictionaries in the same order they appear in the file, which is why I am using OrderedDict to do so. That works fine, so if there is a way to expand it to handle the duplicate entries I'd be grateful.
My second issue is that this same JSON file contains entries like this:
"test":{ { "type":"something" } }
The json.load() function raises an exception when it reaches that line in the JSON file. The only way I worked around this was to manually remove the inner brackets myself.
Thanks in advance.
You can use JSONDecoder.object_pairs_hook to customize how JSONDecoder decodes objects. The hook function is passed a list of (key, value) pairs, which you would usually do some processing on and then turn into a dict.
However, since Python dictionaries don't allow duplicate keys (and you can't change that), you can return the pairs unchanged in the hook and get a nested list of (key, value) pairs when you decode your JSON:
from json import JSONDecoder

def parse_object_pairs(pairs):
    # Return the (key, value) pairs unchanged instead of building a dict.
    return pairs

data = """
{"foo": {"baz": 42}, "foo": 7}
"""

decoder = JSONDecoder(object_pairs_hook=parse_object_pairs)
obj = decoder.decode(data)
print(obj)
Output (under Python 2, hence the u'' prefixes):
[(u'foo', [(u'baz', 42)]), (u'foo', 7)]
How you use this data structure is up to you. As stated above, Python dictionaries won't allow duplicate keys, and there's no way around that. How would you even do a lookup based on a key? dct[key] would be ambiguous.
So you can either implement your own logic to handle a lookup the way you expect it to work, or implement some sort of collision avoidance to make the keys unique if they're not, and then create a dictionary from your nested list.
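As a rough sketch of the first option (your own lookup logic), one possibility is a hook that collects every value of a repeated key into a list. The name group_duplicates is made up for this illustration and is not part of the json module; the sample data is the "test"/"entry" snippet from the question, and Python 3 is assumed.

```python
import json
from collections import OrderedDict


def group_duplicates(pairs):
    """Hypothetical hook: keep every value of a repeated key in a list."""
    grouped = OrderedDict()
    for key, value in pairs:
        # Each key maps to a list of all its values, in document order.
        grouped.setdefault(key, []).append(value)
    return grouped


data = '{"test": {"entry": {"type": "something"}, "entry": {"type": "something_else"}}}'
obj = json.loads(data, object_pairs_hook=group_duplicates)

# Every lookup returns a list, so a duplicated key is no longer ambiguous.
entry_types = [entry["type"][0] for entry in obj["test"][0]["entry"]]
print(entry_types)
```

The trade-off is that every value ends up wrapped in a list, even for keys that occur only once, so all lookups need an index.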
Edit: Since you said you would like to modify the duplicate key to make it unique, here's how you'd do that:
from collections import OrderedDict
from json import JSONDecoder

def make_unique(key, dct):
    # Append _1, _2, ... until the key no longer collides with one in dct.
    counter = 0
    unique_key = key
    while unique_key in dct:
        counter += 1
        unique_key = '{}_{}'.format(key, counter)
    return unique_key

def parse_object_pairs(pairs):
    # Build an ordered dict, renaming any key that was already seen.
    dct = OrderedDict()
    for key, value in pairs:
        if key in dct:
            key = make_unique(key, dct)
        dct[key] = value
    return dct

data = """
{"foo": {"baz": 42, "baz": 77}, "foo": 7, "foo": 23}
"""

decoder = JSONDecoder(object_pairs_hook=parse_object_pairs)
obj = decoder.decode(data)
print(obj)
Output (under Python 2, hence the u'' prefixes):
OrderedDict([(u'foo', OrderedDict([(u'baz', 42), ('baz_1', 77)])), ('foo_1', 7), ('foo_2', 23)])
The make_unique function is responsible for returning a collision-free key. In this example it just suffixes the key with _n, where n is an incremental counter - adapt it to your needs.
Because object_pairs_hook receives the pairs exactly in the order they appear in the JSON document, it's also possible to preserve that order by using OrderedDict, so I included that as well.
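Since the question reads from a file with json.load(), note that json.load() and json.loads() also accept object_pairs_hook directly, so building a JSONDecoder by hand is optional. Below is a minimal sketch assuming Python 3; io.StringIO stands in for an open file, and the sample contents are invented for illustration.

```python
import io
import json
from collections import OrderedDict


def parse_object_pairs(pairs):
    """Order-preserving hook that renames repeated keys to key_1, key_2, ..."""
    dct = OrderedDict()
    for key, value in pairs:
        unique_key, counter = key, 0
        while unique_key in dct:
            counter += 1
            unique_key = "{}_{}".format(key, counter)
        dct[unique_key] = value
    return dct


# io.StringIO stands in for an open file; the contents are made up.
fake_file = io.StringIO('{"b": 1, "a": 2, "b": 3}')
obj = json.load(fake_file, object_pairs_hook=parse_object_pairs)

# Keys come back in document order, with the duplicate renamed.
keys_in_order = list(obj.keys())
print(keys_in_order)
```

With a real file you would pass the object returned by open() in place of fake_file.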