algorithm - Finding unique (as in only occurring once) element haskell -
i need function takes list , return unique element if exists or [] if doesn't. if many unique elements exists should return first 1 (without wasting time find others). additionally know elements in list come (small , known) set a. example function job ints:
unique :: ord => [a] -> [a] unique li = first $ filter ((==1).length) ((group.sort) li) first [] = [] first (x:xs) = x ghci> unique [3,5,6,8,3,9,3,5,6,9,3,5,6,9,1,5,6,8,9,5,6,8,9] ghci> [1]
this not enough because involves sorting (n log n) while done in linear time (because small). additionally requires type of list elements ord while should needed eq. nice if amount of comparisons small possible (ie if traverse list , encounter element el twice don't test subsequent elements equality el)
this why example this: counting unique elements in list doesn't solve problem - answers involve either sorting or traversing whole list find count of elements.
the question is: how correctly , efficiently in haskell ?
okay, linear time, finite domain. running time o((m + d) log d), m size of list , d size of domain, linear when d fixed. plan use elements of set keys of trie, counts values, through trie elements count 1.
import qualified data.inttrie inttrie import data.list (foldl') import control.applicative
count each of elements. traverses list once, builds trie results (o(m log d)), returns function looks result in trie (with running time o(log d)).
counts :: (enum a) => [a] -> (a -> int) counts xs = inttrie.apply (foldl' insert (pure 0) xs) . fromenum insert t x = inttrie.modify' (fromenum x) (+1) t
we use enum
constraint convert values of type a
integers in order index them in trie. enum
instance part of witness of assumption a
small, finite set (bounded
other part, see below).
and ones unique.
uniques :: (eq a, enum a) => [a] -> [a] -> [a] uniques dom xs = filter (\x -> cts x == 1) dom cts = counts xs
this function takes first parameter enumeration of entire domain. have required bounded a
constraint , used [minbound..maxbound]
instead, semantically appealing me since finite enum
+bounded
, quite inflexible since domain needs known @ compile time. choose uglier more flexible variant.
uniques
traverses domain once (lazily, head . uniques dom
traverse far needs to find first unique element -- not in list, in dom
), each element running lookup function have established o(log d), filter takes o(d log d), , building table of counts takes o(m log d). uniques
runs in o((m + d) log d), linear when d fixed. take @ least ω(m log d) information it, because has traverse whole list build table (you have way end of list see if element repeated, can't better this).
Comments
Post a Comment