# Algorithm to select sets of objects while maximizing number of objects covered

by Zohaib   Last Updated January 12, 2018 16:05 PM

If we have different objects,

``````[A1, A2, A3, B1, B2, B3, B4, B5]
``````

Some calculations will be performed to find compatible objects. For example, lets assume following 3 sets were formed and every set contains compatible objects:

1. {A1, B2}
2. {A3, B2}
3. {A1, A3, B4, B5}

Now we need to perform filtering. One node can participate only once. For example, if `B2` has made a pair with `A1`, then `B2` can not participate in any other set. Which means, if we select `set 1`, then `set 2` will be deleted because `B2` has already participated. And set 3 will be deleted because A1 has already participated.

Now we need to do filtering such that we maximize number of objects being utilized. If we select `set 1`, then we are only utilizing `A1 and B2`.
Thus, optimal way would be to select `set 3` which is utilizing 4 objects.

Right now, i have a complex function which goes through list of sets recursively and keeps on adjusting until no changes are made to existing sets. This is not only inefficient, but it might not work in cases where changing sets can cause ripple effect.

I am not looking for coded solution, but just a guidance that is there any graph algorithm which I should study?

Tags :

Your problem seems like a combination of maximum coverage and set packing problems. So my guess is that algorithms for them should be helpful. A greedy heuristic would be to choose the set that contains the largest number of uncovered elements over the conflicts with other sets.Another approach would be to use the integer linear program formulation of max coverage and add a constraint where you only accept solutions forming a set packing.

ichantz
January 12, 2018 16:17 PM

I can't think of any algorithm that computes directly what you want without searching the solution space.

However, you want better performance than your complex recursive algorithm.

I think that can be had by using a non-recursive approach.

1. a Map, MatchesToMembers, taking us from Match # to Set of Member #s

Let's add to that a complementary structure for reverse lookup:

1. a Map, MembersToMatches, taking us from Member # to Match #'s, initially empty

And a data structure of possible match grouping candidates, which will be

1. a Map, Candidates, taking us from Candidate # to Boolean,
• initially all true
• sized of 2^(# Matches)

Let's populate this new data structure (#2):

``````for each Match# as Key in MatchesToMembers
for each Member# in MatchesToMembers[Match#]
Update MembersToMatches[Member#] to include Match#
next
next
``````

We now have (#2) MembersToMatches as follows:

``````A1: S1, S3
A3: S2, S3
B2: S1, S2
B4: S3
B5: S3
``````

Let's populate the Candidate data structure (#3) from the other (#2):

``````for each MatchSet as Value in MembersToMatches
if MatchSet has more than one member then
Assemble a Candidate mask from the set of Match#'s
by accumlating (1<<Match#) for each Match# in the MatchSet
Reject all Candidates at that mask
``````

And (#3) Candidates as follows

``````0 true  (the empty candidate)
1 true  (candidate having match#1 only)
2 true  (candidate having match#2 only)
3 false (candidate having match1&2)
4 true  (candidate having match#3 only)
5 false (candidate having match#1&3)
6 false (candidate having match#2&3)
7 false (candidate having all matches)
``````

Hopefully you can see that we're using a bit values of the solution space for the Key in the Candiates map. E.g. Candidate #5 (it's index/Key) in binary is 101, meaning consider the solution of taking Set 3 and Set 1 (in your 1-based set #'ering).

Now we compute the score for each non-rejected candidate, and take the max.

Rejection from candidate mask as all the elements in the set are known to they interfere with each other:

``````for each index as Key in Candidate
Candidate[index] = false
endif
next
``````

Max computation

``````for index as Key in Candidate
if Candidate[index] then
compute score for Candidate
if larger than before, then keep it
endif
next
``````

if your Candidate#'s are small (e.g. less than 32), then you can do this with a simple `for (int index = 0; index < count; index++) { }`, however, for a larger set you'd need a bit vector for the index variable (and the ability to "increment" a bit vector).

Also, these algorithms & data structures can be used with bit vectors instead of individual elements or sets or other. if we work on bit vectors we can work on chunks of 64-bits at at time reducing the iterations by a large factor.

Erik Eidt
January 13, 2018 21:02 PM