Abrufen eines Power-Sets eines Sets in Java

1. Einleitung

In diesem Tutorial untersuchen wir den Prozess des Generierens eines Potenzsatzes eines bestimmten Satzes in Java.

Zur schnellen Erinnerung gibt es für jeden Satz der Größe n einen Leistungssatz der Größe 2n . Wir werden lernen, wie man es mit verschiedenen Techniken bekommt.

2. Definition eines Power Sets

Die Potenzmenge einer gegebenen Menge S ist die Menge aller Teilmengen von S , einschließlich S selbst und der leeren Menge.

Zum Beispiel für einen bestimmten Satz:

{"APPLE", "ORANGE", "MANGO"}

Das eingestellte Netz ist:

{ {}, {"APPLE"}, {"ORANGE"}, {"APPLE", "ORANGE"}, {"MANGO"}, {"APPLE", "MANGO"}, {"ORANGE", "MANGO"}, {"APPLE", "ORANGE", "MANGO"} }

Da es sich auch um eine Reihe von Teilmengen handelt, ist die Reihenfolge der internen Teilmengen nicht wichtig und sie können in beliebiger Reihenfolge angezeigt werden:

{ {}, {"MANGO"}, {"ORANGE"}, {"ORANGE", "MANGO"}, {"APPLE"}, {"APPLE", "MANGO"}, {"APPLE", "ORANGE"}, {"APPLE", "ORANGE", "MANGO"} }

3. Guavenbibliothek

Die Google Guava-Bibliothek verfügt über einige nützliche Set- Dienstprogramme, z. B. das Power Set. Somit können wir es leicht verwenden, um auch die Potenzmenge der gegebenen Menge zu erhalten:

@Test public void givenSet_WhenGuavaLibraryGeneratePowerSet_ThenItContainsAllSubsets() { ImmutableSet set = ImmutableSet.of("APPLE", "ORANGE", "MANGO"); Set
    
      powerSet = Sets.powerSet(set); Assertions.assertEquals((1 << set.size()), powerSet.size()); MatcherAssert.assertThat(powerSet, Matchers.containsInAnyOrder( ImmutableSet.of(), ImmutableSet.of("APPLE"), ImmutableSet.of("ORANGE"), ImmutableSet.of("APPLE", "ORANGE"), ImmutableSet.of("MANGO"), ImmutableSet.of("APPLE", "MANGO"), ImmutableSet.of("ORANGE", "MANGO"), ImmutableSet.of("APPLE", "ORANGE", "MANGO") )); }
    

Guava powerSet arbeitet intern über die Iterator- Schnittstelle so, wie wenn die nächste Teilmenge angefordert wird, die Teilmenge berechnet und zurückgegeben wird. Die Raumkomplexität wird also auf O (n) anstelle von O (2n) reduziert .

Aber wie erreicht Guave dies?

4. Ansatz zur Stromerzeugungsgenerierung

4.1. Algorithmus

Lassen Sie uns nun die möglichen Schritte zum Erstellen eines Algorithmus für diese Operation diskutieren.

Die Potenzmenge einer leeren Menge ist {{}}, in der sie nur eine leere Menge enthält. Das ist also unser einfachster Fall.

Für jede andere Menge S als die leere Menge extrahieren wir zuerst ein Element und nennen es - Element . Dann wird für den Rest der Elemente einer Menge subsetWithoutElement berechnen wir ihre Potenzmenge rekursiv - und nennen Sie es so etwas wie Powerset S ubsetWithoutElement . Dann wird durch das Hinzufügen von extrahierter Elemente zu allen Sätzen in Powerset S ubsetWithoutElement , wir bekommen Powerset S ubsetWithElement.

Die Potenzmenge S ist nun die Vereinigung eines powerSetSubsetWithoutElement und eines powerSetSubsetWithElement :

Sehen wir uns ein Beispiel für den rekursiven Potenzsatzstapel für den angegebenen Satz {"APPLE", "ORANGE", "MANGO"} an .

Um die Lesbarkeit des Bildes zu verbessern, verwenden wir Kurzformen von Namen: P bedeutet Potenzsatzfunktion und "A", "O", "M" sind Kurzformen von "APPLE", "ORANGE" bzw. "MANGO" :

4.2. Implementierung

Schreiben wir also zuerst den Java-Code zum Extrahieren eines Elements und erhalten die verbleibenden Teilmengen:

T element = set.iterator().next(); Set subsetWithoutElement = new HashSet(); for (T s : set) { if (!s.equals(element)) { subsetWithoutElement.add(s); } }

Wir wollen dann das Powerset von subsetWithoutElement erhalten :

Set
    
      powersetSubSetWithoutElement = recursivePowerSet(subsetWithoutElement);
    

Als nächstes müssen wir dieses Powerset wieder in das Original einfügen:

Set
    
      powersetSubSetWithElement = new HashSet(); for (Set subsetWithoutElement : powerSetSubSetWithoutElement) { Set subsetWithElement = new HashSet(subsetWithoutElement); subsetWithElement.add(element); powerSetSubSetWithElement.add(subsetWithElement); }
    

Schließlich ist die Vereinigung von powerSetSubSetWithoutElement und powerSetSubSetWithElement der Potenzsatz des angegebenen Eingabesatzes:

Set
    
      powerSet = new HashSet(); powerSet.addAll(powerSetSubSetWithoutElement); powerSet.addAll(powerSetSubSetWithElement);
    

Wenn wir alle unsere Codefragmente zusammenfügen, können wir unser Endprodukt sehen:

public Set
    
      recursivePowerSet(Set set) { if (set.isEmpty()) { Set
     
       ret = new HashSet(); ret.add(set); return ret; } T element = set.iterator().next(); Set subSetWithoutElement = getSubSetWithoutElement(set, element); Set
      
        powerSetSubSetWithoutElement = recursivePowerSet(subSetWithoutElement); Set
       
         powerSetSubSetWithElement = addElementToAll(powerSetSubSetWithoutElement, element); Set
        
          powerSet = new HashSet(); powerSet.addAll(powerSetSubSetWithoutElement); powerSet.addAll(powerSetSubSetWithElement); return powerSet; } 
        
       
      
     
    

4.3. Hinweise für Unit-Tests

Jetzt lass uns testen. Wir haben hier einige Kriterien, um dies zu bestätigen:

  • Zuerst überprüfen wir die Größe des Leistungssatzes und er muss für einen Satz mit der Größe n 2n betragen .
  • Dann kommt jedes Element nur einmal in einer Teilmenge und 2n-1 verschiedenen Teilmengen vor.
  • Schließlich muss jede Teilmenge einmal erscheinen.

If all these conditions passed, we can be sure that our function works. Now, since we've used Set, we already know that there's no repetition. In that case, we only need to check the size of the power set, and the number of occurrences of each element in the subsets.

To check the size of the power set we can use:

MatcherAssert.assertThat(powerSet, IsCollectionWithSize.hasSize((1 << set.size())));

And to check the number of occurrences of each element:

Map counter = new HashMap(); for (Set subset : powerSet) { for (String name : subset) { int num = counter.getOrDefault(name, 0); counter.put(name, num + 1); } } counter.forEach((k, v) -> Assertions.assertEquals((1 << (set.size() - 1)), v.intValue()));

Finally, if we can put all together into one unit test:

@Test public void givenSet_WhenPowerSetIsCalculated_ThenItContainsAllSubsets() { Set set = RandomSetOfStringGenerator.generateRandomSet(); Set
    
      powerSet = new PowerSet().recursivePowerSet(set); MatcherAssert.assertThat(powerSet, IsCollectionWithSize.hasSize((1 << set.size()))); Map counter = new HashMap(); for (Set subset : powerSet) { for (String name : subset) { int num = counter.getOrDefault(name, 0); counter.put(name, num + 1); } } counter.forEach((k, v) -> Assertions.assertEquals((1 << (set.size() - 1)), v.intValue())); }
    

5. Optimization

In this section, we'll try to minimize the space and reduce the number of internal operations to calculate the power set in an optimal way.

5.1. Data Structure

As we can see in the given approach, we need a lot of subtractions in the recursive call, which consumes a large amount of time and memory.

Instead, we can map each set or subset to some other notions to reduce the number of operations.

First, we need to assign an increasing number starting from 0 to each object in the given set S which means we work with an ordered list of numbers.

For example for the given set {“APPLE”, “ORANGE”, “MANGO”} we get:

“APPLE” -> 0

“ORANGE” ->

“MANGO” -> 2

So, from now on, instead of generating subsets of S, we generate them for the ordered list of [0, 1, 2], and as it is ordered we can simulate subtractions by a starting index.

For example, if the starting index is 1 it means that we generate the power set of [1,2].

To retrieve mapped id from the object and vice-versa, we store both sides of mapping. Using our example, we store both (“MANGO” -> 2) and (2 -> “MANGO”). As the mapping of numbers started from zero, so for the reverse map there we can use a simple array to retrieve the respective object.

One of the possible implementations of this function would be:

private Map map = new HashMap(); private List reverseMap = new ArrayList(); private void initializeMap(Collection collection) { int mapId = 0; for (T c : collection) { map.put(c, mapId++); reverseMap.add(c); } }

Now, to represent subsets there are two well-known ideas:

  1. Index representation
  2. Binary representation

5.2. Index Representation

Each subset is represented by the index of its values. For example, the index mapping of the given set {“APPLE”, “ORANGE”, “MANGO”} would be:

{ {} -> {} [0] -> {"APPLE"} [1] -> {"ORANGE"} [0,1] -> {"APPLE", "ORANGE"} [2] -> {"MANGO"} [0,2] -> {"APPLE", "MANGO"} [1,2] -> {"ORANGE", "MANGO"} [0,1,2] -> {"APPLE", "ORANGE", "MANGO"} }

So, we can retrieve the respective set from a subset of indices with the given mapping:

private Set
    
      unMapIndex(Set
     
       sets) { Set
      
        ret = new HashSet(); for (Set s : sets) { HashSet subset = new HashSet(); for (Integer i : s) { subset.add(reverseMap.get(i)); } ret.add(subset); } return ret; }
      
     
    

5.3. Binary Representation

Or, we can represent each subset using binary. If an element of the actual set exists in this subset its respective value is 1; otherwise it is 0.

For our fruit example, the power set would be:

{ [0,0,0] -> {} [1,0,0] -> {"APPLE"} [0,1,0] -> {"ORANGE"} [1,1,0] -> {"APPLE", "ORANGE"} [0,0,1] -> {"MANGO"} [1,0,1] -> {"APPLE", "MANGO"} [0,1,1] -> {"ORANGE", "MANGO"} [1,1,1] -> {"APPLE", "ORANGE", "MANGO"} }

So, we can retrieve the respective set from a binary subset with the given mapping:

private Set
    
      unMapBinary(Collection
     
       sets) { Set
      
        ret = new HashSet(); for (List s : sets) { HashSet subset = new HashSet(); for (int i = 0; i < s.size(); i++) { if (s.get(i)) { subset.add(reverseMap.get(i)); } } ret.add(subset); } return ret; }
      
     
    

5.4. Recursive Algorithm Implementation

In this step, we'll try to implement the previous code using both data structures.

Before calling one of these functions, we need to call the initializeMap method to get the ordered list. Also, after creating our data structure, we need to call the respective unMap function to retrieve the actual objects:

public Set
    
      recursivePowerSetIndexRepresentation(Collection set) { initializeMap(set); Set
     
       powerSetIndices = recursivePowerSetIndexRepresentation(0, set.size()); return unMapIndex(powerSetIndices); }
     
    

So, let's try our hand at the index representation:

private Set
    
      recursivePowerSetIndexRepresentation(int idx, int n) { if (idx == n) { Set
     
       empty = new HashSet(); empty.add(new HashSet()); return empty; } Set
      
        powerSetSubset = recursivePowerSetIndexRepresentation(idx + 1, n); Set
       
         powerSet = new HashSet(powerSetSubset); for (Set s : powerSetSubset) { HashSet subSetIdxInclusive = new HashSet(s); subSetIdxInclusive.add(idx); powerSet.add(subSetIdxInclusive); } return powerSet; }
       
      
     
    

Now, let's see the binary approach:

private Set
    
      recursivePowerSetBinaryRepresentation(int idx, int n) { if (idx == n) { Set
     
       powerSetOfEmptySet = new HashSet(); powerSetOfEmptySet.add(Arrays.asList(new Boolean[n])); return powerSetOfEmptySet; } Set
      
        powerSetSubset = recursivePowerSetBinaryRepresentation(idx + 1, n); Set
       
         powerSet = new HashSet(); for (List s : powerSetSubset) { List subSetIdxExclusive = new ArrayList(s); subSetIdxExclusive.set(idx, false); powerSet.add(subSetIdxExclusive); List subSetIdxInclusive = new ArrayList(s); subSetIdxInclusive.set(idx, true); powerSet.add(subSetIdxInclusive); } return powerSet; }
       
      
     
    

5.5. Iterate Through [0, 2n)

Now, there is a nice optimization we can do with the binary representation. If we look at it, we can see that each row is equivalent to the binary format of a number in [0, 2n).

So, if we iterate through numbers from 0 to 2n, we can convert that index to binary, and use it to create a boolean representation of each subset:

private List
    
      iterativePowerSetByLoopOverNumbers(int n) { List
     
       powerSet = new ArrayList(); for (int i = 0; i < (1 << n); i++) { List subset = new ArrayList(n); for (int j = 0; j < n; j++) subset.add(((1 < 0); powerSet.add(subset); } return powerSet; }
     
    

5.6. Minimal Change Subsets by Gray Code

Now, if we define any bijective function from binary representation of length n to a number in [0, 2n), we can generate subsets in any order that we want.

Gray Code is a well-known function that is used to generate binary representations of numbers so that the binary representation of consecutive numbers differ by only one bit (even the difference of the last and first numbers is one).

We can thus optimize this just a bit further:

private List
    
      iterativePowerSetByLoopOverNumbersWithGrayCodeOrder(int n) { List
     
       powerSet = new ArrayList(); for (int i = 0; i < (1 << n); i++) { List subset = new ArrayList(n); for (int j = 0; j 
      
       > 1); subset.add(((1 < 0); } powerSet.add(subset); } return powerSet; }
      
     
    

6. Lazy Loading

To minimize the space usage of power set, which is O(2n), we can utilize the Iterator interface to fetch every subset, and also every element in each subset lazily.

6.1. ListIterator

First, to be able to iterate from 0 to 2n, we should have a special Iterator that loops over this range but not consuming the whole range beforehand.

To solve this problem, we'll use two variables; one for the size, which is 2n, and another for the current subset index. Our hasNext() function will check that position is less than size:

abstract class ListIterator implements Iterator { protected int position = 0; private int size; public ListIterator(int size) { this.size = size; } @Override public boolean hasNext() { return position < size; } }

And our next() function returns the subset for the current position and increases the value of position by one:

@Override public Set next() { return new Subset(map, reverseMap, position++); }

6.2. Subset

To have a lazy load Subset, we define a class that extends AbstractSet, and we override some of its functions.

By looping over all bits that are 1 in the receiving mask (or position) of the Subset, we can implement the Iterator and other methods in AbstractSet.

For example, the size() is the number of 1s in the receiving mask:

@Override public int size() { return Integer.bitCount(mask); }

And the contains() function is just whether the respective bit in the mask is 1 or not:

@Override public boolean contains(@Nullable Object o) { Integer index = map.get(o); return index != null && (mask & (1 << index)) != 0; }

We use another variable – remainingSetBits – to modify it whenever we retrieve its respective element in the subset we change that bit to 0. Then, the hasNext() checks if remainingSetBits is not zero (that is, it has at least one bit with a value of 1):

@Override public boolean hasNext() { return remainingSetBits != 0; }

And the next() function uses the right-most 1 in the remainingSetBits, then converts it to 0, and also returns the respective element:

@Override public E next() { int index = Integer.numberOfTrailingZeros(remainingSetBits); if (index == 32) { throw new NoSuchElementException(); } remainingSetBits &= ~(1 << index); return reverseMap.get(index); }

6.3. PowerSet

To have a lazy-load PowerSet class, we need a class that extends AbstractSet .

The size() function is simply 2 to the power of the set's size:

@Override public int size() { return (1 << this.set.size()); }

As the power set will contain all possible subsets of the input set, so contains(Object o) function checks if all elements of the object o are existing in the reverseMap (or in the input set):

@Override public boolean contains(@Nullable Object obj) { if (obj instanceof Set) { Set set = (Set) obj; return reverseMap.containsAll(set); } return false; }

To check equality of a given Object with this class, we can only check if the input set is equal to the given Object:

@Override public boolean equals(@Nullable Object obj) { if (obj instanceof PowerSet) { PowerSet that = (PowerSet) obj; return set.equals(that.set); } return super.equals(obj); }

The iterator() function returns an instance of ListIterator that we defined already:

@Override public Iterator
    
      iterator() { return new ListIterator
     
      (this.size()) { @Override public Set next() { return new Subset(map, reverseMap, position++); } }; }
     
    

The Guava library uses this lazy-load idea and these PowerSet and Subset are the equivalent implementations of the Guava library.

For more information, check their source code and documentation.

Furthermore, if we want to do parallel operation over subsets in PowerSet, we can call Subset for different values in a ThreadPool.

7. Summary

To sum up, first, we studied what is a power set. Then, we generated it by using the Guava Library. After that, we studied the approach and how we should implement it, and also how to write a unit test for it.

Finally, we utilized the Iterator interface to optimize the space of generation of subsets and also their internal elements.

Wie immer ist der Quellcode auf GitHub verfügbar.