A module for Set Operations

Post your FreeBASIC source, examples, tips and tricks here. Please don’t post code without including an explanation.
Post Reply
Median Joe
Posts: 3
Joined: Aug 04, 2021 6:15

A module for Set Operations

Post by Median Joe »

Back in the day I used Turbo Pascal, but having recently begun programming again, I opted for FB because of it's less fussy syntax and easy to use graphics. But one thing I miss about Pascal is the built-in set functions, so I wrote my own. There are several ways to implement the Set ADT, but an efficient and simple approach is to use boolean arrays (bit vectors). Anyway, I thought I would share the code in the event that it is useful for some. I never used QBasic and am still learning FB, so any suggestions for improvements are welcomed. I'm not aware of any existing module for these routines, although I did find this on rosettacode.com:

http://www.rosettacode.org/wiki/Set#FreeBASIC

But the bit vector way is superior IMO.

Code: Select all

'Set Routines

'clear/initialise set
SUB set_clear(s() AS boolean)
  DIM i AS integer
  FOR i = lbound(s) TO ubound(s) 
    s(i) = false
  NEXT i
END SUB

'add element to a set
SUB set_incl(n AS integer, s() AS boolean)
  s(n) = true  
END SUB

'remove element from set
SUB set_excl(n AS integer, s() AS boolean)
  s(n) = false
END SUB

'Is an element in the set?
FUNCTION set_in(n AS integer, s() AS boolean) AS boolean
  set_in = s(n) 
END FUNCTION

'cardinality (return number of elements in set)
FUNCTION set_card(s() AS boolean) AS integer
  DIM n AS integer,i AS integer
  n = 0
  FOR i = lbound(s) TO ubound(s) 
    IF s(i) = true THEN
    n += 1 
    END IF 
  NEXT i
  set_card = n
END FUNCTION

'intersection
SUB set_inter(s1() AS boolean,s2() AS boolean,result() AS boolean)
  DIM i AS integer
  FOR i = lbound(s1) TO ubound(s1) 
    IF s1(i) and s2(i) THEN
      result(i) = true
    END IF 
  NEXT i  
END SUB

'union
SUB set_union(s1() AS boolean,s2() AS boolean,result() AS boolean)
  DIM i AS integer
  FOR i = lbound(s1) TO ubound(s1) 
    IF s1(i) or s2(i) THEN
      result(i) = true
    END IF 
  NEXT i  
END SUB

'difference (s1 - s2)
SUB set_diff(s1() AS boolean,s2() AS boolean,result() AS boolean)
  DIM i AS integer
  FOR i = lbound(s1) TO ubound(s1) 
    IF s1(i) and not s2(i) THEN
      result(i) = true
    END IF
  NEXT i  
END SUB

'symmetric difference (result is union(s1,s2) - inter(s1,s2))
SUB set_symdiff(s1() AS boolean,s2() AS boolean,result() AS boolean)
  DIM i AS integer
  FOR i = lbound(s1) TO ubound(s1) 
    IF s1(i) xor s2(i) THEN
      result(i) = true
    END IF
  NEXT i  
END SUB

' are the sets equal?
FUNCTION set_isequal(s1() AS boolean,s2() AS boolean) AS boolean
  DIM i AS integer
  DIM equal AS boolean
  equal = true
  FOR i = lbound(s1) TO ubound(s1) 
    IF not(s1(i) eqv s2(i)) THEN
      equal = false
      EXIT FOR
    END IF
  NEXT i
  set_isequal = equal 
END FUNCTION 

'Is set s1 a subset of set s2?
FUNCTION set_issubset(s1() AS boolean,s2() AS boolean) AS boolean
  DIM subset AS boolean
  DIM i AS integer
  subset = true
  FOR i = lbound(s1) TO ubound(s1) 
    IF not(s1(i) imp s2(i)) THEN
      subset = false
      EXIT FOR 
    END IF
  NEXT i
  set_issubset = subset
END FUNCTION

'show set
SUB set_show(s() AS boolean)
  DIM i AS integer
  FOR i = lbound(s) TO ubound(s) 
    IF s(i) = true THEN
      print i;
    END IF
  NEXT i
  print
END SUB
Here is some demo code :

Code: Select all

#include "sets.bi"
const maxset = 10

DIM A(maxset) AS boolean 
DIM B(maxset) AS boolean
DIM C(maxset) AS boolean
DIM result(maxset) AS boolean
DIM i AS integer

' A = {1,2,3,4,5,6,7}
' B = {4,5,6}
' C = {6,7,8,9}
' fill sets 
FOR i = 1 TO 7: set_incl(i,A()): NEXT i
FOR i = 4 TO 6: set_incl(i,B()): NEXT i
FOR i = 6 TO 9: set_incl(i,C()): NEXT i

'show sets
print "A: ";
set_show(A())
print "B: ";
set_show(B())
print "C: ";
set_show(C())

'intersection
print "Intersection of A & B: ";
set_inter(A(),B(),result())
set_show(result())

'union
print "Union of B & C: ";
set_union(B(),C(),result())
set_show(result())

'difference between B & C (B - C)
set_clear(result())
print "Difference between C & A (C - A): ";
set_diff(C(),A(),result())
set_show(result())

'symmetric difference between A & C
set_clear(result())
print "Symmetric difference between A & C: ";
set_symdiff(A(),C(),result())
set_show(result())

'subset; is C a subset of A?
print "Is C a subset of A? ";set_issubset(C(),A()) 
print "Is B a subset of A? ";set_issubset(B(),A()) 
' how many elements in A?
print "Set A has";set_card(A());" elements"
' remove elements 1..5 from A
FOR i = 1 TO 5
  set_excl(i,A())
NEXT i 
print "Removing elements 1..5 from A..."
print "A: ";
set_show(A())
' add elements 8,9 to A
print "Adding elements 8 & 9 to A..."
print "A: ";
set_incl(8,A())
set_incl(9,A())
set_show(A())
'equal sets
print "Does A = C?: ";set_isequal(A(),C())
print "Does A = B?: ";set_isequal(A(),B())
And many thanks to the devs for creating and maintaining FB!
thrive4
Posts: 72
Joined: Jun 25, 2021 15:32

Re: A module for Set Operations

Post by thrive4 »

@median joe

Thanks for sharing the code!
I'm struggling a bit with applying set(s)
could you possibly give some examples / reasons
where applying sets really shines application wise?

Not to come empty handed here are some sources
all though I must say most explain how to use sets
but not really why.... readability and forms of
optimization are mentioned but I'm still a bit on
the fence but never the less intrigued.

sql or in large datasets makes sense:
https://www.tutorialspoint.com/sql_cert ... rators.htm

examples with java:
https://www.codejava.net/java-core/coll ... d-examples

math (venn diagrams):
https://www.cuemath.com/algebra/operations-on-sets/

python (quite an elaborate native implementation):
https://www.programiz.com/python-programming/set
https://www.pythoncheatsheet.org/blog/p ... t-why-how/
Median Joe
Posts: 3
Joined: Aug 04, 2021 6:15

Re: A module for Set Operations

Post by Median Joe »

Hi thrive4,

Sets are one of those data structures where it's hard to come up with concrete examples because they are everywhere. Although, limiting sets to integers does limit the applications somewhat. You mentioned SQL, and sets play a big part in databases, but they are also very useful when you don't need or want a full-blown database, for filtering data according to certain criteria.

They are also useful as flags. e.g. you might want to keep track of some conditions in a program, such as a game. Suppose you have a game with the following options:

Code: Select all

enum
  beginner
  intermediate 
  expert
  mouse
  sound
end enum
Put these into a set and when the user makes a selection or toggles an option the corresponding set elements are added or removed by the program, using set_incl() and set_excl() respectively, and appropriate actions can be taken according to the options selected. e.g. IF set_in(beginner, Options()) THEN do stuff... and of course you can use other set operations such as intersection to make decisions depending on whether certain combinations of options are on or off.
All kinds of states can be tracked like this; exception conditions, attributes of collections of data, font styles (bold, italic, underlined), etc.

Also, using sets can sometimes shorten the code and make it more efficient. For example, suppose you want to take some action if a number is between two ranges, say between 5 to 10 or 25 to 30. The standard way to do this would be (IF n >= 5 AND n <= 10) OR (n >= 25 AND n <= 30). By constructing the appropriate set "ranges", you need only write IF set_in(n, ranges()). Much simpler and more readable.

I haven't done this, but it would be easy to add routines to the module which will manipulate sets of characters (using the Asc() function). This might be useful if you're doing text processing. You could create sets for vowels, consonants, punctuation symbols, etc.
marcov
Posts: 3456
Joined: Jun 16, 2005 9:45
Location: Netherlands
Contact:

Re: A module for Set Operations

Post by marcov »

Sets are like bitfields where each value in an ordinal (integer or enum) gets a bit, and the various operations are similar (and implemented as) to and/or/xor. Most current Pascal implementations are limited to up to 256 elements (32-byte bitfield). Small sets up to 32-bit can be kept in registers for operations so are quite efficient. A class wrapper TBits exists for larger sets (but dynamically allocated and not typesafe). Recently also a generic for static and typesafe larger sets emerged)
https://www.freepascal.org/docs-html/3.0.2/ref/refsu49.html wrote:As can be seen, the union is equivalent to a binary OR, while the intersection is equivalent to a binary AND, and the symmetric difference equals a XOR operation.
Similarly the IN operator is basically testing if a certain bit is set. Typically they are used as complement to enums to manipulate lists of enums, e.g. in state machines.

The concept is nice, specially for lowlevel work and statemachines, but current implementations suffer from old limitations, like said 256 element limitation and the fact that a set of 10-..40 is 41-bit, even if bits 0..9 are unused per definition. I'm also not sure if a 64-bit set is registrable on 64-bit architectures. It also leans on typesafe enums.

p.s. the FPC documentation link from the quote shows typical use in Pascal. Turbo Pascal was the same, except it didn't have >< and maybe also not <=
Median Joe
Posts: 3
Joined: Aug 04, 2021 6:15

Re: A module for Set Operations

Post by Median Joe »

Another situation where sets come in handy, which I forgot to mention, is when you need to find the unique elements or numbers in an array. I'll show a rather contrived example to illustrate the idea, which uses the property of sets that repeating elements don't count towards their cardinality (number of elements in the set). So, e.g. set_card({1,2,2,3,3,3,4}) = set_card({1,2,3,4}) = 4.

In the casino game Roulette there are 37 pockets on the wheel, numbered 0 to 36. You want to know 1) how many unique numbers will appear after 37 spins, and 2) how many spins are needed to "collect" all 37 numbers. In both these examples for a meaningful result you would need to execute each loop a large number of times and take the average, but that's irrelevant to the concept being illustrated. This is nice example of how the set concept can solve a problem which otherwise would take considerably more code, and the solution to which isn't immediately obvious.

Code: Select all

#include "sets.bi"

DIM wheel(37) AS boolean
DIM i AS integer,spin AS integer,n AS integer

randomize
' how many unique numbers after 37 spins of the wheel?
FOR i = 1 TO 37
  spin = rnd * 36
  set_incl(spin,wheel())
NEXT i
print "Number of uniques after 37 spins"; set_card(wheel())

' how many spins does it take to collect all 37 numbers?
set_clear(wheel())
n = 0
DO 
  spin = rnd * 36
  n += 1
  set_incl(spin,wheel())
LOOP UNTIL set_card(wheel()) = 37  
print "Number of spins to collect all numbers: "; n
marcov
Posts: 3456
Joined: Jun 16, 2005 9:45
Location: Netherlands
Contact:

Re: A module for Set Operations

Post by marcov »

Note that sets can have multiple forms. There are (at least) two defining aspects. Knowing the maximum number of elements up front (i.e. compiletime) and, if the number is large, sets with a fixed, known up front number of item (like set of enum), and if you use a list of items (aka sparsely stored set) or a bitfield.

The sparse sets are more memory saving and even faster for very large or unknown number of elements. Bitfields are very efficient for small(ish) number elements, and often map good to primitives as popcnt.

I've heard the term open and closed sets for resp sets without and with a fixed upper bound (number of elements)

The post thrive4 cited comes up with another aspect: deterministic (ordered) iteration.
Last edited by marcov on Jul 16, 2022 15:15, edited 3 times in total.
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: A module for Set Operations

Post by dodicat »

My method for Roulette:

Code: Select all


#macro cleanup(a,b)
scope
Redim  b(lbound(a) to ubound(a))
b(lbound(b))=a(lbound(a))
    Dim As Long flag,count=1
    For n1 As Long=Lbound(a)+1 To Ubound(a)
        flag=0
        For n2 As Long=lbound(a) to  n1-1
            If a(n1)=a(n2) Then flag=1:Exit For
        Next n2
        If flag=0 Then
            b(lbound(b)+count)=a(n1)
            count+=1
        End If
    Next n1
    Redim Preserve b(Lbound(a) to count+Lbound(a)-1)
End Scope
#endmacro


DIM i AS integer,spin AS integer,n AS integer

randomize 
dim as integer wheel(37)
redim as integer unique()
' how many unique numbers after 37 spins of the wheel?

FOR i = 0 TO 37 
  spin = rnd * 36
  wheel(i)=spin
NEXT i

cleanup(wheel,unique)

print "Number of uniques after 37 spins"; ubound(unique)-lbound(unique)+1 '(number of elements)


'now clear wheel to user default safe value to start afresh
for i=0 to 37
      wheel(i)=-100
next
' how many spins does it take to collect all 37 numbers?
n = 0
DO 
  spin = rnd * 36
  n += 1
  wheel(spin)=spin
  cleanup(wheel,unique)
LOOP UNTIL ubound(unique) = 37 'unique(0 to 37)
print "Number of spins to collect all numbers: "; n
sleep 
Note I use the whole array 0 to 37, fill it up to get Number of uniques at the beginning.
After that, same as you.
thrive4
Posts: 72
Joined: Jun 25, 2021 15:32

Re: A module for Set Operations

Post by thrive4 »

@median joe

Thank you for the insightful examples.
I ended up using the pocket edition (embedded) of python:
https://www.python.org/downloads/windows/
Just to get some feel with 'sets'.

It does seem useful, specifically when being
able to use a variety of variable types as well
when being able to get specific elements in
the set or determine the number of elements
in the set, etc, how ever we are not quite there yet
though maybe in due time.

This python code illustrates an implementation:

Code: Select all

# Different types of sets in Python
# set of integers
my_set = {1, 2, 3}
print(my_set)

# set of mixed datatypes
print ("----------")
my_set = {1.0, "Hello", (1, 2, 3)}
print(my_set)

seta = {"house", "window", "door"}
setb = {"house", "window", "door", "chimney"}

print ("----------")
print (seta)
print (setb)

# communal aka intersection
print ("----------")
print ("communal " + str((seta & setb)))

# difference
print ("----------")
print ("difference" + str((setb - seta)))

# union
print ("----------")
print ("union" + str((setb | seta)))

# print individual elements in set
print ("----------")
for i in setb:
    print(i)

# retrieve <n> element
print ("----------")
print(list(setb)[0])
print(list(setb)[1])

# number of elements in a set
print ("----------")
print(len(seta))
print(len(setb))
Generic Pascal implementation:
https://www.tutorialspoint.com/pascal/pascal_sets.htm

debate on implementing a set data structure in C:
https://stackoverflow.com/questions/263 ... -structure
a bit in line with the remarks of @marcov
marcov
Posts: 3456
Joined: Jun 16, 2005 9:45
Location: Netherlands
Contact:

Re: A module for Set Operations

Post by marcov »

thrive4 wrote: Jul 16, 2022 12:53 debate on implementing a set data structure in C:
https://stackoverflow.com/questions/263 ... -structure
a bit in line with the remarks of @marcov
That link also mentioned ordered vs unordered iteration as key aspect. Added to my post.
Post Reply