QQ1703105484
The Set Abstract Data Type (ADT) is an ADT that can store unique elements. Generally, the Set ADT is defined by the following operations:
size
: Return the number of elements stored in the set
insert(x)
: Insert element x
into this set (don't allow duplicates)
erase(x)
: Remove element x
from this set (if it exists)
contains(x)
: Determine whether x
exists in this set
We can implement the Set ADT using various data structures we have already learned about, and we can even simply wrap around existing C++ classes:
We can use an Array List, which is implemented in the C++ vector class
We can use a Linked List, which is implemented in the C++ list class
We can use a Red-Black Tree (RBT), which is implemented in the C++ set class
We can use a Hash Table, which is implemented in the C++ unordered_set class
We can use a Multiway Trie (MWT), which is not natively implemented in C++
We can use a Ternary Search Tree (TST), which is not natively implemented in C++
ArrayListSet.cpp
In this part of the assignment, we have provided a file called ArrayListSet.cpp
that contains initial steps towards implementing the Set ADT using an Array List via the C++ vector class. Function headers (with usage details) are included in Set.h
. Your task is to fill in the missing code. Be sure to only modify ArrayListSet.cpp
. Do not modify any parts of the ArrayListSet
inside Set.h
.
LinkedListSet.cpp
In this part of the assignment, we have provided a file called LinkedListSet.cpp
that contains initial steps towards implementing the Set ADT using a Linked List via the C++ list class. Function headers (with usage details) are included in Set.h
. Your task is to fill in the missing code. Be sure to only modify LinkedListSet.cpp
. Do not modify any parts of the LinkedListSet
inside Set.h.
RedBlackTreeSet.cpp
In this part of the assignment, we have provided a file called RedBlackTreeSet.cpp
that contains initial steps towards implementing the Set ADT using a Red-Black Tree via the C++ set class. Function headers (with usage details) are included in Set.h
. Your task is to fill in the missing code. Be sure to only modify RedBlackTreeSet.cpp
. Do not modify any parts of the RedBlackTreeSet
inside Set.h
.
HashTableSet.cpp
In this part of the assignment, we have provided a file called HashTableSet.cpp
that contains initial steps towards implementing the Set ADT using a Hash Table via the C++ unordered_set class. Function headers (with usage details) are included in Set.h
. Your task is to fill in the missing code. Be sure to only modify HashTableSet.cpp
. Do not modify any parts of the HashTableSet
inside Set.h
.
MultiwayTrieSet.cpp
Imagine we want to insert n elements of length k into our set. Array Lists, Linked Lists, and Red-Black Trees all scale as a function of n in the average and worst cases, and although Hash Tables are O(k) in the average case, they scale as a function of n in the worst case, and they have no inherent order. Instead, if we implement the Set ADT using a Multiway Trie, our find, insert, and remove operations will all be O(k) in the worst case, meaning our data structure's runtime will not worsen as n increases, and we can iterate over our elements in sorted order.
In this part of the assignment, we have provided a file called MultiwayTrieSet.cpp
that contains initial steps towards implementing the Set ADT using a Multiway Trie. Function headers (with usage details) are included in Set.h
. Your task is to fill in the missing code. Be sure to only modify MultiwayTrieSet.cpp
. You may modify Set.h
to add helper methods, but they must be private
. Global helper methods are not allowed. No additional member variables or public
/protected
methods are allowed.
We have provided a tester program, SetTest
, that will help you test your code. You can compile your code using the provided Makefile
via the make
command:
$ make g++ -Wall -pedantic -g -O0 -std=c++11 -o SetTest SetTest.cpp ArrayListSet.cpp HashTableSet.cpp LinkedListSet.cpp MultiwayTrieSet.cpp RedBlackTreeSet.cpp
If you want to clean up your environment by deleting all the compiled executables, you can simply run make clean
:
$ make clean rm -f SetTest *.o
Here's an example of how it should look like when it's run from the command line:
$ ./SetTest
SetTest.cpp
contains only one basic unit test for MultiwayTrieSet
. This file is only meant to give you an idea of how to write your own unit test files. You are highly encouraged to write your own unit test cases for all 5 Set implementations. You need to add as many unit tests as necessary to cover all possible cases. Although we'll not grade your test cases, it will help you find any bugs that may cause your submission to fail with our test cases.
As always, beware of memory leaks! You can use valgrind
to check for memory leaks. For example, you can run it as follows:
valgrind --tool=memcheck --leak-check=yes ./SetTest
If it gives you a report like the following, you do not have memory leaks in your test cases. We'll grade this part with all our hidden test cases so be sure to build your own test cases to help you find any potential memory leaks.
==1482== Memcheck, a memory error detector ==1482== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==1482== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info ==1482== Command: ./SetTest ==1482== ==1482== error calling PR_SET_PTRACER, vgdb might block ==1482== ==1482== HEAP SUMMARY: ==1482== in use at exit: 0 bytes in 0 blocks ==1482== total heap usage: 29,796 allocs, 29,796 frees, 1,455,627 bytes allocated ==1482== ==1482== All heap blocks were freed -- no leaks are possible ==1482== ==1482== For counts of detected and suppressed errors, rerun with: -v ==1482== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
If you do have memory leaks, the report will look something like the following:
==1516== Memcheck, a memory error detector ==1516== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==1516== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info ==1516== Command: ./SetTest ==1516== ==1516== error calling PR_SET_PTRACER, vgdb might block ==1516== ==1516== HEAP SUMMARY: ==1516== in use at exit: 941,480 bytes in 24,559 blocks ==1516== total heap usage: 29,822 allocs, 5,263 frees, 1,457,307 bytes allocated ==1516== ==1516== 941,480 (64 direct, 941,416 indirect) bytes in 1 blocks are definitely lost in loss record 4 of 4 ==1516== at 0x4C3017F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==1516== by 0x112912: MultiwayTrieSet::MultiwayTrieSet() (MultiwayTrieSet.cpp:7) ==1516== by 0x10A134: main (SetTest.cpp:22) ==1516== ==1516== LEAK SUMMARY: ==1516== definitely lost: 64 bytes in 1 blocks ==1516== indirectly lost: 941,416 bytes in 24,558 blocks ==1516== possibly lost: 0 bytes in 0 blocks ==1516== still reachable: 0 bytes in 0 blocks ==1516== suppressed: 0 bytes in 0 blocks ==1516== ==1516== For counts of detected and suppressed errors, rerun with: -v ==1516== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
After you have successfully completed the above tasks and have already submitted and graded your code, here are some optional challenges you can try out:
We asked you to implement the Set ADT by wrapping around existing C++ STL implementations of the Array List, Hash Table, Linked List, and Red-Black Tree data structures as well as by implementing your own Multiway Trie (MWT). We will not be requiring you to implement a Ternary Search Tree (TST) as a mandatory task, but if you want more of a challenge, implementing a TST can be quite tricky! Try to implement the Set ADT by implementing your own TST class.
All of the implementations of the Set ADT have the same functionality: the user can insert, find, and remove strings. However, the underlying data structure implementations vary, and each implementation has its respective trade-offs. Through simulation, randomly generate datasets and thoroughly benchmark the multiple Set ADT implementations you have developed.